How to find the n most popular sports from a list of students in Java 8?

3

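I want to find the top N most popular sports among students, where N is a parameter.

I did this in three steps, but I am not satisfied with it. I am trying to reduce it to a single step.

Here is my full code and solution: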

public class Person {

    private UUID id;
    private String name;
    private List<Sport> sports = new ArrayList<>();

    // getters and setters + constructor
}

Here is the Sport class:
public class Sport {

    private String name;

    public Sport(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

Here is my data and the logic that extracts the three most popular sports:
    public static void main(String[] args) {

        // most popular sports

        Sport football = new Sport("Football");
        Sport tennis = new Sport("Tennis");
        Sport basketBall = new Sport("BasketBall");
        Sport handball = new Sport("Handball");
        Sport swimming = new Sport("Swimming");
        Sport running = new Sport("Running");
        Sport climbing = new Sport("Climbing");


        List<Person> people = new ArrayList<>();
        people.add(new Person(UUID.randomUUID(), "Bob", Arrays.asList(football, handball)));
        people.add(new Person(UUID.randomUUID(), "Tom", Arrays.asList(football, basketBall, tennis)));
        people.add(new Person(UUID.randomUUID(), "Tim", Arrays.asList(climbing, handball, football)));
        people.add(new Person(UUID.randomUUID(), "Marc", Arrays.asList(football, basketBall)));
        people.add(new Person(UUID.randomUUID(), "Gerard", Arrays.asList(tennis, handball)));
        people.add(new Person(UUID.randomUUID(), "Claudia", Arrays.asList(running, handball)));
        people.add(new Person(UUID.randomUUID(), "Sara", Arrays.asList(football, climbing)));
        people.add(new Person(UUID.randomUUID(), "Laura", Arrays.asList(football)));
        people.add(new Person(UUID.randomUUID(), "Mo", Arrays.asList(football, tennis)));


        //Step 1 - Merge all the sports lists of all students
        List<Sport> allSports = new ArrayList<>();
        for (Person person : people) {
            allSports.addAll(person.getSports());
        }

        // Step 2 - Transform into a Map with groupingBy and counting
        Map<Sport, Long> collect = allSports.stream().collect(groupingBy(Function.identity(), counting()));

        // Step 3 - Print the top 3 most popular sports
        collect.entrySet().stream()
                .sorted(Map.Entry.<Sport, Long>comparingByValue().reversed())
                .limit(3)
                .forEach(s -> System.out.println(s.getKey().getName()));

    }

Output:

 Football
 Handball
 Tennis

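What is wrong with your solution? - Andronicus
@Andronicus it is too verbose... maybe something more elegant or shorter. Also, I don't really like merging all the lists into one. Would that still be OK if I had millions of entries? - ErEcTuS
2 Answers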

7
A single stream pipeline would look like this:
people.stream()
        .flatMap(a -> a.getSports().stream()) // step 1 (stream of Sport)
        .collect(groupingBy(Function.identity(), counting())) // step 2 (map with count)
        .entrySet().stream()
        .sorted(Map.Entry.<Sport, Long>comparingByValue().reversed())
        .limit(3)
        .map(entry -> entry.getKey().getName()) // mapped to a specific type before accessing
        .forEach(System.out::println); // step 3 (print top N entry names)

Hey Naman, awesome, thanks! One more question: is there a way to return an array or a list of the top n instead of using System.out::println? - ErEcTuS
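@Naman that is quite a lot of sorting for just n elements (far fewer than the total)... - Eugene
@ErEcTuS To get a list, replace the final forEach with .collect(Collectors.toList()) - Comencau
@Eugene good point. If I had millions of records, this would matter. I don't want to sort all of them when I only need the top 5. - ErEcTuS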
@Naman I was only suggesting an alternative, in case it is needed; this is already a step in the right direction. - Eugene
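
Following up on the comments about returning a list and about N being a parameter, here is a minimal sketch of the same pipeline wrapped in a method. The class name TopSportsExample, the method names topSportNames and samplePeople, and the simplified Person constructor (no UUID, varargs sports) are assumptions made for illustration and are not part of the original post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TopSportsExample {

    public static class Sport {
        private final String name;
        public Sport(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Simplified, hypothetical Person: only the fields this sketch needs.
    public static class Person {
        private final String name;
        private final List<Sport> sports;
        public Person(String name, Sport... sports) {
            this.name = name;
            this.sports = Arrays.asList(sports);
        }
        public String getName() { return name; }
        public List<Sport> getSports() { return sports; }
    }

    // Returns the names of the n most frequent sports, most popular first.
    public static List<String> topSportNames(List<Person> people, int n) {
        return people.stream()
                .flatMap(p -> p.getSports().stream())            // stream of Sport
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream()
                .sorted(Map.Entry.<Sport, Long>comparingByValue().reversed())
                .limit(n)                                         // n is now a parameter
                .map(e -> e.getKey().getName())
                .collect(Collectors.toList());                    // list instead of println
    }

    // Same data as in the question.
    public static List<Person> samplePeople() {
        Sport football = new Sport("Football");
        Sport tennis = new Sport("Tennis");
        Sport basketBall = new Sport("BasketBall");
        Sport handball = new Sport("Handball");
        Sport running = new Sport("Running");
        Sport climbing = new Sport("Climbing");
        return Arrays.asList(
                new Person("Bob", football, handball),
                new Person("Tom", football, basketBall, tennis),
                new Person("Tim", climbing, handball, football),
                new Person("Marc", football, basketBall),
                new Person("Gerard", tennis, handball),
                new Person("Claudia", running, handball),
                new Person("Sara", football, climbing),
                new Person("Laura", football),
                new Person("Mo", football, tennis));
    }

    public static void main(String[] args) {
        System.out.println(topSportNames(samplePeople(), 3)); // prints [Football, Handball, Tennis]
    }
}
```

Note that Sport does not override equals/hashCode, so grouping only works here because the same Sport instances are shared between people. If two sports tie for the last place, which one is returned is not defined, because the map has no stable order.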
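
On the point that a full sort is wasteful when only n entries are needed: one common alternative, not taken from the answer above, is to keep a bounded min-heap of size n while scanning the counts. The sketch below assumes the counts are already in a Map; the class name TopNWithHeap and the methods topN and sampleCounts are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TopNWithHeap {

    // Returns the n keys with the highest counts, highest first.
    // Cost is roughly O(m log n) for m distinct keys, instead of O(m log m) for a full sort.
    public static <T> List<T> topN(Map<T, Long> counts, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Min-heap by count: the head is the weakest of the current candidates.
        PriorityQueue<Map.Entry<T, Long>> heap =
                new PriorityQueue<>(n, Map.Entry.<T, Long>comparingByValue());
        for (Map.Entry<T, Long> entry : counts.entrySet()) {
            if (heap.size() < n) {
                heap.add(entry);
            } else if (entry.getValue() > heap.peek().getValue()) {
                heap.poll();          // drop the weakest candidate
                heap.add(entry);
            }
        }
        // The heap drains in ascending order, so reverse at the end.
        List<T> result = new ArrayList<>(heap.size());
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }

    // Counts built from the question's data, using sport names as keys.
    public static Map<String, Long> sampleCounts() {
        List<List<String>> sportsPerStudent = Arrays.asList(
                Arrays.asList("Football", "Handball"),
                Arrays.asList("Football", "BasketBall", "Tennis"),
                Arrays.asList("Climbing", "Handball", "Football"),
                Arrays.asList("Football", "BasketBall"),
                Arrays.asList("Tennis", "Handball"),
                Arrays.asList("Running", "Handball"),
                Arrays.asList("Football", "Climbing"),
                Arrays.asList("Football"),
                Arrays.asList("Football", "Tennis"));
        return sportsPerStudent.stream()
                .flatMap(List::stream)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(topN(sampleCounts(), 3)); // prints [Football, Handball, Tennis]
    }
}
```

Whether this is worth it depends on the number of distinct sports, not the number of students: the sort runs over the entries of the count map, which has one entry per distinct sport. With millions of students but only a few dozen sports, the plain sorted().limit(n) version is probably fine.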

2
If you are open to using a third-party library, you can use the countByEach and topOccurrences methods from Eclipse Collections.
MutableList<Person> people = Lists.mutable.with(
        new Person(UUID.randomUUID(), "Bob", football, handball),
        new Person(UUID.randomUUID(), "Tom", football, basketBall, tennis),
        new Person(UUID.randomUUID(), "Tim", climbing, handball, football),
        new Person(UUID.randomUUID(), "Marc", football, basketBall),
        new Person(UUID.randomUUID(), "Gerard", tennis, handball),
        new Person(UUID.randomUUID(), "Claudia", running, handball),
        new Person(UUID.randomUUID(), "Sara", football, climbing),
        new Person(UUID.randomUUID(), "Laura", football),
        new Person(UUID.randomUUID(), "Mo", football, tennis));

MutableList<String> top3Names =
        people.countByEach(Person::getSports)
                .topOccurrences(3)
                .collect(pair -> pair.getOne().getName());

MutableList<String> expected =
        Lists.mutable.with("Football", "Handball", "Tennis");

Assert.assertEquals(expected, top3Names);

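The type MutableList extends List and adds other APIs. I simplified the Person constructor to take a varargs array of Sport. The method countByEach returns a MutableBag<Sport>. The method topOccurrences returns a MutableList<ObjectIntPair<Sport>>.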

You can also use Java Streams with the Collectors2 utility class from Eclipse Collections, as follows:

List<String> top3Names = people.stream()
        .collect(Collectors2.countByEach(Person::getSports))
        .topOccurrences(3)
        .collect(pair -> pair.getOne().getName());

List<String> expected =
        Arrays.asList("Football", "Handball", "Tennis");

Assert.assertEquals(expected, top3Names);

Note: I am a contributor to Eclipse Collections.

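Thanks Donald. I like Eclipse's topOccurrences and countByEach. Your example is less verbose and more readable. How do you respond when people refuse to use third-party libraries? I know my colleagues/team lead would hate this, haha. - ErEcTuS
Donald, one more question: is MutableList thread-safe? I need to use it with add() operations. - ErEcTuS
An answer from the community: https://softwareengineering.stackexchange.com/questions/333372/is-there-any-disadvantage-to-using-eclipse-collections-exclusively - Donald Raab
You can get a synchronized view of a MutableList by calling asSynchronized. This gives you conditional thread safety (iterators need care). Alternatively, there is Lists.multiReader.with(...), which returns a MultiReaderList, a type that extends MutableList. A MultiReaderList throws an exception when iterator is called unless you protect the call with withReadLockAndDelegate or withWriteLockAndDelegate. - Donald Raab
If you don't need to modify the list after creating it and want to make sure it is not modified, you can use Lists.immutable.with() - Donald Raab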
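
To illustrate what "conditional thread safety" means in practice, here is a small sketch using the JDK's own Collections.synchronizedList rather than Eclipse Collections, so it runs without any dependency. It is only an analogy: individual calls such as add() are synchronized, but iteration has to be guarded by the caller. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedViewSketch {

    // Several threads call add() on the same synchronized view; each add() is atomic.
    public static List<Integer> fillConcurrently(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(1);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return list;
    }

    // Iteration is NOT protected by the wrapper: the caller must hold the list's lock.
    public static int sumWhileLocked(List<Integer> syncList) {
        int sum = 0;
        synchronized (syncList) {
            for (int value : syncList) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = fillConcurrently(4, 1000);
        System.out.println(sumWhileLocked(list)); // prints 4000
    }
}
```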
