Java 8 Streams multiple grouping

8

I have temperature records that look something like this:

dt        |AverageTemperature |AverageTemperatureUncertainty|City   |Country |Latitude|Longitude
----------+-------------------+-----------------------------+-------+--------+--------+---------
1963-01-01|-5.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E  
1963-02-01|-4.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E  
1964-01-01|-5.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E  
1964-02-01|-4.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E  
1965-01-01|11.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E 
1965-02-01|12.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E

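I have to parse these into POJOs and calculate the average delta according to the following problem statement:
Use the Streams API to calculate the average annual temperature delta for each country. To calculate a delta, subtract the average temperature in 1900 from the average temperature in 1901 to get the delta from 1900 to 1901 for a particular city. The average of all these deltas is the average annual temperature delta for a city. The average of all cities in a country is the average of that country.
My Temperature POJO looks like this, with getters and setters: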
public class Temperature {
    private java.util.Date date;
    private double averageTemperature;
    private double averageTemperatureUncertainty;
    private String city;
    private String country;
    private String latitude;
    private String longitude;
}

I keep the records in a list of Temperature objects, since the problem has to be solved with streams.

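To calculate the delta I am trying the stream below, but I still cannot get to the actual delta. Since I have to calculate the average delta per country, I group by country, then city, then year (taken from the date).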

Map<String, Map<String, Map<Integer, Double>>> countriesMap = this.getTemperatures().stream()
                .sorted(Comparator.comparing(Temperature::getDate))
                .collect(Collectors.groupingBy(Temperature::getCountry,
                        Collectors.groupingBy(Temperature::getCity,
                        Collectors.groupingBy
                                (t -> {
                                            Calendar calendar = Calendar.getInstance();
                                            calendar.setTime(t.getDate());
                                            return calendar.get(Calendar.YEAR);
                                        }, 
                        Collectors.averagingDouble(Temperature::getAverageTemperature)))));

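To calculate the delta, we have to calculate the differences between the values of each Map<Integer, Double>.
For the differences I came up with the following code, but I could not connect it to the code above.
Stream.of(10d, 20d, 10d) // sample data standing in for the values of a `Map<Integer, Double>` in countriesMap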
        .map(new Function<Double, Optional<Double>>() {
            Optional<Double> previousValue = Optional.empty();
            @Override
            public Optional<Double> apply(Double current) {
                Optional<Double> value = previousValue.map(previous -> current - previous);
                previousValue = Optional.of(current);
                return value;
            }
        })
        .filter(Optional::isPresent)
        .map(Optional::get)
        .forEach(System.out::println);

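How can I calculate the delta in one go using streams, or how can I run stream operations on countriesMap to calculate the delta and satisfy the problem statement?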

1 Answer

4
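To break the problem down into smaller pieces, you could take another approach: take the per-year temperatures, compute the differences between consecutive years, and then average those differences. This has to be done for every innermost value of type Map<Integer, Double> in your question, though. The code would look roughly like this: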
Map<Integer, Double> unitOfWork = new HashMap<>(); // innermost map you've attained ('yearToAverageTemperature' map)
unitOfWork = unitOfWork.entrySet()
        .stream()
        .sorted(Map.Entry.comparingByKey())
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e1, LinkedHashMap::new));
// the values sorted based on the year from a sorted map
List<Double> srtedVal = new ArrayList<>(unitOfWork.values());
// average of deltas from the complete list 
double avg = IntStream.range(0, srtedVal.size() - 1)
        .mapToDouble(i -> (srtedVal.get(i + 1) - srtedVal.get(i)))
        .average().orElse(Double.NaN);

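Note that this is only the average for a single city's <year, average temperature> records. You would have to iterate over every city key and every country key to compute all such averages.
Moving this unit of work into a method and iterating over the complete map could be done like this: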
// The average of all cities in a country is the average of a country.
AtomicReference<Double> countryValAvg = new AtomicReference<>(0.0);
countriesMap.forEach((country, cityMap) -> {
    // The average of all these deltas is the average annual temperature delta for a city.
    AtomicReference<Double> cityAvgTemp = new AtomicReference<>((double) 0);
    cityMap.forEach((city, yearMap) -> cityAvgTemp.set(cityAvgTemp.get() + averagePerCity(yearMap)));
    double avgAnnualTempDeltaPerCity = cityAvgTemp.get() / cityMap.size();

    countryValAvg.set(countryValAvg.get() + avgAnnualTempDeltaPerCity);
});
System.out.println(countryValAvg.get() / countriesMap.size());

Here averagePerCity is a method that does the following:

double averagePerCity(Map<Integer, Double> unitOfWork) {
    unitOfWork = unitOfWork.entrySet()
            .stream()
            .sorted(Map.Entry.comparingByKey())
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e1, LinkedHashMap::new));
    List<Double> srtedVal = new ArrayList<>(unitOfWork.values());
    return IntStream.range(0, srtedVal.size() - 1)
            .mapToDouble(i -> (srtedVal.get(i + 1) - srtedVal.get(i)))
            .average().orElse(Double.NaN);
}
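One property of averagePerCity that may be worth checking: consecutive differences telescope, so the average of the deltas should equal (last yearly average - first yearly average) / (number of years - 1), up to floating-point rounding. The sketch below compares the two on the three Karachi years from the question's sample table. The class name `TelescopeCheck` and the helper `telescoped` are hypothetical, introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class TelescopeCheck {
    // Same steps as the answer's averagePerCity: sort by year, then average the differences.
    static double averagePerCity(Map<Integer, Double> unitOfWork) {
        unitOfWork = unitOfWork.entrySet()
                .stream()
                .sorted(Map.Entry.comparingByKey())
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (e1, e2) -> e1, LinkedHashMap::new));
        List<Double> srtedVal = new ArrayList<>(unitOfWork.values());
        return IntStream.range(0, srtedVal.size() - 1)
                .mapToDouble(i -> (srtedVal.get(i + 1) - srtedVal.get(i)))
                .average().orElse(Double.NaN);
    }

    // Hypothetical shortcut: (last - first) / (n - 1); NaN when fewer than two years exist.
    static double telescoped(Map<Integer, Double> yearToAverage) {
        TreeMap<Integer, Double> sorted = new TreeMap<>(yearToAverage);
        if (sorted.size() < 2) {
            return Double.NaN;
        }
        double first = sorted.firstEntry().getValue();
        double last = sorted.lastEntry().getValue();
        return (last - first) / (sorted.size() - 1);
    }

    public static void main(String[] args) {
        // Yearly means of the two monthly rows per year in the question's table.
        Map<Integer, Double> karachi = new HashMap<>();
        karachi.put(1963, (-5.417000000000002 + -4.7650000000000015) / 2);
        karachi.put(1964, (-5.417000000000002 + -4.7650000000000015) / 2);
        karachi.put(1965, (11.417000000000002 + 12.7650000000000015) / 2);
        System.out.println(averagePerCity(karachi)); // about 8.591
        System.out.println(telescoped(karachi));     // about 8.591
    }
}
```

The shortcut only looks at the first and last year, so it is a sanity check on the result rather than a replacement if the individual deltas are needed.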

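Note: the code above may lack validations; it is only meant to give an idea of how the complete problem can be broken down into smaller parts and then solved.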
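One validation that seems relevant here: averagePerCity returns Double.NaN for a city with fewer than two years of data, and a single NaN turns the enclosing average into NaN as well. A minimal sketch, assuming such cities should simply be skipped (that policy is an assumption, not something the question specifies; the class `SkipNaN` and the method `countryAverage` are hypothetical names):

```java
import java.util.Arrays;
import java.util.List;

class SkipNaN {
    // Averages per-city deltas, ignoring NaN entries (cities with fewer than two years).
    static double countryAverage(List<Double> cityDeltas) {
        return cityDeltas.stream()
                .mapToDouble(Double::doubleValue)
                .filter(d -> !Double.isNaN(d))
                .average()
                .orElse(Double.NaN); // still NaN when no city has a usable delta
    }

    public static void main(String[] args) {
        // Made-up deltas for three cities; the middle city had only one year of data.
        System.out.println(countryAverage(Arrays.asList(2.0, Double.NaN, 4.0))); // prints 3.0
    }
}
```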

Edit 1: This can be improved further:

// The average of all cities in a country is the average of a country.
AtomicReference<Double> countryValAvg = new AtomicReference<>(0.0);
countriesMap.forEach((country, cityMap) -> {
    // The average of all these deltas is the average annual temperature delta for a city.
    double avgAnnualTempDeltaPerCity = cityMap.values()
            .stream()
            .mapToDouble(Quick::averagePerCity) // Quick is my class name
            .average()
            .orElse(Double.NaN);
    countryValAvg.set(countryValAvg.get() + avgAnnualTempDeltaPerCity);
});
System.out.println(countryValAvg.get() / countriesMap.size());

Edit 2: And further to:

double avgAnnualTempDeltaPerCity = countriesMap.values().stream()
        .mapToDouble(cityMap -> cityMap.values()
                .stream()
                .mapToDouble(Quick::averagePerCity) // Quick is my class name
                .average()
                .orElse(Double.NaN))
        .average().orElse(Double.NaN);
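The expression above yields a single number across all countries, while the problem statement asks for a value per country. A possible end-to-end sketch that goes from the raw list to a country -> average delta map is shown below. It is not the answer's code: `Reading` is a hypothetical, trimmed-down stand-in for the Temperature POJO that uses java.time.LocalDate instead of java.util.Date, the sample data is made up, and the innermost map is collected into a TreeMap so the years come out sorted without a separate sorting step.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CountryDeltas {
    // Hypothetical stand-in for the question's Temperature POJO.
    static class Reading {
        private final LocalDate date;
        private final String country;
        private final String city;
        private final double avgTemp;

        Reading(String isoDate, String country, String city, double avgTemp) {
            this.date = LocalDate.parse(isoDate);
            this.country = country;
            this.city = city;
            this.avgTemp = avgTemp;
        }

        int getYear() { return date.getYear(); }
        String getCountry() { return country; }
        String getCity() { return city; }
        double getAvgTemp() { return avgTemp; }
    }

    // Average of consecutive differences; expects the map to iterate in year order.
    static double averageDelta(Map<Integer, Double> yearToAverage) {
        List<Double> byYear = new ArrayList<>(yearToAverage.values());
        return IntStream.range(0, byYear.size() - 1)
                .mapToDouble(i -> byYear.get(i + 1) - byYear.get(i))
                .average().orElse(Double.NaN);
    }

    static Map<String, Double> perCountry(List<Reading> readings) {
        // year -> mean temperature, kept in ascending year order by the TreeMap
        Collector<Reading, ?, TreeMap<Integer, Double>> yearly = Collectors.groupingBy(
                Reading::getYear, TreeMap::new, Collectors.averagingDouble(Reading::getAvgTemp));
        // one city's readings -> that city's average annual delta
        Collector<Reading, ?, Double> cityDelta =
                Collectors.collectingAndThen(yearly, CountryDeltas::averageDelta);
        Map<String, Map<String, Double>> perCity = readings.stream()
                .collect(Collectors.groupingBy(Reading::getCountry,
                        Collectors.groupingBy(Reading::getCity, cityDelta)));
        // country -> mean of its cities' average deltas
        return perCity.entrySet().stream().collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().values().stream()
                        .mapToDouble(Double::doubleValue).average().orElse(Double.NaN)));
    }

    public static void main(String[] args) {
        List<Reading> readings = Arrays.asList( // made-up sample data
                new Reading("2000-01-01", "CountryA", "CityX", 9.0),
                new Reading("2000-02-01", "CountryA", "CityX", 11.0),
                new Reading("2001-01-01", "CountryA", "CityX", 12.0),
                new Reading("2002-01-01", "CountryA", "CityX", 16.0),
                new Reading("2000-01-01", "CountryA", "CityY", 5.0),
                new Reading("2001-01-01", "CountryA", "CityY", 6.0),
                new Reading("2000-01-01", "CountryB", "CityZ", 1.0),
                new Reading("2001-01-01", "CountryB", "CityZ", 0.0));
        Map<String, Double> result = perCountry(readings);
        System.out.println(result.get("CountryA")); // prints 2.0
        System.out.println(result.get("CountryB")); // prints -1.0
    }
}
```

In this sketch a city with only one year of data still produces NaN, which propagates into its country's value; whether to skip or report such cities is left open.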

2
You can use double avgAnnualTempDeltaPerCity = cityMap.values().stream().mapToDouble(this::averagePerCity).average(); instead of AtomicReference<Double> cityAvgTemp = new AtomicReference<>((double) 0); cityMap.forEach((city, yearMap) -> cityAvgTemp.set(cityAvgTemp.get() + averagePerCity(yearMap))); double avgAnnualTempDeltaPerCity = cityAvgTemp.get() / cityMap.size(); - Holger
@Holger double avgAnnualTempDeltaPerCity = cityAvgTemp.get() / cityMap.size(); Isn't this avgAnnualTempDeltaPerCountry? - user2578909
@HammadNaeem No, the per-country temperature delta is evaluated outside that iteration. - Naman
2
You can simplify the outer map as well; instead of AtomicReference<Double> countryValAvg = new AtomicReference<>(0.0); countriesMap.forEach((country, cityMap) -> { /* code not using country */ countryValAvg.set(countryValAvg.get() + avgAnnualTempDeltaPerCity); }) double result = countryValAvg.get() / countriesMap.size();, you can simply use double result = countriesMap.values().stream().mapToDouble(cityMap -> /* expression used to initialize avgAnnualTempDeltaPerCity */).average(); - Holger
