14.4 Collectors In Depth

The Collectors class supplies ready-made Collector implementations for the collect() terminal operation. Beyond the basic toList() and joining(), it covers grouping, partitioning, conversion to a Map, and summary statistics.

1. Basic Collectors Review

import java.util.*;
import java.util.stream.*;

List<String> fruits = List.of("apple", "banana", "cherry", "apricot", "blueberry");

// Collectors.toList() vs. the shorter Stream.toList() (Java 16+)
List<String> list1 = fruits.stream().collect(Collectors.toList()); // no guarantee on mutability; use toCollection(ArrayList::new) if you need one
List<String> list2 = fruits.stream().toList(); // unmodifiable (Java 16+)

// toSet()
Set<String> set = fruits.stream().collect(Collectors.toSet());

// joining - concatenate into a string
String joined = fruits.stream().collect(Collectors.joining(", "));
// "apple, banana, cherry, apricot, blueberry"

String withBrackets = fruits.stream()
        .collect(Collectors.joining(", ", "[", "]"));
// "[apple, banana, cherry, apricot, blueberry]"
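
The practical difference between the two list-producing forms is whether the result can be modified afterwards. The following is a minimal sketch of that difference; the class name ToListDemo and the helper rejectsAdd are made up for illustration and are not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ToListDemo {
    // Hypothetical helper: true if adding to the list throws UnsupportedOperationException.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "banana", "cherry");

        // Stream.toList() (Java 16+) returns an unmodifiable list.
        List<String> fixed = fruits.stream().toList();
        System.out.println(rejectsAdd(fixed)); // true

        // toCollection names the implementation, so the result is a plain ArrayList.
        List<String> growable = fruits.stream()
                .collect(Collectors.toCollection(ArrayList::new));
        System.out.println(rejectsAdd(growable)); // false
    }
}
```

If later code needs to add or remove elements, naming the implementation with toCollection is the safer choice, since Collectors.toList() does not promise a particular list type.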

2. groupingBy — Grouping

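groupingBy collects elements into a Map keyed by the result of a classifier function. By default, each value is a List of the elements that share that key.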

record Person(String name, int age, String city, double salary) {}

List<Person> people = List.of(
        new Person("Alice", 28, "Seoul", 5000),
        new Person("Bob", 35, "Busan", 6000),
        new Person("Charlie", 28, "Seoul", 7000),
        new Person("Dave", 35, "Seoul", 4500),
        new Person("Eve", 22, "Daegu", 3500)
);

// Group by city
Map<String, List<Person>> byCity = people.stream()
        .collect(Collectors.groupingBy(Person::city));
// e.g. {Seoul=[Alice, Charlie, Dave], Busan=[Bob], Daegu=[Eve]}
// (names shown for brevity; the key order of the returned Map is not guaranteed)

byCity.forEach((city, persons) -> {
    System.out.println(city + ": " + persons.stream()
            .map(Person::name).collect(Collectors.joining(", ")));
});

// Group by age + extract names only (downstream collector)
Map<Integer, List<String>> namesByAge = people.stream()
        .collect(Collectors.groupingBy(
                Person::age,                                           // classification key
                Collectors.mapping(Person::name, Collectors.toList())  // downstream
        ));
System.out.println(namesByAge);
// {22=[Eve], 28=[Alice, Charlie], 35=[Bob, Dave]}

// Count per group
Map<String, Long> countByCity = people.stream()
        .collect(Collectors.groupingBy(Person::city, Collectors.counting()));
System.out.println(countByCity); // e.g. {Seoul=3, Busan=1, Daegu=1} (key order not guaranteed)

// Average salary per group
Map<String, Double> avgSalaryByCity = people.stream()
        .collect(Collectors.groupingBy(Person::city,
                Collectors.averagingDouble(Person::salary)));
System.out.println(avgSalaryByCity); // e.g. {Seoul=5500.0, Busan=6000.0, Daegu=3500.0}
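
Downstream collectors can also be wrapped to post-process each group's result. As a hedged sketch, the example below finds the highest-paid person per city by combining maxBy with collectingAndThen; the class name TopEarnerDemo and the "(none)" placeholder are illustrative assumptions, not part of the original example.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TopEarnerDemo {
    record Person(String name, int age, String city, double salary) {}

    // Maps each city to the name of its highest-paid person.
    static Map<String, String> topEarnerByCity(List<Person> people) {
        return people.stream().collect(Collectors.groupingBy(
                Person::city,
                Collectors.collectingAndThen(
                        // maxBy yields an Optional<Person> for each group
                        Collectors.maxBy(Comparator.comparingDouble(Person::salary)),
                        // unwrap it; groups created by groupingBy are never empty,
                        // so the placeholder is only a formality
                        best -> best.map(Person::name).orElse("(none)"))));
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Alice", 28, "Seoul", 5000),
                new Person("Bob", 35, "Busan", 6000),
                new Person("Charlie", 28, "Seoul", 7000),
                new Person("Dave", 35, "Seoul", 4500),
                new Person("Eve", 22, "Daegu", 3500));

        Map<String, String> top = topEarnerByCity(people);
        System.out.println(top.get("Seoul")); // Charlie
        System.out.println(top.get("Busan")); // Bob
    }
}
```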

Multi-level Grouping

// Group by city, then by age (nested grouping)
Map<String, Map<Integer, List<Person>>> byCityAndAge = people.stream()
        .collect(Collectors.groupingBy(Person::city,
                Collectors.groupingBy(Person::age)));

byCityAndAge.forEach((city, ageMap) -> {
    System.out.println("=== " + city + " ===");
    ageMap.forEach((age, persons) ->
            System.out.println(" Age " + age + ": " + persons.stream()
                    .map(Person::name).collect(Collectors.joining(", "))));
});

3. partitioningBy — Splitting into Two Groups

Classifies elements into two groups (true/false) based on a boolean condition.

// Separate high earners (salary >= 5000) from low earners
Map<Boolean, List<Person>> partitioned = people.stream()
        .collect(Collectors.partitioningBy(p -> p.salary() >= 5000));

System.out.println("High earners: " + partitioned.get(true).stream()
        .map(Person::name).collect(Collectors.joining(", ")));
// High earners: Alice, Bob, Charlie

System.out.println("Low earners: " + partitioned.get(false).stream()
        .map(Person::name).collect(Collectors.joining(", ")));
// Low earners: Dave, Eve

// Partition numbers (primes vs composites)
List<Integer> nums = IntStream.rangeClosed(2, 20).boxed().toList();
Map<Boolean, List<Integer>> primes = nums.stream()
        .collect(Collectors.partitioningBy(n -> {
            // trial division up to sqrt(n)
            for (int i = 2; i * i <= n; i++) {
                if (n % i == 0) return false;
            }
            return true;
        }));
System.out.println("Primes: " + primes.get(true));
// Primes: [2, 3, 5, 7, 11, 13, 17, 19]
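
Like groupingBy, partitioningBy accepts a downstream collector. One difference worth seeing in code is that the resulting map always contains both the true and the false key, even when one side is empty. The sketch below illustrates this with a made-up helper, countByParity, in a made-up class PartitionDemo.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    // Counts even (true) and odd (false) numbers in a single pass.
    static Map<Boolean, Long> countByParity(List<Integer> nums) {
        return nums.stream().collect(
                Collectors.partitioningBy(n -> n % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Boolean, Long> counts = countByParity(List.of(1, 3, 5));
        // No even numbers, yet the true key is still present with a count of 0.
        System.out.println(counts.get(true));  // 0
        System.out.println(counts.get(false)); // 3
    }
}
```

With groupingBy and the same classifier, the true key would simply be missing from the result, so code reading it would have to handle null.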

4. toMap — Converting to a Map

// toMap(keyMapper, valueMapper)
// Throws IllegalStateException on a duplicate key; names are unique here.
Map<String, Double> nameSalaryMap = people.stream()
        .collect(Collectors.toMap(
                Person::name,   // key
                Person::salary  // value
        ));
// {Alice=5000.0, Bob=6000.0, ...}

// Handling duplicate keys with a merge function
// The merge function combines two values at a time, so summing works,
// but (a + b) / 2 would not give a true average for three or more values.
Map<String, Double> totalSalaryByCity = people.stream()
        .collect(Collectors.toMap(
                Person::city,
                Person::salary,
                Double::sum  // add up salaries that share a city
        ));
// Seoul=16500.0, Busan=6000.0, Daegu=3500.0

// Preserve insertion order (collect into LinkedHashMap)
Map<String, Double> orderedMap = people.stream()
        .collect(Collectors.toMap(
                Person::name,
                Person::salary,
                (a, b) -> a,        // keep first value on conflict
                LinkedHashMap::new  // specify Map implementation
        ));
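
Two toMap behaviors are easy to get wrong: the two-argument form fails on duplicate keys, and a merge function only ever sees two values at a time, so it cannot compute a running average. The sketch below shows both, using a simplified three-field Person record and a class name, ToMapDemo, that are assumptions made for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class ToMapDemo {
    // Simplified record for this sketch (no age field).
    record Person(String name, String city, double salary) {}

    static final List<Person> PEOPLE = List.of(
            new Person("Alice", "Seoul", 5000),
            new Person("Charlie", "Seoul", 7000),
            new Person("Dave", "Seoul", 4500));

    // Without a merge function, a repeated key raises IllegalStateException.
    static boolean failsOnDuplicateKey() {
        try {
            PEOPLE.stream().collect(Collectors.toMap(Person::city, Person::salary));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Pairwise averaging: ((5000 + 7000) / 2 + 4500) / 2, which is not the mean.
    static double pairwiseAverage() {
        return PEOPLE.stream()
                .collect(Collectors.toMap(Person::city, Person::salary,
                        (a, b) -> (a + b) / 2))
                .get("Seoul");
    }

    // groupingBy + averagingDouble computes the actual mean per key.
    static double trueAverage() {
        return PEOPLE.stream()
                .collect(Collectors.groupingBy(Person::city,
                        Collectors.averagingDouble(Person::salary)))
                .get("Seoul");
    }

    public static void main(String[] args) {
        System.out.println(failsOnDuplicateKey()); // true
        System.out.println(pairwiseAverage());     // 5250.0
        System.out.println(trueAverage());         // 5500.0
    }
}
```

For per-key averages, groupingBy with averagingDouble is therefore the more reliable tool; a toMap merge function fits associative operations such as sum, min, or max.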

5. Statistics Collectors

// summarizingDouble - get multiple statistics in one pass
DoubleSummaryStatistics stats = people.stream()
        .collect(Collectors.summarizingDouble(Person::salary));

System.out.println("Count: " + stats.getCount()); // 5
System.out.println("Sum: " + stats.getSum()); // 26000.0
System.out.println("Average: " + stats.getAverage()); // 5200.0
System.out.println("Min: " + stats.getMin()); // 3500.0
System.out.println("Max: " + stats.getMax()); // 7000.0

// Statistics per group
Map<String, DoubleSummaryStatistics> statsByCity = people.stream()
        .collect(Collectors.groupingBy(Person::city,
                Collectors.summarizingDouble(Person::salary)));
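
A per-group statistics map is read like any other map: look up the key, then call the getters on the DoubleSummaryStatistics value. The following sketch assumes a simplified Person record and an illustrative class name, StatsByCityDemo; a TreeMap is used only so that iteration order is predictable.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class StatsByCityDemo {
    // Simplified record for this sketch.
    record Person(String name, String city, double salary) {}

    // One pass over the data yields count, sum, min, max and average per city.
    static Map<String, DoubleSummaryStatistics> statsByCity(List<Person> people) {
        return people.stream().collect(Collectors.groupingBy(
                Person::city,
                TreeMap::new,
                Collectors.summarizingDouble(Person::salary)));
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Alice", "Seoul", 5000),
                new Person("Bob", "Busan", 6000),
                new Person("Charlie", "Seoul", 7000),
                new Person("Dave", "Seoul", 4500));

        DoubleSummaryStatistics seoul = statsByCity(people).get("Seoul");
        System.out.println(seoul.getCount());   // 3
        System.out.println(seoul.getMin());     // 4500.0
        System.out.println(seoul.getMax());     // 7000.0
        System.out.println(seoul.getAverage()); // 5500.0
    }
}
```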

6. Custom Collector

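When no built-in collector fits, Collector.of lets you assemble your own from a supplier, an accumulator, a combiner, and optional characteristics (an advanced feature).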

// Collect a word list into a frequency map
List<String> words = List.of("hello", "world", "hello", "java", "world", "hello");

// Option 1: built-in groupingBy + counting (no custom collector needed)
Map<String, Long> wordCount = words.stream()
        .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
System.out.println(wordCount); // e.g. {hello=3, world=2, java=1} (key order not guaranteed)

// Option 2: Collector.of for a custom collector
Collector<String, Map<String, Integer>, Map<String, Integer>> wordFreqCollector =
        Collector.of(
                HashMap::new,                                     // supplier
                (map, word) -> map.merge(word, 1, Integer::sum),  // accumulator
                (m1, m2) -> {                                     // combiner: add counts, never overwrite
                    m2.forEach((w, n) -> m1.merge(w, n, Integer::sum));
                    return m1;
                },
                Collector.Characteristics.IDENTITY_FINISH         // characteristics
        );

Map<String, Integer> freq = words.stream().collect(wordFreqCollector);
System.out.println(freq); // e.g. {hello=3, world=2, java=1} (key order not guaranteed)
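
The combiner only runs when a stream is processed in parallel, which is why a faulty one can go unnoticed in sequential tests. As a hedged sketch, the example below runs a word-frequency collector on a parallel stream; the combiner adds the counts of the two partial maps, whereas a putAll-based combiner would overwrite one side's counts. The class and method names are illustrative.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;

class WordFreqDemo {
    static Collector<String, Map<String, Integer>, Map<String, Integer>> wordFreq() {
        return Collector.of(
                HashMap::new,                                     // one container per chunk
                (map, word) -> map.merge(word, 1, Integer::sum),  // count a single word
                (left, right) -> {                                // merge two partial results
                    right.forEach((w, n) -> left.merge(w, n, Integer::sum));
                    return left;
                },
                Collector.Characteristics.IDENTITY_FINISH);
    }

    // Uses a parallel stream so that the combiner may actually be invoked.
    static Map<String, Integer> count(List<String> words) {
        return words.parallelStream().collect(wordFreq());
    }

    public static void main(String[] args) {
        Map<String, Integer> freq =
                count(List.of("hello", "world", "hello", "java", "world", "hello"));
        System.out.println(freq.get("hello")); // 3
        System.out.println(freq.get("world")); // 2
        System.out.println(freq.get("java"));  // 1
    }
}
```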

Pro Tips

Getting a sorted Map from groupingBy:

// TreeMap automatically sorts keys
Map<String, Long> sorted = people.stream()
        .collect(Collectors.groupingBy(
                Person::city,
                TreeMap::new,  // specify Map implementation
                Collectors.counting()
        ));
// {Busan=1, Daegu=1, Seoul=3} (alphabetical order)
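
The same map-factory argument can keep groups in the order their keys first appear, rather than sorted. A minimal sketch, assuming a plain list of city names and an illustrative class name, EncounterOrderDemo:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class EncounterOrderDemo {
    // LinkedHashMap remembers the order in which keys were first inserted.
    static Map<String, Long> countInEncounterOrder(List<String> cities) {
        return cities.stream().collect(Collectors.groupingBy(
                Function.identity(),
                LinkedHashMap::new,
                Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> cities = List.of("Seoul", "Busan", "Seoul", "Seoul", "Daegu");
        System.out.println(countInEncounterOrder(cities));
        // {Seoul=3, Busan=1, Daegu=1}
    }
}
```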

Collectors.teeing (Java 12+): processes a stream with two collectors simultaneously and merges the results.

// Get min and max in a single pass
record MinMax(double min, double max) {}

MinMax result = people.stream()
        .collect(Collectors.teeing(
                Collectors.minBy(Comparator.comparingDouble(Person::salary)),
                Collectors.maxBy(Comparator.comparingDouble(Person::salary)),
                (min, max) -> new MinMax(
                        min.map(Person::salary).orElse(0.0),
                        max.map(Person::salary).orElse(0.0)
                )
        ));
System.out.println(result); // MinMax[min=3500.0, max=7000.0]
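
teeing is not limited to pairing two collectors of the same kind. As a hedged sketch, the example below pairs counting() with a filtered count (Collectors.filtering, Java 9+) to compute the share of people at or above a salary threshold in one pass. The class name, method name, and threshold parameter are assumptions made for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class TeeingDemo {
    record Person(String name, int age, String city, double salary) {}

    // Fraction of people whose salary is at least the threshold.
    // For an empty list this is 0.0 / 0, which evaluates to NaN.
    static double highEarnerShare(List<Person> people, double threshold) {
        return people.stream().collect(Collectors.teeing(
                Collectors.counting(),                                      // total head count
                Collectors.filtering((Person p) -> p.salary() >= threshold,
                        Collectors.counting()),                             // matching head count
                (total, high) -> high.doubleValue() / total));
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Alice", 28, "Seoul", 5000),
                new Person("Bob", 35, "Busan", 6000),
                new Person("Charlie", 28, "Seoul", 7000),
                new Person("Dave", 35, "Seoul", 4500),
                new Person("Eve", 22, "Daegu", 3500));

        System.out.println(highEarnerShare(people, 5000)); // 0.6
    }
}
```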