Streams API

A stream is a lazy pipeline of operations on a sequence of elements — it describes what to do with data, not how to iterate through it.

What Problem Does It Solve?

Processing a list with classic for loops mixes the iteration logic with the business logic. Consider filtering, transforming, and collecting names in one operation:

// Imperative style — for loop mixes plumbing with intent
List<String> result = new ArrayList<>();
for (String name : names) {
    if (name.startsWith("A")) {
        result.add(name.toUpperCase());
    }
}

This is verbose, and it gets worse as operations compose. The Streams API lets you express the same intent declaratively:

// Declarative style — pipeline reads like a sentence
List<String> result = names.stream()
    .filter(name -> name.startsWith("A"))
    .map(String::toUpperCase)
    .collect(Collectors.toList());

Beyond clarity, streams also unlock lazy evaluation (skip work if not needed) and transparent parallelism — neither is straightforward with imperative loops.
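As a rough illustration of the parallelism point, the sketch below runs the same pipeline sequentially or in parallel by toggling a single call. The class name `ParallelToggle` and the method `countMultiplesOfThree` are hypothetical names chosen for this example. Whether parallel execution is actually faster depends on the data size and the work per element.

```java
import java.util.stream.IntStream;

final class ParallelToggle {
    // Counts the multiples of three in [1, upTo], optionally in parallel.
    static long countMultiplesOfThree(int upTo, boolean parallel) {
        IntStream numbers = IntStream.rangeClosed(1, upTo);
        if (parallel) {
            numbers = numbers.parallel(); // the only change needed to go parallel
        }
        return numbers.filter(n -> n % 3 == 0).count();
    }

    public static void main(String[] args) {
        System.out.println(countMultiplesOfThree(1000, false)); // 333
        System.out.println(countMultiplesOfThree(1000, true));  // 333
    }
}
```

The pipeline body is identical in both modes; only the stream's execution mode differs.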

What Is It?

A stream is a view over a data source that supports sequential or parallel aggregate operations. It is not a data structure — it doesn't store elements. Processing happens only when a terminal operation is invoked, and a stream can only be consumed once.

Every stream pipeline has three parts:

Source — where elements come from
Intermediate operations — lazy transformations (zero or more)
Terminal operation — triggers evaluation and produces a result (exactly one)
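To make the three parts concrete, here is a minimal sketch that labels each one in a small pipeline. The class name `PipelineParts`, the method `longWordLengths`, and the sample words are assumptions made for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

final class PipelineParts {
    // Returns the lengths of the words that have more than three characters.
    static List<Integer> longWordLengths(List<String> words) {
        return words.stream()                  // source: the list
            .filter(w -> w.length() > 3)       // intermediate operation (lazy)
            .map(String::length)               // intermediate operation (lazy)
            .collect(Collectors.toList());     // terminal operation: runs the pipeline
    }

    public static void main(String[] args) {
        System.out.println(longWordLengths(List.of("tree", "sky", "river"))); // [4, 5]
    }
}
```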

How It Works

Pipeline Anatomy

In a stream pipeline, intermediate operations build up a chain of lazy transformations, and the terminal operation triggers execution of the entire chain.

Lazy Evaluation

Intermediate operations are lazy — they return a new stream description, not computed data. Nothing runs until the terminal operation is called.

Stream<String> pipeline = names.stream()
    .filter(s -> {
        System.out.println("filtering: " + s); // not printed until the terminal operation runs
        return s.startsWith("A");
    })
    .map(String::toUpperCase);

// At this point: nothing has run yet
System.out.println("Before terminal");

List<String> result = pipeline.collect(Collectors.toList());
// Now filtering and mapping run

Because elements are pulled through the pipeline one at a time, placing limit(n) before a filter means the filter never runs on elements beyond the first n.
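One way to observe this is to count how often the filter predicate runs when limit(n) comes first. The sketch below assumes a sequential stream; `LimitBeforeFilter` and `predicateCalls` are hypothetical names, and the counter is a side effect used only for demonstration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

final class LimitBeforeFilter {
    // Returns how many times the filter predicate was invoked.
    static int predicateCalls(List<Integer> source, int n) {
        AtomicInteger calls = new AtomicInteger();
        source.stream()
            .limit(n)                        // only the first n elements flow downstream
            .filter(x -> {
                calls.incrementAndGet();     // counting side effect, for demonstration only
                return x % 2 == 0;
            })
            .collect(Collectors.toList());   // terminal operation triggers the work
        return calls.get();
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(predicateCalls(data, 3)); // 3
    }
}
```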

Short-Circuiting Operations

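Some operations stop the pipeline early. Most are terminal operations; limit(n) is intermediate:

findFirst(): stops after the first element that reaches it
anyMatch(): stops as soon as one element satisfies the predicate
allMatch(): stops as soon as one element fails the predicate
noneMatch(): stops as soon as one element satisfies the predicate (the result is then false)
limit(n): intermediate rather than terminal; stops after n elements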

boolean hasAdult = people.stream()
    .filter(p -> p.getAge() >= 18)
    .anyMatch(p -> p.hasLicense()); // ← stops after first adult with license
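Short-circuiting is also what makes it possible to query an infinite stream. The following sketch counts how many elements anyMatch inspects before it stops. It assumes a sequential stream; `ShortCircuitDemo` and `inspectedUntilMatch` are hypothetical names, and peek is used here purely as a demonstration counter.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

final class ShortCircuitDemo {
    // Returns the number of elements pulled from an infinite stream
    // before anyMatch finds the target (target must be >= 1).
    static int inspectedUntilMatch(int target) {
        AtomicInteger inspected = new AtomicInteger();
        boolean found = Stream.iterate(1, n -> n + 1)   // infinite: 1, 2, 3, ...
            .peek(n -> inspected.incrementAndGet())     // demonstration counter only
            .anyMatch(n -> n == target);                // stops at the first match
        return found ? inspected.get() : -1;
    }

    public static void main(String[] args) {
        System.out.println(inspectedUntilMatch(5)); // 5
    }
}
```

Without the short-circuit, a terminal operation such as count() on the same source would never return.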

Stateless vs. Stateful Intermediate Operations

| Category | Operations | Notes |
|---|---|---|
| Stateless | `filter`, `map`, `flatMap`, `peek` | Process each element independently; safe for parallel |
| Stateful | `sorted`, `distinct`, `limit`, `skip` | Need to see multiple/all elements; may block parallelism |
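The difference shows up in the order in which elements move through the pipeline. The sketch below records an event before and after an optional sorted step. It assumes a sequential stream; `StatefulBarrier` and `trace` are hypothetical names, and the peek calls exist only to record the trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class StatefulBarrier {
    // Records when each element enters and leaves the middle of the pipeline.
    static List<String> trace(List<String> input, boolean withSort) {
        List<String> events = new ArrayList<>();
        Stream<String> upstream = input.stream()
            .peek(x -> events.add("in:" + x));
        // sorted() must buffer every element before it can emit the first one.
        Stream<String> middle = withSort ? upstream.sorted() : upstream;
        middle.peek(x -> events.add("out:" + x))
            .collect(Collectors.toList());
        return events;
    }

    public static void main(String[] args) {
        System.out.println(trace(List.of("b", "a"), false)); // [in:b, out:b, in:a, out:a]
        System.out.println(trace(List.of("b", "a"), true));  // [in:b, in:a, out:a, out:b]
    }
}
```

With only stateless operations each element travels the whole pipeline before the next one starts; with sorted in the middle, all elements are consumed first.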

Stream Sources

// From Collection
Stream<String> s1 = list.stream();
Stream<String> s2 = list.parallelStream();

// From array
Stream<Integer> s3 = Arrays.stream(new Integer[]{1, 2, 3});

// From values
Stream<String> s4 = Stream.of("a", "b", "c");

// Infinite stream — always use limit()!
Stream<Integer> naturals = Stream.iterate(1, n -> n + 1);
Stream<Double>  randoms  = Stream.generate(Math::random);

// Primitive streams — no boxing overhead
IntStream range    = IntStream.range(0, 10);      // [0, 9]
IntStream rangeClosed = IntStream.rangeClosed(0, 10); // [0, 10]
LongStream longs  = LongStream.of(1L, 2L, 3L);

Code Examples

Basic Pipeline

List<String> names = List.of("Alice", "Bob", "Anna", "Charlie", "Amy");

List<String> result = names.stream()
    .filter(n -> n.startsWith("A"))       // keeps: Alice, Anna, Amy
    .map(String::toLowerCase)              // alice, anna, amy
    .sorted()                              // stateful: buffers all, then sorts
    .collect(Collectors.toList());         // terminal: materializes the list
// result = [alice, amy, anna]

`flatMap` — Flattening Nested Structures

List<List<String>> nestedNames = List.of(
    List.of("Alice", "Bob"),
    List.of("Charlie", "Dave")
);

List<String> flat = nestedNames.stream()
    .flatMap(Collection::stream)     // ← flattens Stream<List<String>> into Stream<String>
    .collect(Collectors.toList());   // [Alice, Bob, Charlie, Dave]

`reduce` — Accumulate to a Single Value

List<Integer> numbers = List.of(1, 2, 3, 4, 5);

int sum = numbers.stream()
    .reduce(0, (acc, n) -> acc + n); // identity = 0, accumulator adds each element → 15

// Or using method reference
int product = numbers.stream()
    .reduce(1, Math::multiplyExact);  // → 120

Primitive Streams — Statistics

int[] scores = {87, 92, 78, 95, 88};

IntSummaryStatistics stats = Arrays.stream(scores)
    .summaryStatistics();

stats.getMin();     // 78
stats.getMax();     // 95
stats.getAverage(); // 88.0
stats.getSum();     // 440
stats.getCount();   // 5

Chaining `peek` for Debugging

List<String> result = names.stream()
    .peek(n -> System.out.println("before filter: " + n)) // ← debug only
    .filter(n -> n.length() > 3)
    .peek(n -> System.out.println("after filter: " + n))
    .collect(Collectors.toList());

Warning: peek is intended for debugging. It is a side-effect operation inside a lazy pipeline, so whether and when it runs depends on the terminal operation. Do not use it for production logic.

Stream Reuse — Common Mistake

Stream<String> stream = names.stream().filter(n -> n.startsWith("A"));

List<String> first = stream.collect(Collectors.toList());  // ← OK
List<String> second = stream.collect(Collectors.toList()); // ← throws IllegalStateException: stream has already been operated upon

Each terminal operation exhausts the stream — create a new one for each use.
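One common way to get a fresh stream for each use is to wrap the pipeline in a Supplier. This is a sketch of that idea rather than the only approach; `FreshStreams`, `summarize`, and `secondUseFails` are hypothetical names.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FreshStreams {
    // Each call to get() builds a new stream, so two terminal operations are fine.
    static String summarize(List<String> names) {
        Supplier<Stream<String>> startsWithA =
            () -> names.stream().filter(n -> n.startsWith("A"));
        long count = startsWithA.get().count();
        List<String> matches = startsWithA.get().collect(Collectors.toList());
        return count + " " + matches;
    }

    // Shows that a second terminal operation on the same instance is rejected.
    static boolean secondUseFails(List<String> names) {
        Stream<String> stream = names.stream();
        stream.count();
        try {
            stream.count();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Anna");
        System.out.println(summarize(names));      // 2 [Alice, Anna]
        System.out.println(secondUseFails(names)); // true
    }
}
```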

`Optional` from Stream Terminal Ops

Optional<String> first = names.stream()
    .filter(n -> n.startsWith("Z"))
    .findFirst(); // ← returns Optional.empty() if no match

first.ifPresent(System.out::println); // no-op if empty
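When a fallback value makes sense, orElse avoids handling the Optional separately. A minimal sketch, where `FirstMatch`, `firstStartingWith`, and the placeholder string "(none)" are assumptions for illustration:

```java
import java.util.List;

final class FirstMatch {
    // Returns the first name with the given prefix, or a placeholder if none matches.
    static String firstStartingWith(List<String> names, String prefix) {
        return names.stream()
            .filter(n -> n.startsWith(prefix))
            .findFirst()          // Optional<String>
            .orElse("(none)");    // fallback when the Optional is empty
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Anna");
        System.out.println(firstStartingWith(names, "B")); // Bob
        System.out.println(firstStartingWith(names, "Z")); // (none)
    }
}
```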

Best Practices

Prefer method references over lambdas in stream pipelines — String::toLowerCase is more readable than s -> s.toLowerCase() in a long pipeline.
Use primitive streams (IntStream, LongStream, DoubleStream) when elements are primitives — avoids boxing/unboxing overhead in hot paths.
Don't use peek in production — use it only for debugging during development.
Keep pipelines short — if a single pipeline exceeds 5–6 chained operations, extract parts into named methods or intermediate variables for readability and debuggability.
Prefer collect(Collectors.toList()) or Stream.toList() (Java 16+) over manually adding to a list inside forEach.
Never modify the source collection inside a stream pipeline. For non-concurrent sources this violates the non-interference requirement and can throw ConcurrentModificationException or produce incorrect results.
Use limit with infinite streams — always pair Stream.iterate or Stream.generate with a limit or takeWhile (Java 9+) to prevent infinite loops.
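The last practice can be sketched with both bounding styles: limit bounds by count, takeWhile (Java 9+) bounds by a condition. `BoundedInfinite` and its two methods are hypothetical names for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class BoundedInfinite {
    // Bounds the infinite stream by element count.
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, n -> n * 2)
            .limit(count)
            .collect(Collectors.toList());
    }

    // Bounds the infinite stream by a condition (requires Java 9+).
    static List<Integer> powersOfTwoBelow(int bound) {
        return Stream.iterate(1, n -> n * 2)
            .takeWhile(n -> n < bound)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5));  // [1, 2, 4, 8, 16]
        System.out.println(powersOfTwoBelow(20)); // [1, 2, 4, 8, 16]
    }
}
```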

Common Pitfalls

1. Reusing a consumed stream. A stream can only be traversed once. Calling a second terminal operation on the same stream instance throws IllegalStateException. Always create a fresh stream from the source.

2. Expecting lazy operations to run without a terminal operation. Adding filter and map without a terminal operation does nothing. Beginners often write a pipeline, run it, and wonder why nothing printed: there is no forEach or collect at the end.

3. Misusing peek for side effects in production. peek is a debugging hook. In a parallel stream its execution order is undefined, and the implementation may skip it entirely when the result can be computed without traversing the elements (for example, count() on a sized source since Java 9). Using it to insert into a database or audit log can silently produce inconsistent results.

4. Using sorted on a large parallel stream. sorted is stateful: it must buffer all elements before passing any downstream. In a parallel pipeline this acts as a full barrier and adds merge overhead, which can cancel out much of the benefit of parallelism. For large data sets, consider sorting at the source, for example with a database ORDER BY.

5. Ignoring Optional from findFirst/findAny. These return Optional<T>, not T. Calling .get() on an empty Optional throws NoSuchElementException. Prefer .orElse, .orElseGet, or .ifPresent.

6. Stream.toList() (Java 16+) vs Collectors.toList(). Stream.toList() returns an unmodifiable list. Collectors.toList() makes no guarantee about the type or mutability of the returned list, although current JDKs return a mutable ArrayList. If you need a guaranteed mutable list, use Collectors.toCollection(ArrayList::new).
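The difference in pitfall 6 is easy to check by trying to add an element. The sketch below requires Java 16+ for Stream.toList(); `ToListVariants` and `rejectsAdd` are hypothetical names. The mutable result from Collectors.toList() reflects what current JDKs do (an ArrayList), which the specification does not guarantee.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ToListVariants {
    // Returns true if the list refuses structural modification.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> viaStream = Stream.of("a", "b").toList();                        // Java 16+
        List<String> viaCollector = Stream.of("a", "b").collect(Collectors.toList());
        System.out.println(rejectsAdd(viaStream));    // true: the list is unmodifiable
        System.out.println(rejectsAdd(viaCollector)); // false on current JDKs, not guaranteed by the spec
    }
}
```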

Interview Questions

Beginner

Q: What is the difference between an intermediate and a terminal operation? A: Intermediate operations (like filter, map, sorted) are lazy — they return a new stream and do not process any data. Terminal operations (like collect, forEach, count) trigger the actual execution of the entire pipeline and produce a result or side effect.

Q: Can you reuse a stream after calling a terminal operation? A: No. Once a terminal operation is called, the stream is exhausted. A second terminal operation on the same stream throws IllegalStateException. Create a new stream from the source for each use.

Intermediate

Q: What does "lazy evaluation" mean in the context of streams? A: Intermediate operations don't execute until a terminal operation is invoked. The stream implementation can then do only the work the result requires: for example, a filter followed by limit(5) stops pulling from the source once 5 matches have been found.

Q: What is the difference between map and flatMap? A: map applies a function to each element, producing one output per input — the result is a Stream<R> where R can be any type, including collections. flatMap applies a function that returns a stream and then flattens all the inner streams into one. Use flatMap when your mapping function produces a Stream (or collection) and you want a flat result.
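A side-by-side sketch of the two operations on the same input may help when answering this question. `MapVsFlatMap`, `wordsPerLine`, and `allWords` are hypothetical names, and splitting on a single space is a simplification for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class MapVsFlatMap {
    // map: one output per input, so the result stays nested.
    static List<List<String>> wordsPerLine(List<String> lines) {
        return lines.stream()
            .map(line -> Arrays.asList(line.split(" ")))
            .collect(Collectors.toList());
    }

    // flatMap: each input yields a stream, and the streams are merged into one.
    static List<String> allWords(List<String> lines) {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.split(" ")))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a b", "c");
        System.out.println(wordsPerLine(lines)); // [[a, b], [c]]
        System.out.println(allWords(lines));     // [a, b, c]
    }
}
```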

Advanced

Q: When should you use a primitive stream vs. Stream<T> for Integer values? A: When working with a large number of primitive values (int, long, double), use IntStream, LongStream, or DoubleStream. Stream<Integer> boxes every primitive into an Integer object, creating GC pressure. Primitive streams also provide built-in methods like sum(), average(), and summaryStatistics() that Stream<Integer> does not.

Follow-up: How do you convert between Stream<Integer> and IntStream? A: Use mapToInt(Integer::intValue) to go from Stream<Integer> to IntStream, and boxed() to go from IntStream back to Stream<Integer>.
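Both conversions can be sketched in a few lines. `BoxingRoundTrip`, `sumOfSquares`, and `firstN` are hypothetical names used only for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class BoxingRoundTrip {
    // Stream<Integer> -> IntStream: unbox once, then work on primitives.
    static int sumOfSquares(List<Integer> values) {
        return values.stream()
            .mapToInt(Integer::intValue)
            .map(v -> v * v)      // IntStream.map operates on int, no boxing
            .sum();
    }

    // IntStream -> Stream<Integer>: box when a collection of objects is needed.
    static List<Integer> firstN(int n) {
        return IntStream.rangeClosed(1, n)
            .boxed()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(List.of(1, 2, 3))); // 14
        System.out.println(firstN(3));                      // [1, 2, 3]
    }
}
```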
