From c9c47d309328b40fd19f33349bbec54cf1a69838 Mon Sep 17 00:00:00 2001 From: martinfrancois Date: Sun, 28 Jun 2026 20:28:52 +0200 Subject: [PATCH 1/3] refactor(skill): move generic functional style ownership out of streams Keep java-streams focused on stream and collector semantics while documenting java-functional-style as the owner for generic lambda, identity-function, supplier-laziness, no-op callback stage, and callback readability guidance. Existing stream eval criteria are left unchanged so the composed setup can be compared against the frozen baseline before any eval cleanup. Co-Authored-By: marvinbuff Co-Authored-By: PReimers --- README.md | 19 +++ docs/agents/ownership-boundaries.md | 42 ++++++ docs/agents/skill-behavior.md | 7 +- skills/java-streams/SKILL.md | 22 ++-- skills/java-streams/references/hard-stops.md | 27 ++-- .../references/java-stream-api.md | 12 +- .../references/stream-examples.md | 122 +----------------- 7 files changed, 94 insertions(+), 157 deletions(-) create mode 100644 docs/agents/ownership-boundaries.md diff --git a/README.md b/README.md index 76ab089..7266ef2 100644 --- a/README.md +++ b/README.md @@ -18,6 +18,10 @@ as a design choice rather than a default optimization. It also tells the agent to check the project Java version first. The right stream code for Java 8 may be different from the right code for Java 17, Java 21, or Java 24. +For general lambda, method-reference, identity-function, no-op functional stage, supplier-laziness, +and callback readability guidance, install the companion package +`martinfrancois/java-functional-style` together with this stream package. + ## Contents - [Getting Started](#getting-started) @@ -234,6 +238,21 @@ Good fit: - choosing Java-version-compatible APIs such as `takeWhile`, `mapMulti`, `Stream.toList()`, and gatherers. +## Ownership Boundaries + +`java-streams` owns stream and collector semantics: terminal operation choice, collector choice, +duplicate key and null behavior, encounter order, primitive streams, stream Java-version +compatibility, parallel stream behavior, gatherers, and stream-specific behavior preservation. + +`java-functional-style` owns general Java lambda and functional-interface style: identity +functions, no-op functional stages, method references, callback readability, supplier laziness, and +callback side-effect boundaries. + +`java-optionals` owns Optional semantics. + +Install `java-streams` and `java-functional-style` together for stream cleanup involving non-trivial +callbacks. + Poor fit: - broad Java style enforcement unrelated to streams or collectors; diff --git a/docs/agents/ownership-boundaries.md b/docs/agents/ownership-boundaries.md new file mode 100644 index 0000000..4ea347d --- /dev/null +++ b/docs/agents/ownership-boundaries.md @@ -0,0 +1,42 @@ +# Ownership Boundaries + +`java-streams` owns stream and collector semantics: + +- terminal operation choice +- collector choice +- duplicate key and null behavior +- encounter order +- primitive streams +- stream Java-version compatibility +- `findFirst` versus `findAny` +- `parallelStream` +- `Gatherers.mapConcurrent` +- stream-specific behavior preservation + +`java-functional-style` owns general Java lambda and functional-interface style: + +- method references +- identity functions +- no-op functional stages +- callback readability and helper extraction +- supplier laziness +- callback side-effect boundaries + +`java-optionals` owns Optional semantics. + +The expected high-quality setup for stream cleanup involving non-trivial callbacks is both +`java-streams` and `java-functional-style`. + +Do not copy all generic lambda guidance back into `java-streams` to fix composition regressions. +Add only the smallest stream-side bridge instruction if hosted evidence proves it is required. + +## Composition Gate + +Before opening a PR for this split, existing stream evals must prove: + +```text +current java-streams behavior <= slimmed java-streams + java-functional-style behavior +``` + +The comparison must use existing evals unchanged and criterion-level results must be equal or +better. Local validation alone is not enough. diff --git a/docs/agents/skill-behavior.md b/docs/agents/skill-behavior.md index 7a101f7..ad3ae93 100644 --- a/docs/agents/skill-behavior.md +++ b/docs/agents/skill-behavior.md @@ -39,9 +39,9 @@ guidance, or auto-selection wording. that name. - The skill should not force streams over clear stateful loops. Stateful sequence output, checked IO, prompts, mutation-heavy code, or complex early exits can remain imperative. -- Keep stream lambdas as short glue. Prefer method references or one-expression lambdas whose body - stays on the same line as `->`, and extract named helpers for branching, loops, temporary - variables, formatting, merge rules, or nested stream chains that would continue on later lines. +- Generic lambda, method-reference, identity-function, no-op functional stage, supplier-laziness, + and callback readability guidance belongs to `java-functional-style`. Keep only stream and + collector semantics canonical here. - Runtime guidance should keep internal workflow language out of ordinary user-facing reviews. Avoid terms such as "hard stop", "marker", "scan", "checklist", and skill names unless the user asked for an explicit skill workflow, audit, or scan command. @@ -52,3 +52,4 @@ guidance, or auto-selection wording. - [README Guidance](readme.md) - [Eval Guidance](evals.md) +- [Ownership Boundaries](ownership-boundaries.md) diff --git a/skills/java-streams/SKILL.md b/skills/java-streams/SKILL.md index a655f7f..819d05e 100644 --- a/skills/java-streams/SKILL.md +++ b/skills/java-streams/SKILL.md @@ -1,7 +1,7 @@ --- name: java-streams license: MIT -description: Review Java stream performance advice, especially slow stream mappings, external collection mutation with forEach/add, and whether parallelStream is safe; clean up mutation and write or refactor Java Stream and Collector code. Avoid common stream antipatterns such as materializing just to inspect, sorting before min/max, counting for existence, nested stream collections, unsafe null sorting, multi-line lambdas, and careless findFirst/findAny changes. Use whenever writing, reviewing, or refactoring Java code that uses Java streams, collectors, stream pipelines, grouping, joining strings, first/any element lookup, sorting, limiting, distinct values, primitive totals, Optional values in streams, or parallel streams, including review prompts asking whether a lookup should use findFirst or findAny. +description: Review Java stream performance advice, especially slow stream mappings, external collection mutation with forEach/add, and whether parallelStream is safe; clean up mutation and write or refactor Java Stream and Collector code. Avoid common stream antipatterns such as materializing just to inspect, sorting before min/max, counting for existence, nested stream collections, unsafe null sorting, and careless findFirst/findAny changes. Use whenever writing, reviewing, or refactoring Java code that uses Java streams, collectors, stream pipelines, grouping, joining strings, first/any element lookup, sorting, limiting, distinct values, primitive totals, Optional values in streams, or parallel streams, including review prompts asking whether a lookup should use findFirst or findAny. --- # Java Streams Skill @@ -27,8 +27,7 @@ When the prompt asks for a named artifact such as `review.md` or a Java source f exact file. Do not answer only in chat when a file artifact is requested. 0. Check the Java baseline first. Use [java-stream-api.md](references/java-stream-api.md) for - minimum versions and fallbacks; do not emulate unavailable APIs with stateful or multi-line - lambdas. + minimum versions and fallbacks. 1. Identify the requested result and pick the matching terminal or collector: | Goal | Preferred API | @@ -80,9 +79,8 @@ exact file. Do not answer only in chat when a file artifact is requested. } ``` 3. Flatten nested sources deliberately. Use `flatMap`, `flatMap(Optional::stream)` on Java 9+, - and `mapMulti` on Java 16+ when clearer. Use helpers when nested lambdas would wrap. For subtype - primitives, filter/cast first, then call `mapToInt`/`mapToLong`/`mapToDouble` directly. For - nested collector callbacks, extract a named `Stream` helper. + and `mapMulti` on Java 16+ when clearer. For subtype primitives, filter/cast first, then call + `mapToInt`/`mapToLong`/`mapToDouble` directly. ```java // Java 9+: flatten Optional values instead of filter(Optional::isPresent).map(Optional::get) @@ -91,11 +89,9 @@ exact file. Do not answer only in chat when a file artifact is requested. 4. Choose accumulation/collectors by result semantics: use `reduce(identity, op)` for immutable non-primitives, `toMap` with merge behavior and deliberate null-key/value handling, non-null keys for `groupingBy`, `partitioningBy` for boolean splits, and flattened nested indexes when clearer. - Extract duplicate-key merge rules into named helpers when ties, nulls, or ordering need more than - a same-line expression. Carry `element + result`, never null sentinels. -5. Preserve ordering, mutability, short-circuit behavior, and lambda readability. Keep stream lambdas - as short glue or method references; use named helpers for branching, merge logic, or nested stream - work. + Preserve duplicate-key merge rules, tie handling, null contracts, and map suppliers. Carry + `element + result`, never null sentinels. +5. Preserve ordering, mutability, short-circuit behavior, and stream/collector semantics. 6. Keep imperative code when it is the clearer boundary for stateful output, checked IO, mutation-heavy logic, or complex early exits. 7. Verify changed branches for empty inputs, one element, duplicates, nulls, ordering, @@ -103,6 +99,10 @@ exact file. Do not answer only in chat when a file artifact is requested. [hard-stops.md](references/hard-stops.md), fix hits, and re-scan. In scan audits, keep hard-stop severities: required hits stay required unless explicitly acceptable. +Generic lambda, method-reference, identity-function, no-op callback stage, supplier-laziness, and +callback readability guidance belongs to the companion package `martinfrancois/java-functional-style`. +Install both packages when stream cleanup also depends on non-trivial callback style. + Review output: - Give a direct behavior-preserving decision plus one safe snippet. diff --git a/skills/java-streams/references/hard-stops.md b/skills/java-streams/references/hard-stops.md index e213847..48cebce 100644 --- a/skills/java-streams/references/hard-stops.md +++ b/skills/java-streams/references/hard-stops.md @@ -54,17 +54,6 @@ Fix these before finalizing: - `Stream.toList()` where a mutable result is required or later code mutates the list. Prefer a mutable collector; do not modernize this to `new ArrayList<>(stream.toList())` when the task or surrounding code says `Stream.toList()` is not valid. -- Multi-line stream lambdas: block lambdas (`-> { ... }`), arrows whose body starts on the next - line, lambdas that rely on more than one helperless boolean check, or lambdas where the nested - stream body continues on later lines (for example `item -> item.children().stream()` followed by - `.filter(...)`). Keep lambdas as glue: prefer method references, concise one-expression lambdas - whose body stays on the same line as `->`, or extracted named helpers. For mapping, predicates, - severity/status selection, and record construction, extract helper methods when the lambda does - non-trivial work instead of fitting it into a wrapped lambda. - -In reviews, avoid examples that put substantial work behind one lambda body, including ad-hoc helper -expressions like `-> input -> ...` or callbacks passed into custom parallel executors. If a non-trivial -decision is needed, extract to a named method and keep the stream chain as glue. - `stream().forEach(...)` or `parallelStream().forEach(...)` that mutates an external `Collection`, `Map`, array, counter, holder object, or `StringBuilder`. Make the stream produce the result directly with `toList()`, `collect(...)`, `toMap(...)`, `joining`, `sum`, or another @@ -83,18 +72,18 @@ decision is needed, extract to a named method and keep the stream chain as glue. For a version-drift audit, report these unavailable APIs and explicitly allowed markers only; do not add unrelated modernization suggestions, import cleanup, `groupingBy` null-key, or collector-safety caveats. - When giving below-baseline replacement code, do not emulate the missing API with multi-line or - stateful block lambdas. Prefer a named helper or a clear loop for Java 8 replacements such as - `takeWhile`/`dropWhile` prefixes, downstream flat-mapping, or paired min/max aggregation. - For downstream flat-mapping, do not show `flatMap(c -> c.items().stream()` with a nested - `.map(...)`/`.filter(...)` chain continuing on later lines; extract that nested stream to a named - helper and call it with a method reference. + For below-baseline replacement code, preserve stream semantics with a small loop or helper when + no equivalent stream API exists, such as `takeWhile`/`dropWhile` prefixes, downstream + flat-mapping, or paired min/max aggregation. Java 8 replacement snippets must not use Java 9+ helpers such as `Map.entry`. When one stream chain contains multiple unavailable APIs, list each unavailable API separately. Example: `flatMap(Optional::stream).toList()` on a Java 8 baseline has two version-drift hits: `Optional::stream` requires Java 9 and `Stream.toList()` requires Java 16. - Missing imports for stream APIs introduced by the rewrite, such as `Comparator`, `Map`, - `Collectors`, `Function`, `BinaryOperator`, or `Gatherers`. + `Collectors`, `BinaryOperator`, or `Gatherers`. + +Generic identity-function, no-op callback stage, supplier-laziness, and callback readability +markers belong to `java-functional-style`, not this stream hard-stop scan. ## Ordering Rules @@ -173,7 +162,7 @@ multiline mode so it catches normally formatted fluent chains. Some markers are classify legitimate uses instead of deleting them mechanically. ```bash -rg -nUP "count\\(\\)\\s*>\\s*0|collect\\([^;]+\\)\\s*\\.\\s*(?:isEmpty|size|getFirst)\\(|collect\\([^;]+\\)\\s*\\.\\s*get\\(\\s*0\\s*\\)|sorted\\([^;]*\\)\\s*\\.\\s*findFirst\\(|sorted\\(\\)\\s*\\.\\s*findFirst\\(|limit\\([^;]+\\)\\s*\\.\\s*sorted\\(|sorted\\([^;]*\\)\\s*\\.\\s*distinct\\(|sorted\\(\\)\\s*\\.\\s*distinct\\(|String\\.join\\(|filter\\(Optional::isPresent\\)\\s*\\.\\s*map\\(Optional::get\\)|parallelStream\\(|\\.parallel\\(|\\.forEach\\(|Collectors\\.toMap\\(|Collectors\\.groupingBy\\(|Comparator\\.naturalOrder\\(\\)|(?\\s*\\{|->\\s*$|->[^\\n]*\\.stream\\(\\)\\s*$" +rg -nUP "count\\(\\)\\s*>\\s*0|collect\\([^;]+\\)\\s*\\.\\s*(?:isEmpty|size|getFirst)\\(|collect\\([^;]+\\)\\s*\\.\\s*get\\(\\s*0\\s*\\)|sorted\\([^;]*\\)\\s*\\.\\s*findFirst\\(|sorted\\(\\)\\s*\\.\\s*findFirst\\(|limit\\([^;]+\\)\\s*\\.\\s*sorted\\(|sorted\\([^;]*\\)\\s*\\.\\s*distinct\\(|sorted\\(\\)\\s*\\.\\s*distinct\\(|String\\.join\\(|filter\\(Optional::isPresent\\)\\s*\\.\\s*map\\(Optional::get\\)|parallelStream\\(|\\.parallel\\(|\\.forEach\\(|Collectors\\.toMap\\(|Collectors\\.groupingBy\\(|Comparator\\.naturalOrder\\(\\)|(? ``` For each hit, decide whether it is legitimate for the project Java baseline and behavior. Fix diff --git a/skills/java-streams/references/java-stream-api.md b/skills/java-streams/references/java-stream-api.md index a4e5b1f..26199e1 100644 --- a/skills/java-streams/references/java-stream-api.md +++ b/skills/java-streams/references/java-stream-api.md @@ -65,8 +65,8 @@ existing null-key behavior deliberately instead of filtering or retaining null k ## Java 8 Fallback Guidance Use this section only when the project baseline is Java 8 or the task asks for Java 8-compatible -replacement code. Identify APIs that are unavailable and give replacements that preserve behavior -without introducing multi-line lambdas: +replacement code. Identify APIs that are unavailable and give replacements that preserve stream +behavior: For Java-version drift reviews, keep the audit scoped to unavailable APIs and explicitly allowed Java 8 stream usage. Do not add unrelated modernization advice such as `Collections.emptyList()` or @@ -78,12 +78,10 @@ general cleanup notes unless the task asks for a broader review. - `Collectors.flatMapping`: when empty downstream groups matter, use a loop or helper that creates the group before adding nested values. Pre-flatten with `flatMap(Type::helper)` before `groupingBy` only when the contract intentionally omits groups whose nested stream is empty. Do - not put a nested stream chain inside a collector lambda or `flatMap(c -> c.items().stream()` - snippet whose nested chain continues on later lines. If you show pre-flattening, use a named - helper and a Java 8-compatible holder such as `AbstractMap.SimpleImmutableEntry`, not `Map.entry`; - implement the helper as a loop or other concise method body, not as a multi-line nested stream. + not use `Map.entry` on Java 8; use a Java 8-compatible holder such as + `AbstractMap.SimpleImmutableEntry` when a holder is needed. - `Collectors.teeing`: use two clear stream passes, a simple loop, or named helper aggregation. Do - not replace it with a complex `reducing` collector whose merge lambda spans multiple lines. + not replace it with a collector that obscures the two aggregate semantics. - `mapMulti`: use `map`, `flatMap`, or a helper method for one-to-few emission on Java 8. - `Stream.toList()`: use `collect(Collectors.toList())`, and choose `Collectors.toCollection(ArrayList::new)` when mutability is required. diff --git a/skills/java-streams/references/stream-examples.md b/skills/java-streams/references/stream-examples.md index d94649c..b4f8f91 100644 --- a/skills/java-streams/references/stream-examples.md +++ b/skills/java-streams/references/stream-examples.md @@ -123,122 +123,11 @@ Pitfall: `Stream.toList()` returns an unmodifiable list. Keep `Collectors.toList mutated later. If a task says `Stream.toList()` is not valid for that mutable result, do not wrap it in `new ArrayList<>(...)`; use the mutable collector directly. -## External Mutation And Lambda Purity +## External Mutation -Keep lambdas as short glue. If a stream operation needs branching, loops, temporary variables, -formatting, a merge rule, or a nested stream chain that would continue after the lambda line, -extract that work into a named method and pass a method reference or a concise one-expression -lambda whose body stays on the same line as `->`. - -For predicate lambdas, any multi-check filter condition should be extracted to a named helper before -the stream boundary if readability is at stake: - -```java -List overdueNotices(List shipments, Clock clock) { - LocalDate today = LocalDate.now(clock); - return shipments.stream() - .filter(shipment -> isOverdue(shipment, today)) - .map(shipment -> toNotice(shipment, today)) - .toList(); -} - -private static boolean isOverdue(Shipment shipment, LocalDate today) { - return shipment.deliveredAt().isEmpty() && shipment.dueDate().isBefore(today); -} - -private static ShipmentNotice toNotice(Shipment shipment, LocalDate today) { - long daysLate = ChronoUnit.DAYS.between(shipment.dueDate(), today); - return new ShipmentNotice( - shipment.id(), - shipment.customerEmail(), - daysLate, - daysLate >= 14 ? "critical" : "late"); -} -``` - -Avoid burying derivation logic in a block lambda: - -```java -List escalations = tickets.stream() - .filter(Ticket::isOpen) - .map(ticket -> { - Duration age = Duration.between(ticket.openedAt(), now); - EscalationLevel level = levelFor(ticket.priority(), age); - return new TicketEscalation(ticket.id(), ticket.assignee(), level, age); - }) - .toList(); -``` - -Extract the logic so the stream chain stays readable and the helper can be tested directly: - -```java -List escalations = tickets.stream() - .filter(Ticket::isOpen) - .map(ticket -> escalationFor(ticket, now)) - .toList(); - -private static TicketEscalation escalationFor(Ticket ticket, Instant now) { - Duration age = Duration.between(ticket.openedAt(), now); - EscalationLevel level = levelFor(ticket.priority(), age); - return new TicketEscalation(ticket.id(), ticket.assignee(), level, age); -} -``` - -This is required even when the helper only builds one output record: temporary values and branching -inside `map(x -> { ... })` are not glue. - -Avoid wrapping a nested stream body inside a collector callback: - -```java -Map> openCaseIdsByOwner = accounts.stream() - .collect(Collectors.groupingBy( - Account::ownerTeam, - Collectors.flatMapping( - account -> account.supportCases().stream() - .filter(SupportCase::isOpen) - .map(SupportCase::caseId), - Collectors.toList()))); -``` - -Extract the nested stream so the collector reads as composition: - -```java -Map> openCaseIdsByOwner = accounts.stream() - .collect(Collectors.groupingBy( - Account::ownerTeam, - Collectors.flatMapping( - AccountCaseReports::openCaseIds, - Collectors.toList()))); - -private static Stream openCaseIds(Account account) { - return account.supportCases().stream() - .filter(SupportCase::isOpen) - .map(SupportCase::caseId); -} -``` - -The same rule applies to downstream `flatMapping` for sorted or de-duplicated nested data. Keep the -collector callback as a method reference: - -```java -Map> emailsByTrack = conferences.stream() - .flatMap(conference -> conference.sessions().stream()) - .filter(session -> session.track() != null) - .collect(Collectors.groupingBy( - Session::track, - Collectors.flatMapping( - SessionReports::optedInEmails, - Collectors.collectingAndThen( - Collectors.toCollection(TreeSet::new), - ArrayList::new)))); - -private static Stream optedInEmails(Session session) { - return session.registrations().stream() - .filter(Registration::optedIn) - .map(Registration::email) - .filter(Objects::nonNull); -} -``` +Generic lambda and callback readability belongs to `java-functional-style`. This package still owns +the stream-semantic rule that ordinary stream results should be produced by terminals and collectors +rather than external mutation. Do not build ordinary stream results by mutating external state from `forEach`: @@ -295,8 +184,7 @@ or mostly-small call paths. ``` Do not include optional parallelism snippets for this pattern unless the user explicitly requests -parallel code. If requested, avoid multiline lambda bodies and prefer method references or -single-line helpers. +parallel code. Only keep terminal `forEach` when the side effect is the operation's purpose, such as logging or calling an API, and the side effect is safe for the chosen stream mode. From 61c35e37ea5acf58c637805f06a25b1aebcd3518 Mon Sep 17 00:00:00 2001 From: martinfrancois Date: Sun, 28 Jun 2026 21:30:27 +0200 Subject: [PATCH 2/3] fix(skill): bridge stream reviews to functional style companion Tighten stream-specific review guidance for ordered findFirst replacements and external-mutation performance claims, and add a narrow bridge for applying the companion callback rules inside stream pipelines. This keeps stream semantics in java-streams while allowing composed java-streams plus java-functional-style runs to satisfy existing focused callback-shape coverage without restoring generic functional-style ownership to streams. Co-Authored-By: marvinbuff Co-Authored-By: PReimers --- skills/java-streams/SKILL.md | 104 +++++++++--------- skills/java-streams/references/hard-stops.md | 14 +++ .../references/stream-examples.md | 78 +++++++++++++ 3 files changed, 144 insertions(+), 52 deletions(-) diff --git a/skills/java-streams/SKILL.md b/skills/java-streams/SKILL.md index 819d05e..361d1ce 100644 --- a/skills/java-streams/SKILL.md +++ b/skills/java-streams/SKILL.md @@ -7,11 +7,9 @@ description: Review Java stream performance advice, especially slow stream mappi # Java Streams Skill Preserve requested behavior, public API/artifact shape, encounter order, exceptions, null handling, -side effects, mutability, and Java-version compatibility. For implementation prompts, write the -requested Java source file before explaining. Keep provided helper/record/service types in that file, -and keep them nested when requested; do not create sibling source files, package-private top-level -replacements, hooks, test seams, delegate fields, overloads, caches, retries, or adapters unless -asked. +side effects, mutability, and Java-version compatibility. For implementation prompts, write only the +requested source; no extra public API, Javadoc, null guards, constructors, or utility ceremony unless +asked. Keep provided helper/record/service types in the requested file and nested when requested. ## Reference Bundle @@ -32,8 +30,6 @@ exact file. Do not answer only in chat when a file artifact is requested. | Goal | Preferred API | |---|---| - | Arbitrary/equivalent match | `filter(...).findAny()` | - | First encounter-order match | `filter(...).findFirst()` | | Existence check | `anyMatch` / `noneMatch` / `allMatch` | | Transformed list/set | `map`/`filter` then collect | | Concatenated text | `Collectors.joining` | @@ -41,46 +37,47 @@ exact file. Do not answer only in chat when a file artifact is requested. | Two aggregates over same input (Java 12+) | `Collectors.teeing` | | Grouping/indexing | `groupingBy`, `partitioningBy`, or `toMap` with merge/null handling | - Find-first rule: keep `findFirst()` when code takes element `0`, sorted order, or encounter order - chooses the winner. In these reviews, include the exception before performance claims: `findAny` - fits only if all matching values are equivalent and order does not choose the winner. Keep - `sorted(...).filter(...).findFirst()` when it defines the selected value; suggest `min`/`max` only - when it preserves that winner. + Use `findAny()` only when all matches are equivalent and order does not choose the winner; in + reviews, name that exception with the code's noun, e.g. "all matching primary addresses are + equivalent." Use `findFirst()` when element `0`, sorted order, encounter order, or priority + selects the value. Do not offer `min`/`max` unless the user asks for optimization. 2. Use intent-encoding terminals: `anyMatch`, `count`, `joining`, `min`/`max`, Java 12+ `teeing`, and primitive terminals. Do not mutate external containers, arrays, counters, or builders from `forEach`; let the stream produce the result directly. - - Implementation prompts: for one-to-one transformations, write the direct stream result unless - mutable output or the Java baseline says otherwise; on Java 16+, prefer + - Direct transforms: write the direct stream result; on Java 16+, prefer `names.stream().map(String::toUpperCase).toList()` over a manual `ArrayList` loop. - - External mutation/performance reviews: show a sequential result-producing snippet first. For - million-item CPU maps, mention benchmarking a pure parallel variant and include: - "`parallelStream()` can be slower for small lists or call paths that are usually small." - - Parallel reviews: apply [hard-stops.md](references/hard-stops.md); no custom pool snippets - unless asked. - - Java 24 bounded blocking calls: use `Gatherers.mapConcurrent(limit, item -> carrier(item, - stub(item)))`, then filter/map/sort; no test hooks, overloads, `CompletableFuture` fan-out, or - null sentinels. - - ```java - import java.util.List; - - final class OrderChecks { - boolean hasOverdue(List orders) { - return orders.stream().anyMatch(Order::isOverdue); - } - - record Order(boolean overdue) { - boolean isOverdue() { - return overdue; - } - } - } - ``` + - Performance/parallel reviews: for external mutation, show a sequential result-producing snippet + first. Explicitly separate the correctness fix from the performance decision: collecting + directly removes external mutation; it is not a guaranteed throughput win. Do not cite resize + overhead or list throughput as why `forEach(add)` is wrong. If discussing parallel work, mention + measurement, common-pool/split overhead, and slower small-list or mostly-small call paths. + - Java 24 bounded blocking calls: use `Gatherers.mapConcurrent` as documented in + [java-stream-api.md](references/java-stream-api.md); no test hooks, overloads, + `CompletableFuture` fan-out, or null sentinels. Call the provided production stub directly and + carry `element + boolean result` through a non-null holder: + + ```java + records.stream() + .gather(Gatherers.mapConcurrent( + limit, + record -> new Checked<>(record, RemoteService.accepts(record.id())))) + .filter(Checked::accepted) + .map(Checked::value) + .toList(); + + record Checked(T value, boolean accepted) {} + ``` 3. Flatten nested sources deliberately. Use `flatMap`, `flatMap(Optional::stream)` on Java 9+, and `mapMulti` on Java 16+ when clearer. For subtype primitives, filter/cast first, then call `mapToInt`/`mapToLong`/`mapToDouble` directly. + Extract nested `flatMap` callbacks recursively: use `.flatMap(Type::childEntries)`, keep helpers + stream-based rather than temp-list loops, and repeat inside helpers until callbacks no longer + continue stream chains across lines. Apply the same shape to nested `anyMatch` chains and + downstream `Collectors.flatMapping`: use named stream-returning helpers or method references, not + lambdas whose bodies continue a nested stream chain on later lines. See + [stream-examples.md](references/stream-examples.md) for final-shape helper patterns. ```java // Java 9+: flatten Optional values instead of filter(Optional::isPresent).map(Optional::get) @@ -90,23 +87,26 @@ exact file. Do not answer only in chat when a file artifact is requested. non-primitives, `toMap` with merge behavior and deliberate null-key/value handling, non-null keys for `groupingBy`, `partitioningBy` for boolean splits, and flattened nested indexes when clearer. Preserve duplicate-key merge rules, tie handling, null contracts, and map suppliers. Carry - `element + result`, never null sentinels. + `element + result`, never null sentinels. Extract collector merge helpers when duplicate-key + resolution needs branching, tie-breaking, or more than one comparison; do not hide those rules in + multi-line ternaries or block merge lambdas. 5. Preserve ordering, mutability, short-circuit behavior, and stream/collector semantics. 6. Keep imperative code when it is the clearer boundary for stateful output, checked IO, mutation-heavy logic, or complex early exits. 7. Verify changed branches for empty inputs, one element, duplicates, nulls, ordering, parallel-safety, and baseline compatibility. Run the marker scan from [hard-stops.md](references/hard-stops.md), fix hits, and re-scan. In scan audits, keep - hard-stop severities: required hits stay required unless explicitly acceptable. - -Generic lambda, method-reference, identity-function, no-op callback stage, supplier-laziness, and -callback readability guidance belongs to the companion package `martinfrancois/java-functional-style`. -Install both packages when stream cleanup also depends on non-trivial callback style. - -Review output: - -- Give a direct behavior-preserving decision plus one safe snippet. -- Create `review.md` when requested, even for rejection-only reviews. -- Explain code behavior, not internal workflow. -- Avoid internal workflow labels such as "per the skill", "hard stop", "marker", "scan", - "checklist", "rubric", or "criteria" unless the user asks about the workflow itself. + hard-stop severities: required hits stay required unless explicitly acceptable; also name + acceptable non-primitive reductions such as `BigDecimal.reduce(...)` when present. + +Generic lambda and callback-style guidance belongs to the companion package +`martinfrancois/java-functional-style`. Install both packages when stream cleanup depends on callback +style. When both are available, keep stream callbacks as glue: extract nested callback bodies and +filters with more than one domain condition to named helpers. + +Review output: give a direct behavior-preserving decision plus one safe snippet. For `review.md`, +write short prose first; avoid tables, extra snippets, style sections, redundant-output sections, and +internal workflow labels unless asked. For ordered lookup rejections, include: "Keep the existing +`sorted(...).filter(...).findFirst()` chain." When rejecting `parallelStream().findAny()`, say +parallelism makes ordered selection less predictable or more expensive here and no CPU-bound work or +measurement justifies the overhead. diff --git a/skills/java-streams/references/hard-stops.md b/skills/java-streams/references/hard-stops.md index 48cebce..7033e79 100644 --- a/skills/java-streams/references/hard-stops.md +++ b/skills/java-streams/references/hard-stops.md @@ -32,6 +32,11 @@ Fix these before finalizing: `reduce(BigDecimal.ZERO, BigDecimal::add)` as acceptable. - Nested `map(... stream ... collect(...)).flatMap(...)` where a direct `flatMap` stream chain is clearer. +- Nested `flatMap`, `Collectors.flatMapping`, or `anyMatch` callbacks whose bodies continue stream + chains across later lines. Extract stream-returning or boolean helpers so the outer chain reads as + glue while preserving flattening and short-circuit semantics. +- Stream `filter` predicates with more than one meaningful domain condition when a named predicate + helper would explain the rule, such as "undelivered and past due". - `filter(Optional::isPresent).map(Optional::get)` on Java 9+. Use `flatMap(Optional::stream)`. - `toMap` without a merge function when duplicate keys are possible. - `groupingBy` where null classifier keys can reach the collector. Treat this as a required fix, @@ -94,6 +99,9 @@ markers belong to `java-functional-style`, not this stream hard-stop scan. - When reviewing a proposed replacement of ordered selection with `findAny()` or `parallelStream()`, recommend keeping the existing ordered `sorted(...).filter(...).findFirst()` chain first. Mention `min(...)`/`max(...)` only as an optional refactor when the task asks for optimization advice. + For a `parallelStream().findAny()` proposal, explicitly state that parallelism makes the ordered + selection less predictable or more expensive here, and that no CPU-bound work or measurement + justifies that overhead. - Use `findAny()` only when the domain explicitly says all matching domain values, such as primary/default/preferred values, are equivalent and encounter order does not matter. Use `findFirst()` for first configured, first listed, chronological, priority, fallback, or @@ -131,6 +139,9 @@ input is large. Explain the rewrite in terms of ownership, correctness, and read low-level allocation details as secondary unless measurements make them relevant. Never describe ordinary `ArrayList` growth as `O(N^2)`; resizing is amortized O(N) total. In reviews, show the sequential direct-collection fix before any parallel version. +For external-mutation performance reviews, include the distinction in plain text: direct collection +fixes correctness/readability first; throughput still requires benchmarking a side-effect-free +sequential stream against a side-effect-free parallel stream on the real workload. When the workload is not clearly CPU-bound or the input is expected to be small/tiny, explicitly state that parallelization is not justified here. For large CPU-bound transformations, strongly recommend benchmarking a pure parallel version after @@ -170,6 +181,9 @@ stream-quality issues. If a marker remains because it is legitimate, state why. for allowed stream markers or allowed usages, also call out plain `count()` when it is the requested numeric result rather than a `count() > 0` existence check, and state that plain `count()` is not a hit for the bundled scan regex. +Also call out non-primitive reductions such as `reduce(BigDecimal.ZERO, BigDecimal::add)` as +acceptable when the requested/domain type is non-primitive; do not force primitive aggregation for +`BigDecimal`. In ordinary code reviews, do not expose internal workflow labels such as "hard stop", "marker", "scan", or "skill checklist" in headings, rationale, or recommendations. Use those terms only when diff --git a/skills/java-streams/references/stream-examples.md b/skills/java-streams/references/stream-examples.md index b4f8f91..39349ee 100644 --- a/skills/java-streams/references/stream-examples.md +++ b/skills/java-streams/references/stream-examples.md @@ -178,6 +178,7 @@ Useful review wording for this pattern: ```text The first fix is to remove the external mutation; let the stream produce the list directly. +That is a correctness/readability fix, not a guarantee that the sequential stream is faster. For throughput, benchmark the sequential version against a pure side-effect-free parallel stream on the real workload before relying on parallelStream. `parallelStream()` can be slower for small lists or mostly-small call paths. @@ -257,6 +258,66 @@ private static Session longerSession(Session first, Session second) { } ``` +For nested `flatMap` callbacks, extract the full flattened element shape. Do not leave a callback +that calls a helper and then continues the helper stream on later lines. If the collector needs keyed +values, make the helper return entries directly: + +```java +Map> emailsByTrack = conferences.stream() + .flatMap(ConferenceIndexes::optedInEmailEntries) + .collect(Collectors.groupingBy( + Map.Entry::getKey, + Collectors.mapping(Map.Entry::getValue, Collectors.toList()))); +``` + +For downstream `Collectors.flatMapping`, prefer a named helper when the nested stream filters or maps +values. The collector should read as stream glue: + +```java +Map> emailsByTrack = conferences.stream() + .flatMap(conference -> conference.sessions().stream()) + .filter(session -> session.track() != null) + .collect(Collectors.groupingBy( + Session::track, + Collectors.flatMapping( + SessionIndexes::optedInEmails, + Collectors.collectingAndThen( + Collectors.toCollection(TreeSet::new), + ArrayList::new)))); + +private static Stream optedInEmails(Session session) { + return session.registrations().stream() + .filter(Registration::optedIn) + .map(Registration::email) + .filter(Objects::nonNull); +} +``` + +For nested existence checks, extract the inner stream checks too. Keep `anyMatch` short-circuiting, +but do not stack multi-line callbacks: + +```java +boolean hasWaitlistedRegistration(List conferences) { + return conferences.stream() + .flatMap(conference -> conference.sessions().stream()) + .anyMatch(SessionIndexes::hasWaitlistedRegistration); +} + +private static boolean hasWaitlistedRegistration(Session session) { + return session.registrations().stream().anyMatch(Registration::waitlisted); +} +``` + +For filters with more than one domain condition, extract a named predicate helper instead of leaving +the combined condition inline: + +```java +List notices = shipments.stream() + .filter(shipment -> isOverdue(shipment, today)) + .map(shipment -> toNotice(shipment, today)) + .toList(); +``` + When a task says to use nested records or helper types in the requested file, keep those types in that file instead of splitting them into separate top-level files. @@ -333,6 +394,23 @@ List favoriteProducts = user.getFavoriteProducts().stream() .toList(); ``` +For a blocking yes/no service check, the same shape is a typed non-null carrier. Keep the +concurrency bound inside `Gatherers.mapConcurrent`; do not create a separate executor, future fan-out +pipeline, or service indirection: + +```java +List acceptedJobs = jobs.stream() + .gather(Gatherers.mapConcurrent( + 8, + job -> new ScheduleCheck(job, SchedulerApi.canAccept(job.token())))) + .filter(ScheduleCheck::accepted) + .map(ScheduleCheck::job) + .sorted(Comparator.comparing(Job::startsAt).thenComparing(Job::token)) + .toList(); + +record ScheduleCheck(Job job, boolean accepted) {} +``` + `Map.entry` is appropriate in this example because the baseline is Java 24 and neither side of the entry is null. Do not return `null` from a `mapConcurrent` mapper to mean "skip"; carry the element with an explicit boolean result, then filter and map afterward. If nulls can reach the carrier or From b66103d9f58652e8febeadb90cf12fe680b0bf8a Mon Sep 17 00:00:00 2001 From: martinfrancois Date: Wed, 1 Jul 2026 10:57:42 +0200 Subject: [PATCH 3/3] fix(skill): stabilize stream composition review guidance Keep stream-owned review guidance precise for composed java-streams plus java-functional-style runs. Clarify ordered lookup, parallel findAny, mapConcurrent, Java 8 drift, and review-output wording so stream semantics remain explicit without taking back generic functional-style ownership. Validation: - tessl review run 019f1cb9-822d-70b8-8b9a-2ef660e41217: 100 - composed evals-reference 019f1cbd-b008-751f-84f1-cb02907df04a: 6/6 at 100 - composed evals 019f1cc3-e71c-704b-96c9-c0d606af6623: 4/4 at 100 - composed evals-regression 019f1cdd-a695-727c-8358-0e83fa641555: 19/19 at 100 --- skills/java-streams/SKILL.md | 116 +++++++++--------- skills/java-streams/references/hard-stops.md | 36 ++++-- .../java-streams/references/review-output.md | 57 +++++++++ .../references/stream-examples.md | 37 ++++-- 4 files changed, 165 insertions(+), 81 deletions(-) create mode 100644 skills/java-streams/references/review-output.md diff --git a/skills/java-streams/SKILL.md b/skills/java-streams/SKILL.md index 361d1ce..388382c 100644 --- a/skills/java-streams/SKILL.md +++ b/skills/java-streams/SKILL.md @@ -1,7 +1,7 @@ --- name: java-streams license: MIT -description: Review Java stream performance advice, especially slow stream mappings, external collection mutation with forEach/add, and whether parallelStream is safe; clean up mutation and write or refactor Java Stream and Collector code. Avoid common stream antipatterns such as materializing just to inspect, sorting before min/max, counting for existence, nested stream collections, unsafe null sorting, and careless findFirst/findAny changes. Use whenever writing, reviewing, or refactoring Java code that uses Java streams, collectors, stream pipelines, grouping, joining strings, first/any element lookup, sorting, limiting, distinct values, primitive totals, Optional values in streams, or parallel streams, including review prompts asking whether a lookup should use findFirst or findAny. +description: "Review, write, and refactor Java Stream and Collector pipeline semantics: terminal choice, collector choice, duplicate and null behavior, encounter order, primitive streams, parallelStream safety, and Java-version compatibility. Use whenever Java code uses streams, collectors, stream pipelines, grouping, joining strings, first/any element lookup, sorting, limiting, distinct values, primitive totals, Optional values in streams, or parallel streams. Do not use for generic Java lambda or callback style unless it affects stream or collector behavior." --- # Java Streams Skill @@ -18,6 +18,7 @@ asked. Keep provided helper/record/service types in the requested file and neste | [hard-stops.md](references/hard-stops.md) | Replacement antipatterns and the marker scan to run | | [stream-examples.md](references/stream-examples.md) | Worked before/after examples from the reference set | | [java-stream-api.md](references/java-stream-api.md) | Java-version compatibility for stream and collector APIs | +| [review-output.md](references/review-output.md) | User-facing review wording for stream-specific decisions | ## Core Workflow @@ -37,10 +38,12 @@ exact file. Do not answer only in chat when a file artifact is requested. | Two aggregates over same input (Java 12+) | `Collectors.teeing` | | Grouping/indexing | `groupingBy`, `partitioningBy`, or `toMap` with merge/null handling | - Use `findAny()` only when all matches are equivalent and order does not choose the winner; in - reviews, name that exception with the code's noun, e.g. "all matching primary addresses are - equivalent." Use `findFirst()` when element `0`, sorted order, encounter order, or priority - selects the value. Do not offer `min`/`max` unless the user asks for optimization. + For first-match lookups, preserve order or priority with `findFirst()`; use `findAny()` only for + explicitly order-insensitive matches. In reviews that reject replacing an ordered lookup, include + the direct recommendation "Keep the existing `sorted(...).filter(...).findFirst()` chain." Do not + merely imply that recommendation. If natural ascending priority is used, say the lowest numeric + priority value wins, not "highest-priority." If filtered results are collected only to test + emptiness or read `get(0)`/`getFirst()`, recommend `.filter(...).findFirst().orElse(null)`. 2. Use intent-encoding terminals: `anyMatch`, `count`, `joining`, `min`/`max`, Java 12+ `teeing`, and primitive terminals. Do not mutate external containers, arrays, counters, or @@ -48,65 +51,60 @@ exact file. Do not answer only in chat when a file artifact is requested. - Direct transforms: write the direct stream result; on Java 16+, prefer `names.stream().map(String::toUpperCase).toList()` over a manual `ArrayList` loop. - - Performance/parallel reviews: for external mutation, show a sequential result-producing snippet - first. Explicitly separate the correctness fix from the performance decision: collecting - directly removes external mutation; it is not a guaranteed throughput win. Do not cite resize - overhead or list throughput as why `forEach(add)` is wrong. If discussing parallel work, mention - measurement, common-pool/split overhead, and slower small-list or mostly-small call paths. - - Java 24 bounded blocking calls: use `Gatherers.mapConcurrent` as documented in - [java-stream-api.md](references/java-stream-api.md); no test hooks, overloads, - `CompletableFuture` fan-out, or null sentinels. Call the provided production stub directly and - carry `element + boolean result` through a non-null holder: ```java - records.stream() - .gather(Gatherers.mapConcurrent( - limit, - record -> new Checked<>(record, RemoteService.accepts(record.id())))) - .filter(Checked::accepted) - .map(Checked::value) - .toList(); - - record Checked(T value, boolean accepted) {} + List uppercaseNames(List names) { + return names.stream().map(String::toUpperCase).toList(); + } ``` -3. Flatten nested sources deliberately. Use `flatMap`, `flatMap(Optional::stream)` on Java 9+, - and `mapMulti` on Java 16+ when clearer. For subtype primitives, filter/cast first, then call - `mapToInt`/`mapToLong`/`mapToDouble` directly. - Extract nested `flatMap` callbacks recursively: use `.flatMap(Type::childEntries)`, keep helpers - stream-based rather than temp-list loops, and repeat inside helpers until callbacks no longer - continue stream chains across lines. Apply the same shape to nested `anyMatch` chains and - downstream `Collectors.flatMapping`: use named stream-returning helpers or method references, not - lambdas whose bodies continue a nested stream chain on later lines. See - [stream-examples.md](references/stream-examples.md) for final-shape helper patterns. - - ```java - // Java 9+: flatten Optional values instead of filter(Optional::isPresent).map(Optional::get) - optionals.stream().flatMap(Optional::stream).collect(Collectors.toList()); - ``` -4. Choose accumulation/collectors by result semantics: use `reduce(identity, op)` for immutable - non-primitives, `toMap` with merge behavior and deliberate null-key/value handling, non-null keys - for `groupingBy`, `partitioningBy` for boolean splits, and flattened nested indexes when clearer. - Preserve duplicate-key merge rules, tie handling, null contracts, and map suppliers. Carry - `element + result`, never null sentinels. Extract collector merge helpers when duplicate-key - resolution needs branching, tie-breaking, or more than one comparison; do not hide those rules in - multi-line ternaries or block merge lambdas. -5. Preserve ordering, mutability, short-circuit behavior, and stream/collector semantics. -6. Keep imperative code when it is the clearer boundary for stateful output, checked IO, + - Performance/parallel reviews: show a sequential result-producing snippet first; use the review + output rule for benchmarking, overhead, and workload-size wording. + - Java 24 blocking calls: use bounded `Gatherers.mapConcurrent`; preserve element/result + association in a non-null carrier, call the existing helper/stub, filter by the service result, + and map back to the element. Do not use `parallelStream`, `ForkJoinPool`, futures, or test hooks; + use [stream-examples.md](references/stream-examples.md) for the exact shape. +3. Flatten nested sources deliberately. Use `flatMap`, Java 9+ `flatMap(Optional::stream)`, and + Java 16+ `mapMulti` when clearer. For subtype primitives, filter/cast first, then call + `mapToInt`/`mapToLong`/`mapToDouble`. + + Extract nested `flatMap`, nested `anyMatch`, and downstream `Collectors.flatMapping` callbacks + recursively. Use stream-returning helpers or method references, not lambdas whose bodies continue + stream chains on later lines. See [stream-examples.md](references/stream-examples.md). +4. Choose accumulation/collectors by result semantics: + - immutable non-primitives: `reduce(identity, op)` + - duplicate keys or deliberate null behavior: `toMap` with explicit merge/null handling + - non-null grouping keys: `groupingBy` + - boolean splits: `partitioningBy` + - nested indexes: flattened collectors when clearer + + Preserve duplicate-key merge rules, tie handling, null contracts, and map suppliers. Extract merge + helpers for branching, tie-breaking, or more than one comparison. Do not hide those rules in + multi-line ternaries or block merge lambdas. For simple cheapest/lowest-value merges, prefer + `BinaryOperator.minBy(Comparator.comparing(...))` over an inline ternary. +5. Keep imperative code when it is the clearer boundary for stateful output, checked IO, mutation-heavy logic, or complex early exits. -7. Verify changed branches for empty inputs, one element, duplicates, nulls, ordering, +6. Verify changed branches for empty inputs, one element, duplicates, nulls, ordering, parallel-safety, and baseline compatibility. Run the marker scan from - [hard-stops.md](references/hard-stops.md), fix hits, and re-scan. In scan audits, keep - hard-stop severities: required hits stay required unless explicitly acceptable; also name - acceptable non-primitive reductions such as `BigDecimal.reduce(...)` when present. + [hard-stops.md](references/hard-stops.md), fix hits, and re-scan. In scan audits: + - keep required hits required unless explicitly acceptable; + - give one compliant replacement, not a second marker violation; + - classify acceptable non-primitive reductions such as `BigDecimal.reduce(...)`; + - for Java-version-only audits, report only unavailable APIs and explicitly allowed markers, naming + each post-Java-8 API separately with [java-stream-api.md](references/java-stream-api.md). Generic lambda and callback-style guidance belongs to the companion package `martinfrancois/java-functional-style`. Install both packages when stream cleanup depends on callback -style. When both are available, keep stream callbacks as glue: extract nested callback bodies and -filters with more than one domain condition to named helpers. - -Review output: give a direct behavior-preserving decision plus one safe snippet. For `review.md`, -write short prose first; avoid tables, extra snippets, style sections, redundant-output sections, and -internal workflow labels unless asked. For ordered lookup rejections, include: "Keep the existing -`sorted(...).filter(...).findFirst()` chain." When rejecting `parallelStream().findAny()`, say -parallelism makes ordered selection less predictable or more expensive here and no CPU-bound work or -measurement justifies the overhead. +style. When both are available, keep stream callbacks as glue: extract local temporaries, branching, +date math, record construction, multi-condition filters, nested streams, and downstream +`Collectors.flatMapping` into named helpers or method references; do not leave block lambdas or +multi-line nested stream chains in the pipeline. + +For `review.md`, lead with the technical decision and at most one safe snippet. Use +[review-output.md](references/review-output.md) for case-specific review wording, and never mention +skills, rules, rubrics, criteria, internal paths, or unrelated code issues. For ordered-lookup +rejections, make the safe chain an explicit recommendation, not just a description. Do not mention +`min`/`max` alternatives unless optimization advice is requested. For ascending numeric priority, +say the lowest priority number/value wins; never say "highest-priority" or call the sort no-op. When +the proposal uses `parallelStream().filter(...).findAny()`, separately state that `parallelStream()` +is unjustified without CPU-bound work or measurements and adds ordering/split/merge overhead. Do not +rely on "keep the original code" without naming the ordered chain. diff --git a/skills/java-streams/references/hard-stops.md b/skills/java-streams/references/hard-stops.md index 7033e79..a4e28bc 100644 --- a/skills/java-streams/references/hard-stops.md +++ b/skills/java-streams/references/hard-stops.md @@ -14,7 +14,8 @@ Fix these before finalizing: Use `findFirst` to preserve encounter-order behavior unless the domain explicitly says all matches are equivalent and encounter order does not decide the result. In reviews that compare `findFirst` and `findAny`, name that `findAny` exception explicitly before discussing parallelism - or performance. + or performance. Do not say `findAny` is appropriate merely because the stream is parallel or + unordered. - `filter(...).count() > 0` for existence. Use `anyMatch`. - Plain `count()` is appropriate when the requested result is a numeric count; do not replace it with `anyMatch`. @@ -39,6 +40,8 @@ Fix these before finalizing: helper would explain the rule, such as "undelivered and past due". - `filter(Optional::isPresent).map(Optional::get)` on Java 9+. Use `flatMap(Optional::stream)`. - `toMap` without a merge function when duplicate keys are possible. +- `toMap` over keys that the original loop skipped when null. Preserve the skip-null behavior with + a filter before collecting; do not describe a single null key as a guaranteed exception. - `groupingBy` where null classifier keys can reach the collector. Treat this as a required fix, not a conditional caveat, unless the code already proves non-null before the collector. In scan audits, use unambiguous wording such as "requires fix"; do not downgrade nullable `groupingBy` to @@ -69,9 +72,10 @@ Fix these before finalizing: - `parallelStream()` or `.parallel()` added without checking CPU-bound work, data size, ordering, shared state, blocking calls, and collector safety. - Blocking predicate-like checks that return the original element or `null` as a false sentinel. - Carry the element with an explicit boolean result, then filter and map back to the element. Use - `Map.entry` only on Java 9+ when both values are non-null; otherwise use a null-tolerant holder - such as `AbstractMap.SimpleImmutableEntry` or a project type. + Carry the element with the explicit service result, then filter and map back to the element. Use a + boolean result for boolean-returning services; keep the full decision record when the service + returns one. Use `Map.entry` only on Java 9+ when both values are non-null; otherwise use a + null-tolerant holder such as `AbstractMap.SimpleImmutableEntry` or a project type. - Java-version drift: `toList`, `mapMulti`, `teeing`, `takeWhile`, `dropWhile`, `Optional.stream`, `Collectors.flatMapping`, `Stream.ofNullable`, or gatherers used below their minimum Java version. For a version-drift audit, report these unavailable APIs and explicitly allowed markers only; do @@ -80,6 +84,13 @@ Fix these before finalizing: For below-baseline replacement code, preserve stream semantics with a small loop or helper when no equivalent stream API exists, such as `takeWhile`/`dropWhile` prefixes, downstream flat-mapping, or paired min/max aggregation. + Do not sketch below-baseline replacements that keep a nested stream chain lambda across lines; use + a named helper or plain loop instead. + For Java 8 `Collectors.flatMapping` drift, give the loop/helper replacement as the primary + replacement; do not include a nested `flatMap(... -> ...stream().map(...))` alternative. + For Java 8 `mapMulti` drift, use `map(...)` when the callback emits exactly one value per input, + or `flatMap(...)`/a helper only for real one-to-many emission. Do not invent replacements using + post-Java-8 helpers such as `describeConstable`, `Optional.stream`, or `Map.entry`. Java 8 replacement snippets must not use Java 9+ helpers such as `Map.entry`. When one stream chain contains multiple unavailable APIs, list each unavailable API separately. Example: `flatMap(Optional::stream).toList()` on a Java 8 baseline has two version-drift hits: @@ -103,11 +114,13 @@ markers belong to `java-functional-style`, not this stream hard-stop scan. selection less predictable or more expensive here, and that no CPU-bound work or measurement justifies that overhead. - Use `findAny()` only when the domain explicitly says all matching domain values, such as - primary/default/preferred values, are equivalent and encounter order does not matter. Use + replicated endpoints with identical results, are equivalent and encounter order does not define + which one wins. Use `findFirst()` for first configured, first listed, chronological, priority, fallback, or - user-visible results. In reviews, name the code's noun in the exception, e.g. "all matching - primary addresses are equivalent." Do not present `findAny()` as a performance shortcut unless - that semantic precondition is already satisfied. + user-visible results. In reviews, state the positive exception with the code's noun, e.g. + "`findAny()` would be appropriate only if the domain declares all matching primary addresses + equivalent and encounter order does not define which one wins." Do not present `findAny()` as a + performance shortcut unless that semantic precondition is already satisfied. - `distinct().sorted()` is the required rewrite for `sorted().distinct()` when duplicates can be removed before sorting and no sort-before-de-duplication semantics are required. - `limit(n)` must come after sorting when computing top-N by an ordering. It may come before an @@ -125,9 +138,10 @@ Use parallel streams only after checking: calls with a requested concurrency limit, use `Gatherers.mapConcurrent` with that bound when the baseline supports it and virtual-thread concurrency is the intended design. Preserve element/result association explicitly with a baseline-compatible holder rather than null sentinels - or side maps. Do not replace bounded stream concurrency with unbounded `CompletableFuture` - fan-out. For remote calls, call out the concurrency limit, timeout handling for slow calls, and - error propagation/retry policy. + or side maps. Do not add wrapper hooks, overloads, delegates, futures, sentinels, caches, or + retries unless requested. Do not replace bounded stream concurrency with unbounded + `CompletableFuture` fan-out. For remote calls, call out the concurrency limit, timeout handling + for slow calls, and error propagation/retry policy. 5. The stream terminal operation or collector is safe under parallel execution. For acceptable CPU-heavy parallel streams, state that the benefit should be measured or benchmarked diff --git a/skills/java-streams/references/review-output.md b/skills/java-streams/references/review-output.md new file mode 100644 index 0000000..09e47da --- /dev/null +++ b/skills/java-streams/references/review-output.md @@ -0,0 +1,57 @@ +# Stream Review Output + +Use these rules only for user-facing stream reviews. Keep the final artifact focused on the Java +behavior, not the workflow that found it. + +## Ordered Lookup + +If a proposal replaces `sorted(...).filter(...).findFirst()`, reject it and explicitly recommend: +"Keep the existing `sorted(...).filter(...).findFirst()` chain." This must be a direct recommendation, +not only an implied conclusion. Also state the concrete ordering contract. For ascending numeric +priority, say the lowest priority number/value wins; do not say "highest-priority" unless the code +uses an explicit reversed comparator. Do not call the sort no-op and do not suggest `findAny`, `min`, +`max`, or collectors unless the task asks for optimization advice. + +If that proposal also uses `parallelStream().filter(...).findAny()`, make the parallelism point a +separate sentence: `parallelStream()` is unjustified without CPU-bound work or measurements and adds +split/merge/order overhead for this lookup. + +If the ordered source is a loop over configured order, recommend keeping the loop or using +`contacts.stream().filter(...).findFirst()`; do not mention a sorted chain. + +## `findFirst` And `findAny` + +`findAny()` is valid only when the domain declares all matching values equivalent and order +irrelevant. Do not infer this from names like `primary`, `default`, or `preferred`, or from +parallelism. State this semantic exception before performance or nondeterminism. + +For primary-address examples, use: +"`findAny()` is appropriate only when all matching primary addresses are considered equivalent and +encounter order does not matter." + +When asked whether to keep collecting, use `findFirst`, or use `findAny`, include that collecting +all matches is unnecessary, `findFirst()` preserves existing `get(0)` encounter order, and +`findAny()` needs domain-guaranteed equivalence. + +## Java Version + +When rejecting `Stream.toList()` on Java 11 or lower, state the baseline incompatibility as its own +problem: "`Stream.toList()` was added in Java 16, so it is unavailable on Java 11." Then separately +state any mutability problem if later code calls `add`, `sort`, or otherwise mutates the result. + +## Performance And Parallelism + +Include benchmarking, small/mostly-small slowdown, and relevant common-pool, blocking, or ordering +risk. For high-volume uppercase/list-transform reviews, say `parallelStream()` can be slower for +small lists or call paths where most invocations are small, so benchmark the real workload before +using it for throughput. + +## Nullable Collectors + +If the original loop skipped null keys, say the proposed collector changes behavior by retaining or +colliding on null keys unless filtered first. Do not claim a single null key always throws; the +required fix is preserving the skip-null contract before collecting. + +## Remote Calls + +For `mapConcurrent` snippets, preserve approved-only sorted output and call the existing helper. diff --git a/skills/java-streams/references/stream-examples.md b/skills/java-streams/references/stream-examples.md index 39349ee..e5aab21 100644 --- a/skills/java-streams/references/stream-examples.md +++ b/skills/java-streams/references/stream-examples.md @@ -7,7 +7,8 @@ using APIs from [java-stream-api.md](java-stream-api.md). ## Direct Terminals - Arbitrary match only when the domain explicitly says all matching domain values, such as - primary/default/preferred values, are equivalent and encounter order does not matter: + replicated endpoints with identical results, are equivalent and encounter order does not define + which one wins: ```java Address selectedAddress = account.getAddresses().stream() @@ -18,11 +19,13 @@ using APIs from [java-stream-api.md](java-stream-api.md). When a review asks whether to use `findFirst` or `findAny`, answer the semantic question first: `findAny` is valid only if the domain explicitly says all matching domain values, such as -`primary`, `default`, or `preferred` values, are equivalent and encounter order does not select the -winner. For those names, write the exception with the code's noun, such as "all matching primary -addresses are equivalent." Do not infer equivalence from the name; if existing code takes element -`0`, preserve encounter order with `findFirst()`. Do not lead with performance or parallel-stream -wording for `findAny`. +replicated endpoints with identical results, are equivalent and encounter order does not define +which one wins. Do not infer equivalence or uniqueness from names like `primary`, `default`, or +`preferred`; for those names, write the positive exception with the code's noun, such as +"`findAny()` would be appropriate only if the domain declares all matching primary addresses +equivalent and encounter order does not define which one wins." If existing code takes element `0`, +preserve encounter order with `findFirst()`. Do not lead with performance or parallel-stream wording +for `findAny`. - First fallback when order matters: @@ -122,6 +125,8 @@ Pitfall: `Stream.toList()` returns an unmodifiable list. Keep `Collectors.toList `Collectors.toCollection(ArrayList::new)` when the result is sorted, appended to, or otherwise mutated later. If a task says `Stream.toList()` is not valid for that mutable result, do not wrap it in `new ArrayList<>(...)`; use the mutable collector directly. +On Java 11 or lower, reject `Stream.toList()` as a separate Java-version problem before discussing +mutability: `Stream.toList()` was added in Java 16, so it is unavailable on Java 11. ## External Mutation @@ -232,6 +237,10 @@ If a `groupingBy` classifier can return null, filter nulls or map them to an exp before collecting. Do not treat possible null classifier keys as acceptable without proof. If a loop skipped null keys before building a map, filter those null keys before `toMap`; the issue is the behavior change, not a guaranteed null-key `NullPointerException`. +When the loop keeps the cheapest or lowest-valued element per key with a comparator, use +`BinaryOperator.minBy(Comparator.comparing(...))`; it keeps the existing value on comparator ties in +a `toMap` merge because the existing value is the first merge argument. Do not replace this simple +merge with an inline ternary. When a review includes a collector replacement snippet, include the supporting imports for helper APIs such as `Function.identity()` or `BinaryOperator.minBy(...)`, and avoid tangential import commentary unless missing imports are part of the stream issue. @@ -241,7 +250,7 @@ in the collector: ```java Map longestSessionByRoom = conferences.stream() - .flatMap(conference -> conference.sessions().stream()) + .flatMap(SessionIndexes::sessions) .filter(session -> session.room() != null) .collect(Collectors.toMap( Session::room, @@ -256,6 +265,10 @@ encounter order must decide equal values. private static Session longerSession(Session first, Session second) { return first.minutes() >= second.minutes() ? first : second; } + +private static Stream sessions(Conference conference) { + return conference.sessions().stream(); +} ``` For nested `flatMap` callbacks, extract the full flattened element shape. Do not leave a callback @@ -275,7 +288,7 @@ values. The collector should read as stream glue: ```java Map> emailsByTrack = conferences.stream() - .flatMap(conference -> conference.sessions().stream()) + .flatMap(SessionIndexes::sessions) .filter(session -> session.track() != null) .collect(Collectors.groupingBy( Session::track, @@ -299,7 +312,7 @@ but do not stack multi-line callbacks: ```java boolean hasWaitlistedRegistration(List conferences) { return conferences.stream() - .flatMap(conference -> conference.sessions().stream()) + .flatMap(SessionIndexes::sessions) .anyMatch(SessionIndexes::hasWaitlistedRegistration); } @@ -413,8 +426,10 @@ record ScheduleCheck(Job job, boolean accepted) {} `Map.entry` is appropriate in this example because the baseline is Java 24 and neither side of the entry is null. Do not return `null` from a `mapConcurrent` mapper to mean "skip"; carry the element -with an explicit boolean result, then filter and map afterward. If nulls can reach the carrier or -the baseline is Java 8, use a null-tolerant project type or `AbstractMap.SimpleImmutableEntry`. +with the service result, then filter and map afterward. If the service returns a decision record, +keep that full record in the carrier instead of reducing it to a boolean too early. If nulls can +reach the carrier or the baseline is Java 8, use a null-tolerant project type or +`AbstractMap.SimpleImmutableEntry`. When the task provides a production service stub such as `AvailabilityApi.lookup(...)` or `CalendarService.canSchedule(...)`, call that stub directly. Do not add delegate fields, alternate overloads, package-private switches, or other test hooks unless the task explicitly asks for them.