Add eval for bounded duplicate-detection streams #53

Description

@martinfrancois

Problem

The Java Streams skill should learn a safe stream shape for "zero, exactly one, or ambiguous duplicate" lookup methods. A simple findFirst() would be wrong because duplicate matches must fail closed. A full filter(...).toList() works but scans and stores every duplicate when only two matches are needed to decide ambiguity.
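
The difference between the two shapes can be illustrated with a minimal sketch. The `Named` record and the `IllegalStateException` are hypothetical stand-ins for the real domain types, and `matches.get(0)` is used so the sketch also compiles on Java 17 (`getFirst()` requires Java 21).

```java
import java.util.List;
import java.util.Objects;

final class LookupShapes {
    // Hypothetical stand-in for a named domain object such as a checklist.
    record Named(String name, int id) {}

    // Unsafe shape: returns the first match and never notices a second one.
    static Named firstMatch(List<Named> items, String name) {
        return items.stream()
                .filter(item -> Objects.equals(item.name(), name))
                .findFirst()
                .orElse(null);
    }

    // Bounded shape: keeps at most two matches, then branches on the count.
    static Named singleMatch(List<Named> items, String name) {
        List<Named> matches = items.stream()
                .filter(item -> Objects.equals(item.name(), name))
                .limit(2)
                .toList();
        return switch (matches.size()) {
            case 0 -> null;
            case 1 -> matches.get(0);
            // Hypothetical exception type; the real code throws its domain exception.
            default -> throw new IllegalStateException("ambiguous match for name: " + name);
        };
    }

    public static void main(String[] args) {
        List<Named> items = List.of(new Named("todo", 1), new Named("done", 2), new Named("todo", 3));
        System.out.println(firstMatch(items, "todo"));  // picks id 1 and ignores id 3
        System.out.println(singleMatch(items, "done")); // exactly one match
        try {
            singleMatch(items, "todo");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the duplicate "todo" entries, `firstMatch` returns a result as if nothing were wrong, while `singleMatch` fails closed.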

Code before the prompt was executed

The code used manual sentinel loops to find a matching checklist or checklist item. The loops were correct, but they repeated the same state-machine shape twice.

private static Card.Checklist singleChecklistByName(List<Card.Checklist> checklists, String checklistName) {
    Card.Checklist match = null;
    for (Card.Checklist checklist : checklists) {
        if (!Objects.equals(checklist.name(), checklistName)) {
            continue;
        }
        if (match != null) {
            throw new TrelloException(
                    "trello_checklist_ambiguous", "Multiple Trello checklists match the requested checklist_name.");
        }
        match = checklist;
    }
    return match;
}

private static Card.ChecklistItem singleCheckItemByName(Card.Checklist checklist, String itemName) {
    Card.ChecklistItem match = null;
    for (Card.ChecklistItem item : checklist.items()) {
        if (!Objects.equals(item.text(), itemName)) {
            continue;
        }
        if (match != null) {
            throw new TrelloException(
                    "trello_check_item_ambiguous",
                    "Multiple Trello checklist items match the requested item_name.");
        }
        match = item;
    }
    return match;
}

Prompt that caused the implementation

The original implementation was part of a bugfix prompt for policy-enabled Trello card tools. The checklist tool had to reject ambiguous checklist and checklist-item matches before making a Trello mutation.

Later prompt that exposed the issue

A PR review comment asked:

Is there a possibility to simplify the logic by refactoring to a stream?

Prompt-produced code before maintainer correction

The reviewed code was the sentinel-loop code above. It was functionally correct, but it duplicated the "remember one match, fail on second match" logic.

Why the prompt-produced code is weak

The manual loops are not incorrect. The weakness is that they obscure the actual data operation: filter matching names, inspect at most two matches, return none/one, or fail on ambiguity. Because the code only needs to distinguish zero, one, and more than one match, nothing past the second match ever needs to be inspected. The loops already stop there by throwing, and any replacement should keep that bound.

Behavior-equivalence analysis

The replacement preserves null-safe Objects.equals(...) matching, encounter order for the single selected match, no-match returning null, and duplicate-match exception behavior. The .limit(2) is important: it keeps enough data to detect ambiguity without collecting all matches. A plain findFirst() would be behavior-changing and must be rejected because it would silently accept duplicates.
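
The bounding effect of .limit(2) can be observed with a small counting sketch. The class and method names are hypothetical, and the counter in the filter is there only for illustration; production code should not rely on side effects inside stream lambdas.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

final class BoundedScanDemo {
    // Returns how many elements the filter examined before the pipeline finished.
    static int examined(List<String> names, String wanted, boolean bounded) {
        AtomicInteger counter = new AtomicInteger();
        Stream<String> matching = names.stream().filter(name -> {
            counter.incrementAndGet(); // illustration only: counts predicate calls
            return Objects.equals(name, wanted);
        });
        // The collected list is discarded; only the amount of work is of interest.
        List<String> matches = bounded ? matching.limit(2).toList() : matching.toList();
        return counter.get();
    }

    public static void main(String[] args) {
        List<String> names = List.of("a", "x", "a", "a", "a", "a");
        System.out.println("unbounded examined: " + examined(names, "a", false));
        System.out.println("bounded examined:   " + examined(names, "a", true));
    }
}
```

The unbounded pipeline has to test all six elements. The bounded one is a short-circuiting pipeline, so on a sequential stream it is expected to stop once the second match has been found, which is the same point at which the original loop would have thrown.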

Maintainer-preferred code

private static Card.Checklist singleChecklistByName(List<Card.Checklist> checklists, String checklistName) {
    List<Card.Checklist> matches = checklists.stream()
            .filter(checklist -> Objects.equals(checklist.name(), checklistName))
            .limit(2)
            .toList();
    return switch (matches.size()) {
        case 0 -> null;
        case 1 -> matches.getFirst();
        default -> throw new TrelloException(
                "trello_checklist_ambiguous", "Multiple Trello checklists match the requested checklist_name.");
    };
}

private static Card.ChecklistItem singleCheckItemByName(Card.Checklist checklist, String itemName) {
    List<Card.ChecklistItem> matches = checklist.items().stream()
            .filter(item -> Objects.equals(item.text(), itemName))
            .limit(2)
            .toList();
    return switch (matches.size()) {
        case 0 -> null;
        case 1 -> matches.getFirst();
        default -> throw new TrelloException(
                "trello_check_item_ambiguous", "Multiple Trello checklist items match the requested item_name.");
    };
}

Why the replacement is better

The stream expresses the lookup directly and bounds the work at the second match. The switch makes the three domain states explicit: missing, exactly one, and ambiguous. The resulting code preserves behavior while eliminating the mutable sentinel variable from both methods.

Desired eval behavior

  • Reward bounded duplicate detection with filter(...).limit(2).toList() followed by an explicit zero/one/ambiguous branch.
  • Reward explaining why findFirst() is wrong when duplicates must fail closed.
  • Reward preserving encounter order for the exactly-one case.
  • Reward preserving the existing exception code and message.
  • Reward identifying that collecting all matches is unnecessary when two matches are sufficient.
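
One way an eval could check these points is to run the loop and the stream variant side by side and compare outcomes. The following is only a sketch: `Checklist` and `LookupException` are hypothetical stand-ins for `Card.Checklist` and `TrelloException`, and `matches.get(0)` replaces `getFirst()` so that it compiles on Java 17.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;

final class EquivalenceCheck {
    // Hypothetical stand-in for Card.Checklist; only name() matters for matching.
    record Checklist(String name, int id) {}

    // Hypothetical stand-in for TrelloException with a stable error code.
    static final class LookupException extends RuntimeException {
        final String code;

        LookupException(String code, String message) {
            super(message);
            this.code = code;
        }
    }

    static final String CODE = "trello_checklist_ambiguous";
    static final String MESSAGE = "Multiple Trello checklists match the requested checklist_name.";

    // Original shape: sentinel variable, fail on the second match.
    static Checklist byLoop(List<Checklist> checklists, String name) {
        Checklist match = null;
        for (Checklist checklist : checklists) {
            if (!Objects.equals(checklist.name(), name)) {
                continue;
            }
            if (match != null) {
                throw new LookupException(CODE, MESSAGE);
            }
            match = checklist;
        }
        return match;
    }

    // Replacement shape: bounded stream plus an explicit three-way branch.
    static Checklist byStream(List<Checklist> checklists, String name) {
        List<Checklist> matches = checklists.stream()
                .filter(checklist -> Objects.equals(checklist.name(), name))
                .limit(2)
                .toList();
        return switch (matches.size()) {
            case 0 -> null;
            case 1 -> matches.get(0);
            default -> throw new LookupException(CODE, MESSAGE);
        };
    }

    // Reduces a lookup to a comparable outcome: "none", "one:<id>", or "error:<code>".
    static String outcome(Supplier<Checklist> lookup) {
        try {
            Checklist result = lookup.get();
            return result == null ? "none" : "one:" + result.id();
        } catch (LookupException e) {
            return "error:" + e.code;
        }
    }

    public static void main(String[] args) {
        List<Checklist> checklists = List.of(
                new Checklist("Release", 1), new Checklist(null, 2),
                new Checklist("QA", 3), new Checklist("Release", 4));
        for (String name : new String[] {"Missing", "QA", "Release", null}) {
            String loop = outcome(() -> byLoop(checklists, name));
            String stream = outcome(() -> byStream(checklists, name));
            System.out.println(name + ": loop=" + loop + ", stream=" + stream);
        }
    }
}
```

The four probe names cover the missing, exactly-one, and ambiguous states plus a null name, which exercises the null-safe Objects.equals(...) matching.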

Anti-patterns the eval should reject

  • Replacing the loop with findFirst() and silently accepting duplicates.
  • Collecting all matches when the code only needs to distinguish zero, one, and at least two.
  • Throwing a generic exception instead of the existing stable domain exception.
  • Hiding the duplicate branch in a generic helper without preserving error code/message clarity.
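
If a shared helper is considered at all, one possible shape that avoids the last anti-pattern keeps the error code and message at the call site by passing in the exception to throw. This is a sketch under assumptions, not part of the reviewed change: `singleOrNull`, `DomainException`, and the "example_ambiguous" code are hypothetical names.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;
import java.util.stream.Stream;

final class SingleMatch {
    // Hypothetical stand-in for a domain exception with a stable error code.
    static final class DomainException extends RuntimeException {
        final String code;

        DomainException(String code, String message) {
            super(message);
            this.code = code;
        }
    }

    // Returns null for no match, the element for exactly one match, and throws
    // the caller-supplied exception as soon as a second match is known to exist.
    static <T> T singleOrNull(Stream<T> matches, Supplier<? extends RuntimeException> onAmbiguous) {
        List<T> firstTwo = matches.limit(2).toList();
        return switch (firstTwo.size()) {
            case 0 -> null;
            case 1 -> firstTwo.get(0);
            default -> throw onAmbiguous.get();
        };
    }

    // Call site: the filter and the specific error code/message stay visible here.
    static String singleName(List<String> names, String wanted) {
        return singleOrNull(
                names.stream().filter(name -> Objects.equals(name, wanted)),
                () -> new DomainException("example_ambiguous", "Multiple entries match the requested name."));
    }
}
```

Whether this is clearer than two explicit methods is a judgment call; the point is only that the duplicate branch still surfaces the caller's own error code and message rather than a generic one.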

Suggested eval name

bounded-duplicate-detection-stream
