| 1 | +## Issue #121: Add JsonPath module with AST parser and evaluator |
| 2 | + |
| 3 | +### Goal |
| 4 | + |
| 5 | +Add a new Maven module that implements **JsonPath** as described by Stefan Goessner: |
| 6 | +`https://goessner.net/articles/JsonPath/index.html`. |
| 7 | + |
| 8 | +This module must: |
| 9 | + |
| 10 | +- Parse a JsonPath string into a **custom AST** |
| 11 | +- Evaluate that AST against a JSON document already parsed by this repo’s |
| 12 | + `jdk.sandbox.java.util.json` API |
| 13 | +- Have **no runtime dependencies** outside `java.base` (and the existing `json-java21` module) |
| 14 | +- Target **Java 21** and follow a functional, data-oriented style (records + sealed interfaces) |
| 15 | + |
| 16 | +### Non-goals |
| 17 | + |
| 18 | +- Parsing JSON documents from strings (the core `Json.parse(...)` already does that) |
| 19 | +- Adding non-trivial external dependencies (no regex engines beyond the JDK's, no parser generators) |
| 20 | +- Supporting every JsonPath dialect ever published; the baseline is the article examples |
| 21 | + |
| 22 | +### Public API shape (module `json-java21-jsonpath`) |
| 23 | + |
| 24 | +Package: `json.java21.jsonpath` |
| 25 | + |
| 26 | +- `JsonPathExpression JsonPath.parse(String path)` |
| 27 | + - Parses a JsonPath string into an AST-backed compiled expression. |
| 28 | +- `List<JsonValue> JsonPathExpression.select(JsonValue document)` |
| 29 | + - Evaluates the expression against a parsed JSON document and returns matched nodes in traversal order. |
| 30 | + |
| 31 | +Notes: |
| 32 | +- The return type is a `List<JsonValue>` to avoid introducing a JSON encoding of “result sets”. |
| 33 | + Callers can wrap it into `JsonArray.of(...)` if desired. |
| 34 | + |
| 35 | +### AST plan |
| 36 | + |
| 37 | +Use a sealed interface with records (no inheritance trees with stateful objects): |
| 38 | + |
| 39 | +- `sealed interface PathNode permits Root, StepChain` |
| 40 | +- `record Root() implements PathNode` |
| 41 | +- `record StepChain(PathNode base, Step step) implements PathNode` |
| 42 | + |
| 43 | +Steps form a separate sealed hierarchy: |
| 44 | + |
| 45 | +- `sealed interface Step permits Child, RecursiveDescent, Wildcard, ArrayIndex, ArraySlice, Union, Filter` |
| 46 | +- `record Child(Name name) implements Step`, where `Name` is either an identifier or a quoted key |
| 47 | +- `record RecursiveDescent(Step selector) implements Step`, where `selector` is a `Child` or a `Wildcard` |
| 48 | +- `record Wildcard() implements Step` |
| 49 | +- `record ArrayIndex(int index) implements Step`; supports negative indices, per the article examples |
| 50 | +- `record ArraySlice(Integer start, Integer end, Integer step) implements Step` to cover `[:2]`, `[-1:]`, etc. |
| 51 | +- `record Union(List<Step> selectors) implements Step` for `[0,1]`, `['a','b']` |
| 52 | +- `record Filter(PredicateExpr expr) implements Step` for `[?(...)]` |
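The two sealed hierarchies above could be sketched roughly as follows. This is only an illustration under stated assumptions: `AstSketch` is a hypothetical holder class, `Name` is simplified to a plain `String`, `PredicateExpr` is reduced to its source text, and the `render` helper is not part of the plan; it is there only to show exhaustive pattern matching over the records.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical holder class so the sketch compiles as a single file.
final class AstSketch {
    sealed interface PathNode permits Root, StepChain {}
    record Root() implements PathNode {}
    record StepChain(PathNode base, Step step) implements PathNode {}

    sealed interface Step
            permits Child, RecursiveDescent, Wildcard, ArrayIndex, ArraySlice, Union, Filter {}
    // Assumption: the plan's `Name` type is simplified to a String here.
    record Child(String name) implements Step {}
    record RecursiveDescent(Step selector) implements Step {}
    record Wildcard() implements Step {}
    record ArrayIndex(int index) implements Step {}
    // Assumption: a null component stands for an omitted bound, as in `[:2]`.
    record ArraySlice(Integer start, Integer end, Integer step) implements Step {}
    record Union(List<Step> selectors) implements Step {
        Union { selectors = List.copyOf(selectors); } // defensive copy keeps the record immutable
    }
    // Assumption: the predicate AST is reduced to its source text for this sketch.
    record Filter(String expr) implements Step {}

    // Renders an AST back to a path string; the compiler checks that every case is covered.
    static String render(PathNode node) {
        return switch (node) {
            case Root r -> "$";
            case StepChain(PathNode base, Step step) -> render(base) + renderStep(step);
        };
    }

    static String renderStep(Step step) {
        return switch (step) {
            case Child c -> "." + c.name();
            case RecursiveDescent rd -> "." + renderStep(rd.selector());
            case Wildcard w -> ".*";
            case ArrayIndex ix -> "[" + ix.index() + "]";
            case ArraySlice s -> "[" + bound(s.start()) + ":" + bound(s.end())
                    + (s.step() == null ? "" : ":" + s.step()) + "]";
            case Union u -> u.selectors().stream()
                    .map(AstSketch::unionItem)
                    .collect(Collectors.joining(",", "[", "]"));
            case Filter f -> "[?(" + f.expr() + ")]";
        };
    }

    private static String bound(Integer value) {
        return value == null ? "" : value.toString();
    }

    private static String unionItem(Step step) {
        return switch (step) {
            case ArrayIndex ix -> Integer.toString(ix.index());
            case Child c -> "'" + c.name() + "'";
            default -> throw new IllegalArgumentException("unsupported union member: " + step);
        };
    }
}
```

Because both interfaces are sealed, adding a new step record without handling it in `renderStep` becomes a compile error, which is the main benefit of this shape over an open class hierarchy.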
| 53 | + |
| 54 | +Filter expressions: |
| 55 | + |
| 56 | +- Keep a minimal expression AST that supports the article examples: |
| 57 | + - `@.field` access |
| 58 | + - `@.length` pseudo-property for array length |
| 59 | + - numeric literals |
| 60 | + - string literals |
| 61 | + - comparison operators: `<`, `<=`, `>`, `>=`, `==`, `!=` |
| 62 | + - arithmetic: `+`, `-` (only what’s needed for `(@.length-1)`) |
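A minimal sketch of such a filter expression AST and its evaluation is shown below. It makes several assumptions that are not in the plan: the names `FilterSketch`, `Expr`, `Field`, `Length`, `Num`, `Str`, `Arith`, and `Compare` are hypothetical, the current node is modelled with plain `Map` and `List` values instead of the `jdk.sandbox.java.util.json` types, and a bare `@.field` is treated as an existence test.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical holder class; Map/List stand in for JSON objects/arrays.
final class FilterSketch {
    sealed interface Expr permits Field, Length, Num, Str, Arith, Compare {}
    record Field(String name) implements Expr {}                      // @.field
    record Length() implements Expr {}                                // @.length
    record Num(double value) implements Expr {}                       // numeric literal
    record Str(String value) implements Expr {}                       // string literal
    record Arith(char op, Expr left, Expr right) implements Expr {}   // '+' or '-'
    record Compare(String op, Expr left, Expr right) implements Expr {}

    // Evaluates to a Double, String, Boolean, or null (missing value).
    static Object eval(Expr expr, Object current) {
        return switch (expr) {
            case Field f -> current instanceof Map<?, ?> m ? widen(m.get(f.name())) : null;
            case Length l -> current instanceof List<?> xs ? Double.valueOf(xs.size()) : null;
            case Num n -> n.value();
            case Str s -> s.value();
            case Arith a -> {
                Object l = eval(a.left(), current);
                Object r = eval(a.right(), current);
                if (l instanceof Double x && r instanceof Double y) {
                    yield a.op() == '+' ? x + y : x - y;
                }
                yield null; // arithmetic on non-numbers has no value in this sketch
            }
            case Compare c -> compare(c.op(), eval(c.left(), current), eval(c.right(), current));
        };
    }

    // Assumption: a Boolean result is used as-is; any other non-null value counts as "exists".
    static boolean test(Expr expr, Object current) {
        Object v = eval(expr, current);
        return v instanceof Boolean b ? b : v != null;
    }

    private static Object widen(Object v) {
        return v instanceof Number n ? Double.valueOf(n.doubleValue()) : v;
    }

    private static boolean compare(String op, Object l, Object r) {
        if (op.equals("==")) return Objects.equals(l, r);
        if (op.equals("!=")) return !Objects.equals(l, r);
        int c;
        if (l instanceof Double x && r instanceof Double y) c = Double.compare(x, y);
        else if (l instanceof String x && r instanceof String y) c = x.compareTo(y);
        else return false; // missing or mismatched operands never satisfy an ordering
        return switch (op) {
            case "<" -> c < 0;
            case "<=" -> c <= 0;
            case ">" -> c > 0;
            case ">=" -> c >= 0;
            default -> throw new IllegalArgumentException("unknown operator: " + op);
        };
    }
}
```

With this shape, `[?(@.price<10)]` would correspond to `new Compare("<", new Field("price"), new Num(10))`, and `(@.length-1)` to `new Arith('-', new Length(), new Num(1))`.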
| 63 | + |
| 64 | +### Parser plan |
| 65 | + |
| 66 | +Hand-rolled scanner + recursive descent parser: |
| 67 | + |
| 68 | +- Lex JsonPath into tokens (`$`, `.`, `..`, `[`, `]`, `*`, `,`, `:`, `?(`, `)`, identifiers, quoted strings, numbers, operators). |
| 69 | +- Parse according to the article grammar: |
| 70 | + - Root `$` must appear first |
| 71 | + - Dot-notation steps: `.name`, `.*`, `..name`, `..*` |
| 72 | + - Bracket steps: |
| 73 | + - `['name']` |
| 74 | + - `[0]`, `[-1]` |
| 75 | + - `[0,1]`, `['a','b']` |
| 76 | + - `[:2]`, `[2:]`, `[-1:]` |
| 77 | + - `[?(...)]` |
| 78 | + - `[(...)]` for script expressions used as array index in examples (limited support) |
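The scanning half of this plan might look like the sketch below. The `ScannerSketch`, `Kind`, and `Token` names are hypothetical, and the sketch makes a few simplifying assumptions: `@` and a bare `(` get their own token kinds, `-` is always emitted as an operator (so a parser would handle the sign in `[-1]`), and quoted strings have no escape handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-rolled scanner; token kinds follow the list above.
final class ScannerSketch {
    enum Kind {
        ROOT, AT, DOT, DOT_DOT, LBRACKET, RBRACKET, STAR, COMMA, COLON,
        FILTER_OPEN, LPAREN, RPAREN, IDENT, STRING, NUMBER, OPERATOR
    }

    record Token(Kind kind, String text) {}

    // Single-character tokens and their kinds, kept in matching order.
    private static final String SINGLE = "$@[]*,:()";
    private static final Kind[] SINGLE_KINDS = {
        Kind.ROOT, Kind.AT, Kind.LBRACKET, Kind.RBRACKET, Kind.STAR,
        Kind.COMMA, Kind.COLON, Kind.LPAREN, Kind.RPAREN
    };

    static List<Token> scan(String path) {
        List<Token> out = new ArrayList<>();
        int n = path.length();
        int i = 0;
        while (i < n) {
            char c = path.charAt(i);
            int start = i;
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '.') {
                boolean twoDots = path.startsWith("..", i);
                i += twoDots ? 2 : 1;
                out.add(new Token(twoDots ? Kind.DOT_DOT : Kind.DOT, path.substring(start, i)));
            } else if (path.startsWith("?(", i)) {
                i += 2;
                out.add(new Token(Kind.FILTER_OPEN, "?("));
            } else if (SINGLE.indexOf(c) >= 0) {
                i++;
                out.add(new Token(SINGLE_KINDS[SINGLE.indexOf(c)], String.valueOf(c)));
            } else if (c == '\'' || c == '"') {
                int close = path.indexOf(c, i + 1); // no escape handling in this sketch
                if (close < 0) throw new IllegalArgumentException("unterminated string at " + i);
                out.add(new Token(Kind.STRING, path.substring(i + 1, close)));
                i = close + 1;
            } else if (Character.isDigit(c)) {
                while (i < n && Character.isDigit(path.charAt(i))) i++;
                // Accept a fraction only when a digit follows the dot, so `1..x` is not swallowed.
                if (i + 1 < n && path.charAt(i) == '.' && Character.isDigit(path.charAt(i + 1))) {
                    i++;
                    while (i < n && Character.isDigit(path.charAt(i))) i++;
                }
                out.add(new Token(Kind.NUMBER, path.substring(start, i)));
            } else if (Character.isLetter(c) || c == '_') {
                while (i < n && (Character.isLetterOrDigit(path.charAt(i)) || path.charAt(i) == '_')) i++;
                out.add(new Token(Kind.IDENT, path.substring(start, i)));
            } else if ("<>=!".indexOf(c) >= 0) {
                if (i + 1 < n && path.charAt(i + 1) == '=') i += 2;   // <=, >=, ==, !=
                else if (c == '<' || c == '>') i++;                    // <, >
                else throw new IllegalArgumentException("unexpected '" + c + "' at " + i);
                out.add(new Token(Kind.OPERATOR, path.substring(start, i)));
            } else if (c == '+' || c == '-') {
                i++;
                out.add(new Token(Kind.OPERATOR, String.valueOf(c)));
            } else {
                throw new IllegalArgumentException("unexpected '" + c + "' at " + i);
            }
        }
        return List.copyOf(out);
    }
}
```

A recursive descent parser would then consume this token list left to right, requiring `ROOT` first and dispatching on `DOT`, `DOT_DOT`, or `LBRACKET` for each step.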
| 79 | + |
| 80 | +### Evaluator plan |
| 81 | + |
| 82 | +The evaluator is a pure function over immutable inputs, implemented as static methods: |
| 83 | + |
| 84 | +- Maintain a worklist of “current nodes” (starting with the document root). |
| 85 | +- For each step: |
| 86 | + - **Child**: for objects, pick the member by key; for arrays, apply the step to each element that is an object (per the article's example behavior). |
| 87 | + - **RecursiveDescent**: walk the subtree of each current node (object members + array elements) and apply the selector to every node encountered. |
| 88 | + - **Wildcard**: for objects select all member values; for arrays select all elements. |
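| 89 | + - **ArrayIndex / ArraySlice / Union**: index-based selectors apply only when the current node is an array; name selectors in a union (`['a','b']`) apply when it is an object. |
| 90 | + - **Filter**: apply only when the current node is an array; keep the elements for which the predicate is true. |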
| 91 | + |
| 92 | +Ordering: |
| 93 | +- Preserve the traversal order implied by iterating object members (`JsonObject.members()` is order-preserving) and by array element order. |
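The worklist approach described above can be sketched as follows. This is a simplified illustration, not the module's evaluator: `EvalSketch` and its nested step records are hypothetical, plain `Map` and `List` values stand in for `JsonObject` and `JsonArray`, only four step kinds are covered, and `Child` matches object members only. The array fan-out for `Child` is left out because combining it with recursive descent would need care to avoid matching the same member twice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical evaluator sketch; Map/List stand in for JSON objects/arrays.
final class EvalSketch {
    sealed interface Step permits Child, Wildcard, ArrayIndex, RecursiveDescent {}
    record Child(String name) implements Step {}
    record Wildcard() implements Step {}
    record ArrayIndex(int index) implements Step {}
    record RecursiveDescent(Step selector) implements Step {}

    // Pure function: folds the steps over a worklist of current nodes.
    static List<Object> select(Object root, List<Step> steps) {
        List<Object> current = List.of(root);
        for (Step step : steps) {
            List<Object> next = new ArrayList<>();
            for (Object node : current) apply(step, node, next);
            current = next;
        }
        return current;
    }

    static void apply(Step step, Object node, List<Object> out) {
        switch (step) {
            case Child c -> {
                // Objects only in this sketch; see the note on arrays in the lead-in.
                if (node instanceof Map<?, ?> m && m.containsKey(c.name())) out.add(m.get(c.name()));
            }
            case Wildcard w -> out.addAll(children(node));
            case ArrayIndex ix -> {
                if (node instanceof List<?> xs) {
                    // Negative indices count from the end of the array.
                    int i = ix.index() < 0 ? xs.size() + ix.index() : ix.index();
                    if (i >= 0 && i < xs.size()) out.add(xs.get(i));
                }
            }
            case RecursiveDescent rd -> {
                // Pre-order walk: apply the selector here, then descend into each child.
                apply(rd.selector(), node, out);
                for (Object child : children(node)) apply(rd, child, out);
            }
        }
    }

    // Child values in iteration order: member values for objects, elements for arrays.
    static List<Object> children(Object node) {
        if (node instanceof Map<?, ?> m) return new ArrayList<>(m.values());
        if (node instanceof List<?> xs) return new ArrayList<>(xs);
        return List.of();
    }
}
```

Result order here follows the iteration order of the stand-in containers, so an order-preserving map such as `LinkedHashMap` would be needed to mirror the ordering guarantee stated above.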
| 94 | + |
| 95 | +### Tests (TDD baseline) |
| 96 | + |
| 97 | +Add tests that correspond 1:1 with every example on the article page: |
| 98 | + |
| 99 | +- Use the article’s sample document (embedded as a Java text block) and parse with `Json.parse(...)`. |
| 100 | +- Assertions check matched values by rendering to JSON (`JsonValue.toString()`) and comparing to expected fragments. |
| 101 | +- Every test method logs an INFO banner at start, via a common base class. |
| 102 | + |
| 103 | +### Verification |
| 104 | + |
| 105 | +Run focused module tests with logging: |
| 106 | + |
| 107 | +```bash |
| 108 | +$(command -v mvnd || command -v mvn || command -v ./mvnw) -pl json-java21-jsonpath test -Djava.util.logging.ConsoleHandler.level=FINE |
| 109 | +``` |
| 110 | + |
| 111 | +Run full suite once stable: |
| 112 | + |
| 113 | +```bash |
| 114 | +$(command -v mvnd || command -v mvn || command -v ./mvnw) test -Djava.util.logging.ConsoleHandler.level=INFO |
| 115 | +``` |
| 116 | + |