superseded release prep draft#1
Conversation
9c77e4d to
cb38f50
Compare
cb38f50 to
a24fafe
Compare
a24fafe to
a0db687
Compare
Greptile SummaryThis PR prepares the v1.0.0 public snapshot of srcwalk, replacing the legacy flag-based CLI with an intent-first subcommand design (
Confidence Score: 4/5Safe to merge for the large majority of use cases; one path in the new access search accumulates all matches into memory before output is bounded. The CLI restructure, diff parser, review flow-map pipeline, and compare command are all well-guarded with explicit caps and fallback paths. The only live defect is in search_access: when a user omits --limit, every matched hit is cloned into a single vector and fully formatted before the budget truncation runs, which can cause significant memory pressure for common identifiers in large repos. Every comparable command in this PR applies a default limit before formatting; this one does not. src/search/access.rs — the default page_size path; src/commands/diff.rs — quoted-path parsing edge case in parse_diff_git_paths Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[CLI input] --> B{subcommand?}
B --> C[show / root path]
B --> D[discover]
B --> E[review]
B --> F[compare]
B --> G[trace callers/callees]
B --> H[deps / assess / overview]
D --> D1{--as / --match}
D1 --> D2[symbol search]
D1 --> D3[text search]
D1 --> D4[text-OR multi-term]
D1 --> D5[file glob]
D1 --> D6[access pattern search]
E --> E1{target type}
E1 --> E2[local review run_local_review]
E1 --> E3[change review run_change_review]
E3 --> E4[collect_diff_evidence git diff / ls-files]
E4 --> E5[attach_symbols outline + tree-sitter]
E3 --> E6[append_change_flow_maps render_flow_map x MAX_FLOW_MAPS=5]
F --> F1[resolve_compare_target x 2]
F1 --> F2[compare_features AST walk]
D6 --> D6a[collect_access_hits parallel walker]
D6a --> D6b[format_access_result page_size = limit OR usize::MAX]
Reviews (1): Last reviewed commit: "feat: prepare v1.0.0 public update" | Re-trigger Greptile |
| }); | ||
|
|
||
| let total_found = hits.len(); |
There was a problem hiding this comment.
When --limit is omitted, page_size becomes usize::MAX and every matched AccessHit is cloned into shown_hits, then fully formatted into a single String before any budget truncation is applied. For a frequently-used identifier in a large codebase (e.g. count, err, result) this can allocate hundreds of megabytes of intermediate data. Every comparable command in this PR has a sensible default — run_diff defaults to 20, text-OR caps per-term results at DEFAULT_TEXT_OR_TERM_LIMIT (10) — but search_access has none, violating the bounded-agent-output custom rule. A default cap (e.g. 50 or 100) should be used when no --limit is provided.
Rule Used: CLI output shown to users or agents should be boun... (source)
| if let Some(rest) = rest.strip_prefix("\"a/") { | ||
| let (old_path, new_path) = rest.rsplit_once("\" \"b/")?; | ||
| return Some(( | ||
| Some(clean_patch_path(old_path)), | ||
| clean_patch_path(new_path.trim_end_matches('"')), | ||
| )); | ||
| } |
There was a problem hiding this comment.
rsplit_once misparses diff --git header for quoted paths containing b/
parse_diff_git_paths handles the quoted-path form ("a/..." "b/...") but if the filename contains the literal substring " "b/, the split point will be wrong. This can silently produce an incorrect path for the resulting DiffFile, causing the scope filter to skip the file entirely.
| if let Some(rest) = rest.strip_prefix("\"a/") { | |
| let (old_path, new_path) = rest.rsplit_once("\" \"b/")?; | |
| return Some(( | |
| Some(clean_patch_path(old_path)), | |
| clean_patch_path(new_path.trim_end_matches('"')), | |
| )); | |
| } | |
| if let Some(rest) = rest.strip_prefix("\"a/") { | |
| // Split on the last `" "b/` to correctly handle paths that themselves contain ` b/` | |
| let (old_path, new_path) = rest.rsplit_once("\" \"b/")?; | |
| let old_path = clean_patch_path(old_path.trim_end_matches('\\')); | |
| let new_path = clean_patch_path(new_path.trim_end_matches('"')); | |
| return Some((Some(old_path), new_path)); | |
| } |
Greptile SummaryThis PR prepares the v1.0.0 public snapshot of srcwalk by introducing an intent-first CLI surface (
Confidence Score: 4/5Safe to merge; the two findings are maintenance concerns in newly added code, not regressions in existing behavior. The fallback detection in run_local_review ties correctness to the exact wording of two error messages in decision_flow.rs, so a future message edit could silently break the graceful fallback. The untracked-file line-count reads entire files into memory, which could be expensive on large repos. Both are confined to new code paths; the validated test suite and CI matrix give reasonable confidence. src/commands/review.rs (fallback error detection), src/commands/diff.rs (untracked file line-count) Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
CLI[srcwalk CLI input] --> Router{cli_run::run}
Router -->|show path:range| Show[run_show]
Router -->|discover query| Discover{Mode?}
Router -->|context target| Flow[run_flow_with_artifact]
Router -->|trace callers/callees| Trace[run_callers/callees]
Router -->|review range| Review{Review type?}
Router -->|compare A B| Compare[run_compare]
Router -->|overview| Overview[run_map]
Discover -->|symbol| Symbol[run_filtered]
Discover -->|text| Text[run_text_filtered]
Discover -->|file| Files[run_files]
Discover -->|access| Access[run_access_filtered]
Review -->|local target| LocalReview[run_local_review]
Review -->|rev range/staged| ChangeReview[run_change_review]
LocalReview --> Budget[apply_review_budget]
ChangeReview --> Budget
Budget --> Output[emit_result]
Reviews (1): Last reviewed commit: "feat: prepare v1.0.0 public update" | Re-trigger Greptile |
| fn is_flow_map_fallback_error(err: &SrcwalkError) -> bool { | ||
| matches!( | ||
| err, | ||
| SrcwalkError::InvalidQuery { reason, .. } | ||
| if reason.contains("requires tree-sitter source support") | ||
| || reason.contains("currently supports Rust") | ||
| ) | ||
| } |
There was a problem hiding this comment.
is_flow_map_fallback_error couples review.rs to the exact wording of error messages in decision_flow.rs via substring matching. The check reason.contains("currently supports Rust") would silently stop matching if the supported-language list is reworded or reordered, causing run_local_review to propagate a hard error instead of falling back to file-level evidence.
| fn is_flow_map_fallback_error(err: &SrcwalkError) -> bool { | |
| matches!( | |
| err, | |
| SrcwalkError::InvalidQuery { reason, .. } | |
| if reason.contains("requires tree-sitter source support") | |
| || reason.contains("currently supports Rust") | |
| ) | |
| } | |
| fn is_flow_map_fallback_error(err: &SrcwalkError) -> bool { | |
| matches!( | |
| err, | |
| SrcwalkError::InvalidQuery { reason, .. } | |
| if reason.contains("requires tree-sitter source support") | |
| || reason.contains("currently supports") | |
| ) | |
| } |
| let line_count = | ||
| std::fs::read_to_string(&full_path).map_or(0, |content| content.lines().count() as u32); |
There was a problem hiding this comment.
untracked_files reads the entire file content just to count newlines. For large untracked text files this is O(file size) in memory and time; a byte-buffer scan or sentinel value would avoid pulling large files into memory.
| let line_count = | |
| std::fs::read_to_string(&full_path).map_or(0, |content| content.lines().count() as u32); | |
| let line_count = count_lines_cheap(&full_path); |
Superseded by #2.