Skip to content

feat(types): extend FoundString data model with non_exhaustive and builder pattern#123

Merged
unclesp1d3r merged 10 commits intomainfrom
enhance_foundstring_model
Jan 18, 2026
Merged

feat(types): extend FoundString data model with non_exhaustive and builder pattern#123
unclesp1d3r merged 10 commits intomainfrom
enhance_foundstring_model

Conversation

@unclesp1d3r
Copy link
Member

@unclesp1d3r unclesp1d3r commented Jan 18, 2026

Summary

This PR extends the FoundString data model to support symbol demangling preservation and debug score breakdowns, completing the foundational ticket for the enhanced string analysis pipeline.

Impact: 27 files changed (+592 additions, -126 deletions)
Risk Level: Low - additive changes with backwards compatibility
Review Time: ~15-20 minutes


What Changed

Source Changes

File Change Type Description
src/types.rs Modified Added #[non_exhaustive], new fields, constructor, and builder pattern
src/extraction/*.rs Modified Documentation updates for code block formatting
src/classification/*.rs Modified Documentation updates for code block formatting

Test Changes

File Change Type Description
tests/test_deduplication.rs Modified Updated to use new constructor/builder pattern
tests/classification_integration.rs Modified Updated to use new constructor/builder pattern

Documentation Changes

File Change Type Description
docs/src/*.md Modified Updated code blocks to use text formatting
project_plan/tickets/*.md Modified Marked acceptance criteria as completed

Configuration Changes

File Change Type Description
.github/prompts/cicheck.prompt.md Added CI check automation prompt
.serena/project.yml Added Serena project configuration

Why These Changes

1. API Stability (#[non_exhaustive])

The FoundString struct is part of the public API. Adding #[non_exhaustive] ensures:

  • External code cannot use struct literal syntax
  • New fields can be added without breaking downstream consumers
  • Complies with AGENTS.md guidelines for public structs

2. Demangling Support (original_text)

The original_text: Option<String> field enables:

  • Preserving mangled C++/Rust symbol names before demangling
  • Users can see both human-readable and original forms
  • Required for the demangling pipeline in future tickets

3. Debug Score Breakdown

Three new optional fields provide score transparency:

  • section_weight: Option<i32> - Weight from section location
  • semantic_boost: Option<i32> - Boost from semantic classification
  • noise_penalty: Option<i32> - Penalty for noise characteristics

4. Builder Pattern

The new constructor and builder methods provide:

  • Type-safe construction without struct literals
  • Fluent API for setting optional fields
  • Clear documentation of required vs optional fields

Type of Change

  • New feature (non-breaking change adding functionality)
  • Documentation update
  • Bug fix
  • Breaking change
  • Performance improvement

Technical Implementation

New FoundString API

// Constructor with required fields only
let fs = FoundString::new(
    text,       // String - the extracted text
    encoding,   // Encoding - ASCII, UTF-8, UTF-16, etc.
    offset,     // u64 - file offset
    length,     // u32 - byte length
    source,     // StringSource - where it came from
);

// Builder pattern for optional fields
let fs = FoundString::new(text, encoding, offset, length, source)
    .with_rva(0x1000)
    .with_section(".rodata".to_string())
    .with_score(10)
    .with_confidence(0.9)
    .with_original_text("_ZN4test8functionE".to_string())
    .with_section_weight(50)
    .with_semantic_boost(25)
    .with_noise_penalty(-10);

Serde Behavior

Optional fields use #[serde(skip_serializing_if = "Option::is_none", default)]:

  • Fields are omitted from JSON when None
  • Fields default to None when missing during deserialization
  • Backwards compatible with existing JSON data

How Has This Been Tested?

Unit Tests

New tests in src/types.rs:

  • test_found_string_serde_optional_fields_none - Verifies None fields are omitted
  • test_found_string_serde_optional_fields_some - Verifies Some fields are included
  • test_found_string_serde_roundtrip - Full serialization roundtrip
  • test_found_string_deserialize_missing_optional_fields - Backwards compatibility

Integration Tests

Updated existing tests to use new API:

  • tests/classification_integration.rs - Uses FoundString::new()
  • tests/test_deduplication.rs - Uses builder pattern

CI Validation

All 343 tests passing
cargo fmt -- --check (passed)
cargo clippy -- -D warnings (passed)
cargo test (passed)

Breaking Changes

None - This PR maintains full backwards compatibility:

  1. Existing code using FoundString fields continues to work
  2. Existing JSON serialization/deserialization continues to work
  3. New fields default to None when not provided

Note: External code using struct literal syntax will need to migrate to the constructor/builder pattern. This is intentional to ensure future field additions don't break downstream code.


Risk Assessment

Factor Score Details
Size 2/10 Large line count but mostly documentation/formatting
Complexity 2/10 Simple additive changes, builder pattern is straightforward
Test Coverage 1/10 Comprehensive tests added, all existing tests pass
Dependencies 1/10 No new dependencies
Security 1/10 No security implications

Overall Risk: Low (1.4/10)


Review Checklist

General

  • Code follows project style guidelines (AGENTS.md)
  • Self-review completed
  • Comments added for complex logic
  • No debugging code left
  • No sensitive data exposed

Code Quality

  • No code duplication
  • Functions are focused and small
  • Variable names are descriptive
  • #[must_use] on builder methods

Testing

  • All new code is covered by tests
  • Tests are meaningful and not just for coverage
  • Edge cases are tested (None/Some, missing fields)

Documentation

  • Documentation is clear and accurate
  • Examples are provided in doc comments
  • API changes are documented

Rust-Specific

  • No unsafe code
  • No clippy warnings
  • Proper serde attributes for optional fields

Dependencies

This PR is the foundational ticket that other work depends on:

  • Demangling pipeline (will use original_text)
  • Debug mode CLI flag (will use score breakdown fields)
  • Ranking system (will populate score breakdown fields)

Files Changed Summary

src/types.rs                          +247 lines  (core changes)
tests/test_deduplication.rs            -22 lines  (use new API)
tests/classification_integration.rs    -13 lines  (use new API)
docs/src/*.md                          ~60 lines  (formatting)
.github/prompts/*.md                   +25 lines  (new prompts)
.serena/*                              +90 lines  (config)

Generated with Claude Code

unclesp1d3r and others added 8 commits January 17, 2026 21:55
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Introduced `original_text` field to preserve pre-demangled text.
- Updated related extraction and classification modules to initialize this field.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Added unit tests to verify serialization and deserialization of the FoundString struct, focusing on the handling of optional fields (original_text, section_weight, semantic_boost, noise_penalty). Updated the related project plan ticket to mark acceptance criteria as complete. Also added .serena project configuration files.
Add #[non_exhaustive] attribute to FoundString struct to comply with
AGENTS.md guidelines for public API structs. This prevents external
code from using struct literal syntax, ensuring forward compatibility
when new fields are added.

Add constructor and builder methods:
- new() constructor with required fields
- with_rva(), with_section(), with_tags(), with_score()
- with_confidence(), with_original_text()
- with_section_weight(), with_semantic_boost(), with_noise_penalty()

Update test files to use the new constructor/builder pattern.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 18, 2026 03:50
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 18, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Pull Request Summary: Extend FoundString Data Model with Non-Exhaustive and Builder Pattern

Architectural & API Changes

  • FoundString is marked #[non_exhaustive] to allow safe future extension and to discourage external struct-literal construction.
  • Adds a required-field constructor:
    • FoundString::new(text: String, encoding: Encoding, offset: u64, length: u32, source: StringSource) -> Self
  • Adds fluent, #[must_use] builder-style setters for optional metadata:
    • existing: with_rva, with_section, with_tags, with_score, with_confidence
    • new: with_original_text, with_section_weight, with_semantic_boost, with_noise_penalty
  • New optional fields on FoundString:
    • original_text: Option — preserves pre-demangled symbol text
    • section_weight: Option — section contribution to score (debug)
    • semantic_boost: Option — semantic classifier boost component (debug)
    • noise_penalty: Option — noise penalty component (debug)
  • Serde: each new optional field uses skip_serializing_if = "Option::is_none" and default, so None is omitted and missing fields deserialize to None.
  • Migration note: callers that used struct literal construction must migrate to FoundString::new(...) + builder chain; deserialization and code that treats FoundString as opaque remain compatible.

Binary Format & Extraction Impact

  • All extraction paths and insertion sites updated to use the new constructor/builder and to carry the new optional fields (set to None by default) across:
    • ASCII and UTF-16 extractors (ascii.rs, utf16.rs)
    • Deduplication / canonicalization (dedup.rs)
    • Mach‑O load-command extraction (macho_load_commands.rs)
    • PE resource extractors (pe_resources.rs)
    • extraction/mod.rs and related modules
  • No extraction algorithms, scoring formulas, or control flow were modified. The change is structural: additional optional metadata accompanies each FoundString without altering default behavior.

Semantic Classification & Debugging Accuracy

  • original_text preserves mangled input so demangling operations can be audited and correlated back to source symbols — improves traceability.
  • section_weight, semantic_boost, and noise_penalty expose per-component contributions to the final score, enabling transparent debugging, tuning, and validation of classification heuristics.
  • Classification logic remains unchanged; these fields are instrumentation for inspection. They do not alter runtime classification unless consumers explicitly read or act on them.

Backwards Compatibility & Serialization

  • Backward-compatible for consumers that deserialize or use provided constructors: serialized outputs remain compact (new fields omitted when None) and deserializers are tolerant (missing fields => None).
  • Breaking only for callers that constructed FoundString via struct literal syntax; such code must migrate to the new constructor + builder pattern.

Tests, CI & Surface Area

  • Tests updated to construct FoundString via new constructor + builder (tests/test_deduplication.rs, tests/classification_integration.rs).
  • New unit tests verify serde behavior (skip when None, present when Some) and full round-trip serialization.
  • CI workflow modified (ci-gate aggregation job); documentation and project config updated (.github, docs/, project_plan/, .serena/*).
  • Primary implementation and tests are in src/types.rs (constructor, builders, fields, serde tests); multiple extraction modules updated to initialize new fields.
  • Aggregate impact: 27 files changed, +592 / -126 (per PR description).

Walkthrough

This PR expands the public data model (FoundString) with four optional metadata fields and builder APIs, updates all extraction and test sites to use the new constructors/fields, adds ContainerInfo constructor fields, introduces CI/prompts and a Serena project config, and converts many documentation code fences from rust/json to text.

Changes

Cohort / File(s) Summary
Core Types & Constructors
src/types.rs
Adds/exports new fields on FoundString: original_text: Option<String>, section_weight: Option<i32>, semantic_boost: Option<i32>, noise_penalty: Option<i32>; marks struct #[non_exhaustive]; adds FoundString::new and fluent with_* builders, serde skip-if-none annotations, accessors, and unit tests. Also expands ContainerInfo::new(...) signature to accept imports, exports, and resources.
Extraction Modules
src/extraction/*
src/extraction/ascii.rs, src/extraction/utf16.rs, src/extraction/pe_resources.rs, src/extraction/macho_load_commands.rs, src/extraction/mod.rs, src/extraction/dedup.rs
Update all FoundString initializations and helper conversions to populate the new optional fields (set to None) and adapt to FoundString::new/builders where used. No extraction/control-flow logic changes.
Classification / Semantic
src/classification/mod.rs, src/classification/semantic.rs, docs/src/classification.md
Add classify_unc_path(&self, text: &str) -> Option<Tag> (doc/tests updated) and update examples/tests to construct FoundString via new API and include new optional fields in examples. No change to detection algorithms.
Tests
tests/*
tests/classification_integration.rs, tests/test_deduplication.rs
Replace struct literal initializations with FoundString::new(...) and chained with_* calls; assertions remain functionally equivalent; test scaffolding updated for new fields.
Docs & Examples
docs/src/*, codebase_analysis.md, project_plan/specs/...
Change fenced-code language annotations from rust / json to text across docs and examples; update some example blocks to reflect expanded constructors/fields. Presentation-only changes.
CI / Prompts / Project Config
.github/prompts/cicheck.prompt.md, .github/prompts/simplicity-review.prompt.md, .github/copilot-instructions.md, .github/workflows/ci.yml, .serena/project.yml, .serena/.gitignore
Add CI prompt files, a CI aggregation job (ci-gate) and guidance prompt, add Serena project config and ignore /cache. Minor CI matrix change (ARM test disabled).
Manifest
Cargo.toml
Formatting/whitespace normalization of TOML arrays (authors/keywords/categories).
Project Plan / Tickets
project_plan/tickets/Extend_FoundString_data_model_for_demangling_and_debug_support.md
Mark acceptance checklist items completed (presentation-only).

Sequence Diagram(s)

(omitted — changes are data-model and initialization updates; no new multi-component sequential control flow introduced)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

✨ Fields added, builders sing,
Found strings carry a little more ring.
Docs tidied, CI hums along,
New constructors join the throng.
🛠️ None for now, but ready and strong.

🚥 Pre-merge checks | ✅ 4 | ❌ 4
❌ Failed checks (4 warnings)
Check name Status Explanation Resolution
No Unsafe Code ⚠️ Warning PR introduces no unsafe code blocks, functions, or #[allow(unsafe_code)] attributes across modified files, but crate lacks #![forbid(unsafe_code)] declaration in src/lib.rs. Add #![forbid(unsafe_code)] as the first crate-level attribute in src/lib.rs to enforce compile-time guarantees against unsafe code introduction.
Ascii Only ⚠️ Warning PR contains multiple Unicode violations: em-dashes in prompts and docs, emojis (✅❌⚠️→) in documentation and code comments, and non-ASCII test strings (世界). Replace em-dashes with hyphens, remove emojis using plain text alternatives, use ASCII-only test data or separate UTF-8 validation logic.
File Size Limit ⚠️ Warning src/types.rs exceeds 500-line limit at 556 lines; file must be split into modules per check requirements. Refactor src/types.rs by extracting FoundString implementation and builders into src/types/found_string.rs with proper re-exports to maintain public API visibility.
Section Weight Consistency ⚠️ Warning Section weights are not being consistently applied; all section_weight fields initialize to None and builder method is never invoked in extraction paths. Implement weight assignment in extraction modules per specified ranges (read-only strings: 9.0-10.0, tables: 8.0, read-only data: 7.0, writable: 5.0, code: ≤3.0) and clarify if i32 or f32 type is intended.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(types): extend FoundString data model with non_exhaustive and builder pattern' follows Conventional Commits format with correct type and scope, and accurately describes the main changes in the PR.
Description check ✅ Passed The description comprehensively explains the changes across all modified files, provides technical implementation details, testing approach, and risk assessment—directly related to the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 70.00%.
Error Handling ✅ Passed PR extends FoundString model with optional fields and builder methods without modifying error handling. Existing StringyError infrastructure remains unchanged.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch enhance_foundstring_model

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot added the enhancement New feature or request label Jan 18, 2026
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the FoundString data model to support future demangling functionality and debug-mode score breakdowns. It adds #[non_exhaustive] to the struct and implements a constructor/builder pattern for external API consumers.

Changes:

  • Added #[non_exhaustive] attribute to FoundString with constructor and builder methods for forward compatibility
  • Added four new optional fields: original_text, section_weight, semantic_boost, and noise_penalty
  • Updated all struct instantiations throughout the codebase to include the new fields (initialized to None)
  • Changed documentation code blocks from rust` to text` (reduces syntax highlighting)

Reviewed changes

Copilot reviewed 26 out of 27 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/types.rs Core changes: added #[non_exhaustive], new fields, constructor, builder methods, and comprehensive serde tests
src/extraction/ascii.rs Updated struct literals to include new optional fields
src/extraction/utf16.rs Updated struct literals to include new optional fields
src/extraction/pe_resources.rs Updated struct literals to include new optional fields
src/extraction/mod.rs Updated struct literals to include new optional fields
src/extraction/macho_load_commands.rs Updated struct literals to include new optional fields
src/extraction/dedup.rs Updated struct literals and to_found_string() conversion
src/classification/semantic.rs Updated documentation examples with new fields
src/classification/mod.rs Updated documentation examples with new fields
tests/test_deduplication.rs Updated tests to use new builder pattern
tests/classification_integration.rs Updated helper to use new constructor
docs/src/*.md Changed code block markers from rust to text
.github/copilot-instructions.md Added complete example of #[non_exhaustive] pattern
Cargo.toml Minor formatting cleanup
project_plan/tickets/*.md Marked acceptance criteria as complete
.serena/* New tool configuration files (unrelated to ticket)
.github/prompts/* New AI agent prompt files (unrelated to ticket)

### Example Output

```json
```text
Copy link

Copilot AI Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throughout the documentation files, Rust code blocks have been changed from rust` to text, removing syntax highlighting. This significantly degrades the developer experience by removing color coding and syntax support in documentation viewers. Unless there's a specific technical reason for this change (e.g., incompatibility with a documentation tool), these should remain as ````rust blocks. This affects all documentation in docs/src/ including api.md, architecture.md, binary-formats.md, classification.md, performance.md, and string-extraction.md.

Suggested change
```text
```json

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +89
# list of languages for which language servers are started; choose from:
# al bash clojure cpp csharp
# csharp_omnisharp dart elixir elm erlang
# fortran fsharp go groovy haskell
# java julia kotlin lua markdown
# matlab nix pascal perl php
# powershell python python_jedi r rego
# ruby ruby_solargraph rust scala swift
# terraform toml typescript typescript_vts vue
# yaml zig
# (This list may be outdated. For the current list, see values of Language enum here:
# https://github.com/oraios/serena/blob/main/src/solidlsp/ls_config.py
# For some languages, there are alternative language servers, e.g. csharp_omnisharp, ruby_solargraph.)
# Note:
# - For C, use cpp
# - For JavaScript, use typescript
# - For Free Pascal/Lazarus, use pascal
# Special requirements:
# - csharp: Requires the presence of a .sln file in the project folder.
# - pascal: Requires Free Pascal Compiler (fpc) and optionally Lazarus.
# When using multiple languages, the first language server that supports a given file will be used for that file.
# The first language is the default language and the respective language server will be used as a fallback.
# Note that when using the JetBrains backend, language servers are not used and this list is correspondingly ignored.
languages:
- rust

# the encoding used by text files in the project
# For a list of possible encodings, see https://docs.python.org/3.11/library/codecs.html#standard-encodings
encoding: "utf-8"

# whether to use project's .gitignore files to ignore files
ignore_all_files_in_gitignore: true

# list of additional paths to ignore in all projects
# same syntax as gitignore, so you can use * and **
ignored_paths: []

# whether the project is in read-only mode
# If set to true, all editing tools will be disabled and attempts to use them will result in an error
# Added on 2025-04-18
read_only: false

# list of tool names to exclude. We recommend not excluding any tools, see the readme for more details.
# Below is the complete list of tools for convenience.
# To make sure you have the latest list of tools, and to view their descriptions,
# execute `uv run scripts/print_tool_overview.py`.
#
# * `activate_project`: Activates a project by name.
# * `check_onboarding_performed`: Checks whether project onboarding was already performed.
# * `create_text_file`: Creates/overwrites a file in the project directory.
# * `delete_lines`: Deletes a range of lines within a file.
# * `delete_memory`: Deletes a memory from Serena's project-specific memory store.
# * `execute_shell_command`: Executes a shell command.
# * `find_referencing_code_snippets`: Finds code snippets in which the symbol at the given location is referenced.
# * `find_referencing_symbols`: Finds symbols that reference the symbol at the given location (optionally filtered by type).
# * `find_symbol`: Performs a global (or local) search for symbols with/containing a given name/substring (optionally filtered by type).
# * `get_current_config`: Prints the current configuration of the agent, including the active and available projects, tools, contexts, and modes.
# * `get_symbols_overview`: Gets an overview of the top-level symbols defined in a given file.
# * `initial_instructions`: Gets the initial instructions for the current project.
# Should only be used in settings where the system prompt cannot be set,
# e.g. in clients you have no control over, like Claude Desktop.
# * `insert_after_symbol`: Inserts content after the end of the definition of a given symbol.
# * `insert_at_line`: Inserts content at a given line in a file.
# * `insert_before_symbol`: Inserts content before the beginning of the definition of a given symbol.
# * `list_dir`: Lists files and directories in the given directory (optionally with recursion).
# * `list_memories`: Lists memories in Serena's project-specific memory store.
# * `onboarding`: Performs onboarding (identifying the project structure and essential tasks, e.g. for testing or building).
# * `prepare_for_new_conversation`: Provides instructions for preparing for a new conversation (in order to continue with the necessary context).
# * `read_file`: Reads a file within the project directory.
# * `read_memory`: Reads the memory with the given name from Serena's project-specific memory store.
# * `remove_project`: Removes a project from the Serena configuration.
# * `replace_lines`: Replaces a range of lines within a file with new content.
# * `replace_symbol_body`: Replaces the full definition of a symbol.
# * `restart_language_server`: Restarts the language server, may be necessary when edits not through Serena happen.
# * `search_for_pattern`: Performs a search for a pattern in the project.
# * `summarize_changes`: Provides instructions for summarizing the changes made to the codebase.
# * `switch_modes`: Activates modes by providing a list of their names
# * `think_about_collected_information`: Thinking tool for pondering the completeness of collected information.
# * `think_about_task_adherence`: Thinking tool for determining whether the agent is still on track with the current task.
# * `think_about_whether_you_are_done`: Thinking tool for determining whether the task is truly completed.
# * `write_memory`: Writes a named memory (for future reference) to Serena's project-specific memory store.
excluded_tools: []

# initial prompt for the project. It will always be given to the LLM upon activating the project
# (contrary to the memories, which are loaded on demand).
initial_prompt: ""

project_name: "Stringy"
included_optional_tools: []
Copy link

Copilot AI Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR includes unrelated infrastructure files (.serena/project.yml, .serena/.gitignore, .github/prompts/cicheck.prompt.md, .github/prompts/simplicity-review.prompt.md) that are not part of the ticket's scope to extend the FoundString data model. These should be in a separate PR focused on development tooling setup.

Copilot uses AI. Check for mistakes.
Comment on lines +451 to +556
#[cfg(test)]
mod tests {
use super::*;

/// Creates a test FoundString with all optional fields set to None
fn create_test_found_string() -> FoundString {
FoundString {
text: "test_string".to_string(),
original_text: None,
encoding: Encoding::Ascii,
offset: 0x1000,
rva: Some(0x2000),
section: Some(".rodata".to_string()),
length: 11,
tags: vec![Tag::Url],
score: 100,
section_weight: None,
semantic_boost: None,
noise_penalty: None,
source: StringSource::SectionData,
confidence: 0.85,
}
}

#[test]
fn test_found_string_serde_optional_fields_none() {
// Test that optional fields are skipped when None
let found_string = create_test_found_string();
let json = serde_json::to_string(&found_string).expect("Serialization failed");

// Verify optional fields are not present in JSON
assert!(!json.contains("original_text"));
assert!(!json.contains("section_weight"));
assert!(!json.contains("semantic_boost"));
assert!(!json.contains("noise_penalty"));

// Verify required fields are present
assert!(json.contains("text"));
assert!(json.contains("encoding"));
assert!(json.contains("offset"));
}

#[test]
fn test_found_string_serde_optional_fields_some() {
// Test that optional fields are included when Some
let mut found_string = create_test_found_string();
found_string.original_text = Some("_ZN4test6mangled".to_string());
found_string.section_weight = Some(50);
found_string.semantic_boost = Some(25);
found_string.noise_penalty = Some(-10);

let json = serde_json::to_string(&found_string).expect("Serialization failed");

// Verify optional fields are present in JSON
assert!(json.contains("original_text"));
assert!(json.contains("_ZN4test6mangled"));
assert!(json.contains("section_weight"));
assert!(json.contains("semantic_boost"));
assert!(json.contains("noise_penalty"));
}

#[test]
fn test_found_string_serde_roundtrip() {
// Test serialization/deserialization roundtrip with all fields
let mut found_string = create_test_found_string();
found_string.original_text = Some("mangled_name".to_string());
found_string.section_weight = Some(75);
found_string.semantic_boost = Some(30);
found_string.noise_penalty = Some(-5);

let json = serde_json::to_string(&found_string).expect("Serialization failed");
let deserialized: FoundString =
serde_json::from_str(&json).expect("Deserialization failed");

assert_eq!(found_string.text, deserialized.text);
assert_eq!(found_string.original_text, deserialized.original_text);
assert_eq!(found_string.section_weight, deserialized.section_weight);
assert_eq!(found_string.semantic_boost, deserialized.semantic_boost);
assert_eq!(found_string.noise_penalty, deserialized.noise_penalty);
}

#[test]
fn test_found_string_deserialize_missing_optional_fields() {
// Test that missing optional fields default to None during deserialization
let json = r#"{
"text": "test",
"encoding": "Ascii",
"offset": 0,
"rva": null,
"section": null,
"length": 4,
"tags": [],
"score": 0,
"source": "SectionData",
"confidence": 1.0
}"#;

let deserialized: FoundString = serde_json::from_str(json).expect("Deserialization failed");

assert_eq!(deserialized.text, "test");
assert_eq!(deserialized.original_text, None);
assert_eq!(deserialized.section_weight, None);
assert_eq!(deserialized.semantic_boost, None);
assert_eq!(deserialized.noise_penalty, None);
}
}
Copy link

Copilot AI Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tests validate serialization/deserialization of the new fields but don't test the new constructor FoundString::new() or the builder methods (with_rva(), with_section(), etc.). Consider adding a test that demonstrates the intended API usage pattern, such as creating a FoundString with the constructor and then chaining builder methods. This would serve as both a test and documentation of the API.

Copilot uses AI. Check for mistakes.
Comment on lines +457 to +472
FoundString {
text: "test_string".to_string(),
original_text: None,
encoding: Encoding::Ascii,
offset: 0x1000,
rva: Some(0x2000),
section: Some(".rodata".to_string()),
length: 11,
tags: vec![Tag::Url],
score: 100,
section_weight: None,
semantic_boost: None,
noise_penalty: None,
source: StringSource::SectionData,
confidence: 0.85,
}
Copy link

Copilot AI Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test helper function create_test_found_string() uses struct literal construction with #[non_exhaustive]. This will cause compilation errors for external crates that cannot construct FoundString using struct literals. Since this is in the same crate module (under #[cfg(test)]), it compiles, but it sets a poor example. Consider using the new FoundString::new() constructor followed by builder methods to demonstrate the intended usage pattern.

Suggested change
FoundString {
text: "test_string".to_string(),
original_text: None,
encoding: Encoding::Ascii,
offset: 0x1000,
rva: Some(0x2000),
section: Some(".rodata".to_string()),
length: 11,
tags: vec![Tag::Url],
score: 100,
section_weight: None,
semantic_boost: None,
noise_penalty: None,
source: StringSource::SectionData,
confidence: 0.85,
}
FoundString::new(
"test_string".to_string(),
Encoding::Ascii,
0x1000,
11,
StringSource::SectionData,
)
.with_rva(0x2000)
.with_section(".rodata".to_string())
.with_tags(vec![Tag::Url])
.with_score(100)
.with_confidence(0.85)

Copilot uses AI. Check for mistakes.
Add a gate job that always runs and aggregates the status of all CI jobs.
This enables branch protection rules to require "CI Gate" as a status check,
which will pass if:
- All jobs succeeded, OR
- Conditional jobs were skipped (due to path filters) and other jobs passed

The gate job fails only if any job actually failed or was cancelled.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (11)
docs/src/api.md (1)

11-33: Sync FoundString documentation with the new public API.

The struct example omits the new optional metadata fields, the confidence field, and the #[non_exhaustive] attribute, so external users will not see the current API. Please update this block to match src/types.rs and consider adding a short constructor example for FoundString::new to avoid drift.

docs/src/classification.md (1)

173-189: Update the public example to use FoundString::new.

Because FoundString is now non-exhaustive, the struct literal in the docs will not compile for external users. Use the constructor instead.

Proposed doc update
-let found_string = FoundString {
-    text: "C:\\Windows\\System32\\cmd.exe".to_string(),
-    encoding: Encoding::Ascii,
-    offset: 0,
-    rva: None,
-    section: None,
-    length: 27,
-    tags: Vec::new(),
-    score: 0,
-    source: StringSource::SectionData,
-    confidence: 1.0,
-};
+let found_string = FoundString::new(
+    "C:\\Windows\\System32\\cmd.exe".to_string(),
+    Encoding::Ascii,
+    0,
+    27,
+    StringSource::SectionData,
+);
src/extraction/utf16.rs (2)

823-842: Use FoundString::new() constructor instead of struct literal.

The PR introduces FoundString::new() and builder methods, yet this code still uses struct literal initialization. With #[non_exhaustive] on FoundString, external crates cannot construct via literals. While internal code can, using the constructor is more maintainable.

♻️ Suggested refactor
-                        found_strings.push(FoundString {
-                            text,
-                            original_text: None,
-                            encoding: match byte_order {
-                                ByteOrder::LE => Encoding::Utf16Le,
-                                ByteOrder::BE => Encoding::Utf16Be,
-                                ByteOrder::Auto => unreachable!(),
-                            },
-                            offset: start as u64,
-                            rva: None,
-                            section: None,
-                            length: bytes_for_decoding.len() as u32,
-                            tags: Vec::new(),
-                            score: 0,
-                            section_weight: None,
-                            semantic_boost: None,
-                            noise_penalty: None,
-                            source: StringSource::SectionData,
-                            confidence: utf16_confidence,
-                        });
+                        let encoding = match byte_order {
+                            ByteOrder::LE => Encoding::Utf16Le,
+                            ByteOrder::BE => Encoding::Utf16Be,
+                            ByteOrder::Auto => unreachable!(),
+                        };
+                        found_strings.push(
+                            FoundString::new(
+                                text,
+                                encoding,
+                                start as u64,
+                                bytes_for_decoding.len() as u32,
+                                StringSource::SectionData,
+                            )
+                            .with_confidence(utf16_confidence)
+                        );

1-1036: File exceeds 500 line limit at 1273 lines.

Per coding guidelines, files should stay under 500 lines. Consider splitting into submodules:

  • Core extraction logic (extract.rs)
  • Confidence scoring (scoring.rs)
  • Printability checks (printable.rs)
  • Section-aware extraction (section.rs)
src/extraction/mod.rs (2)

480-522: DRY violation: Symbol handling duplicated between extract() and extract_canonical().

Lines 480-522 and 583-626 contain nearly identical code for adding import/export symbols. Extract this into a helper method.

♻️ Suggested refactor
fn collect_symbol_strings(container_info: &ContainerInfo) -> Vec<FoundString> {
    let mut symbols = Vec::new();
    for import in &container_info.imports {
        symbols.push(FoundString::new(
            import.name.clone(),
            Encoding::Utf8,
            0,
            import.name.len() as u32,
            StringSource::ImportName,
        ));
    }
    for export in &container_info.exports {
        symbols.push(FoundString::new(
            export.name.clone(),
            Encoding::Utf8,
            0,
            export.name.len() as u32,
            StringSource::ExportName,
        ));
    }
    symbols
}

Also applies to: 583-626


1-948: File exceeds 500 line limit at 1517 lines.

This file contains the trait, config, extractor implementation, helper functions, and extensive tests. Consider splitting:

  • trait.rs - StringExtractor trait
  • config.rs - ExtractionConfig (already exists but could absorb more)
  • basic.rs - BasicExtractor implementation
  • Move tests to tests/ or mod/tests.rs
src/extraction/macho_load_commands.rs (1)

101-116: Use FoundString::new() with builder methods for dylib and rpath strings.

Both extract_dylib_strings and extract_rpath_strings use struct literals. Refactor to use the constructor and .with_tags() builder.

♻️ Suggested refactor for extract_dylib_strings
-        strings.push(FoundString {
-            text: lib.to_string(),
-            original_text: None,
-            encoding: Encoding::Utf8,
-            source: StringSource::LoadCommand,
-            tags,
-            section: None,
-            offset: 0,
-            rva: None,
-            length,
-            score: 0,
-            section_weight: None,
-            semantic_boost: None,
-            noise_penalty: None,
-            confidence: 1.0,
-        });
+        strings.push(
+            FoundString::new(
+                lib.to_string(),
+                Encoding::Utf8,
+                0,
+                length,
+                StringSource::LoadCommand,
+            )
+            .with_tags(tags)
+        );

Also applies to: 134-149

src/extraction/pe_resources.rs (2)

427-443: Use FoundString::new() with builder methods.

Same pattern as other files - refactor to use constructor and builders like .with_section() and .with_tags().

♻️ Suggested refactor
-                let found_string = FoundString {
-                    text,
-                    original_text: None,
-                    encoding: Encoding::Utf16Le,
-                    offset: 0, // pelite doesn't provide offsets easily
-                    rva: None,
-                    section: Some(".rsrc".to_string()),
-                    length,
-                    tags: vec![Tag::Version, Tag::Resource],
-                    score: 0,
-                    section_weight: None,
-                    semantic_boost: None,
-                    noise_penalty: None,
-                    source: StringSource::ResourceString,
-                    confidence: 1.0,
-                };
+                let found_string = FoundString::new(
+                    text,
+                    Encoding::Utf16Le,
+                    0,
+                    length,
+                    StringSource::ResourceString,
+                )
+                .with_section(".rsrc".to_string())
+                .with_tags(vec![Tag::Version, Tag::Resource]);

1-828: File exceeds 500 line limit at 1442 lines.

Consider splitting:

  • pe_resources/mod.rs - public API and orchestration
  • pe_resources/version_info.rs - VERSIONINFO extraction
  • pe_resources/string_table.rs - STRINGTABLE extraction
  • pe_resources/manifest.rs - MANIFEST extraction
  • pe_resources/encoding.rs - decode_utf16le, detect_manifest_encoding
src/classification/semantic.rs (2)

1-1640: File exceeds the 500-600 line limit (1640 lines).

Per coding guidelines, files should stay under 500-600 lines. Consider splitting this module - for example, extract the path classification (classify_posix_path, classify_windows_path, classify_unc_path, classify_registry_path) and their validation helpers into a separate paths.rs submodule, and keep URL/domain/IP classification here.


1018-1035: Consider using the constructor in test helper for consistency.

While same-crate tests can use struct literals, using FoundString::new() would exercise the public API and automatically track any signature changes.

Suggested fix
 fn create_test_string(text: &str) -> FoundString {
-    FoundString {
-        text: text.to_string(),
-        original_text: None,
-        encoding: Encoding::Ascii,
-        offset: 0,
-        rva: None,
-        section: None,
-        length: text.len() as u32,
-        tags: Vec::new(),
-        score: 0,
-        section_weight: None,
-        semantic_boost: None,
-        noise_penalty: None,
-        source: StringSource::SectionData,
-        confidence: 1.0,
-    }
+    FoundString::new(
+        text.to_string(),
+        Encoding::Ascii,
+        0,
+        text.len() as u32,
+        StringSource::SectionData,
+    )
 }
🤖 Fix all issues with AI agents
In
`@project_plan/tickets/Extend_FoundString_data_model_for_demangling_and_debug_support.md`:
- Around line 25-30: The acceptance checklist item claiming "No breaking changes
to existing code that creates FoundString instances" is incorrect because adding
#[non_exhaustive] makes struct literal construction a breaking change for
external users; update the wording to acknowledge this and instruct downstream
users to migrate to the provided constructor and builders by naming the symbols
FoundString::new and the builder methods (e.g., FoundString::builder or similar)
so callers switch from struct literals to the new API; also add a short
migration note explaining that existing direct field initialization must be
replaced with FoundString::new(...) or the builder methods to set optional
fields.

In `@src/classification/mod.rs`:
- Around line 31-46: The doc example currently constructs FoundString via struct
literal; update it to use the new constructor and builder pattern by calling
FoundString::new() and chaining the builder methods (e.g., .text(...),
.encoding(...), .offset(...), .length(...), .source(...), .confidence(...),
etc.) to set the same fields as the example so the docs show the preferred API
(refer to FoundString::new and the associated builder method names in the
implementation).

In `@src/classification/semantic.rs`:
- Around line 21-36: The doc example is constructing FoundString via a struct
literal but FoundString is now #[non_exhaustive] and must be created with the
public constructor/builder API; update the example to call FoundString::new(...)
and chain the available builder methods (e.g., .original_text(...),
.encoding(...), .offset(...), .length(...), .tags(...), .source(...),
.confidence(...), etc.) to set the same fields shown in the literal so the
example uses the public FoundString::new and builder methods instead of a struct
literal.
- Around line 353-368: The doc example constructs a FoundString via direct
struct literal; update it to use the public constructor/builder API (the same
pattern used in the module-level example) so the example matches the external
API surface. Locate the example that instantiates FoundString and replace the
literal initialization with calls to FoundString::new(...) or the
FoundStringBuilder (whichever the module-level docs use), setting text,
encoding, offset, length, source, confidence and any optional fields via the
builder methods to mirror the existing public constructor/builder usage.

In `@src/extraction/ascii.rs`:
- Around line 231-246: Replace the repeated FoundString struct literals in
src/extraction/ascii.rs with calls to FoundString::new so you don't duplicate
default values; for each creation site (the three places constructing
FoundString with many None/Vec::new/0/1.0 literals) call FoundString::new and
pass only the varying fields (e.g., text, offset, length, Encoding::Ascii and
StringSource::SectionData or whatever differs) and let the constructor provide
defaults for fields like original_text, rva, section, tags, score,
section_weight, semantic_boost, noise_penalty, and confidence; update all three
creation sites (the three struct literal blocks) to use FoundString::new
consistently.

In `@src/extraction/dedup.rs`:
- Around line 330-345: Replace direct struct construction of FoundString in test
helpers with the public constructor FoundString::new(...) to follow the
preferred API; locate the literal instance (the block that sets text,
original_text, encoding, offset, rva, section, length, tags, score,
section_weight, semantic_boost, noise_penalty, source, confidence) and call
FoundString::new with the appropriate arguments (text, encoding, offset,
section, length, tags, score, source, confidence) letting the constructor
compute rva and default optional fields instead of manually setting them; apply
the same replacement for the other occurrences referenced (the similar literals
around the noted ranges) so tests use FoundString::new consistently.
- Around line 283-298: Replace the manual struct literal in to_found_string()
with the public constructor FoundString::new(...) so the conversion uses the
canonical API; update the call to pass the same values currently set in the
literal (text, original_text, encoding, offset, rva, section, length, tags,
score, section_weight, semantic_boost, noise_penalty, source, confidence)
adapting to FoundString::new's parameter types (wrap options or pass bare values
as required) and ensure to_found_string() returns the value produced by
FoundString::new rather than constructing the struct directly.

In `@src/extraction/mod.rs`:
- Around line 765-780: The UTF-8 extraction path is still creating FoundString
with a struct literal; change it to use the refactored FoundString constructor
used in the other extraction path (e.g., FoundString::new or the project's
equivalent) instead of the literal. Construct the FoundString with the same
values: text, original_text = None, encoding, offset = absolute_offset, rva,
section = section.name.clone(), length = length as u32, tags = Vec::new(), score
= 0, section_weight = None, semantic_boost = None, noise_penalty = None, source
= StringSource::SectionData, and confidence; ensure the call matches the
constructor signature used elsewhere so both UTF-8 and other extraction paths
share the same creation code for consistency.
- Around line 484-499: Replace the inline struct literal used when pushing
import names into all_strings with the new constructor FoundString::new(...):
locate the push that currently builds FoundString { text: import.name.clone(),
original_text: None, encoding: Encoding::Utf8, offset: 0, rva: None, section:
None, length, tags: Vec::new(), score: 0, section_weight: None, semantic_boost:
None, noise_penalty: None, source: StringSource::ImportName, confidence: 1.0 }
and call FoundString::new passing the equivalent values (text =
import.name.clone(), original_text = None, encoding = Encoding::Utf8, offset =
0, rva = None, section = None, length = length, tags = Vec::new(), score = 0,
section_weight = None, semantic_boost = None, noise_penalty = None, source =
StringSource::ImportName, confidence = 1.0) so all FoundString constructions in
this file use the new constructor.

In `@src/types.rs`:
- Around line 257-280: Rename the field `section_weight` in the struct (in
src/types.rs) to a clearer name like `section_weight_contribution` or
`section_score_component` to avoid confusion with `SectionInfo.weight` (which is
an f32); update all references/usages of `section_weight` across the codebase
(serialization/deserialization, docs, tests) and keep the same serde attributes
and Option<i32> type and doc comment content (adjust field name inside the docs
to match), leaving `semantic_boost` and `noise_penalty` unchanged.
- Around line 455-473: The test helper create_test_found_string should construct
the FoundString via its public constructor/builder instead of a struct literal;
replace the literal in create_test_found_string() with FoundString::new(...) and
chain the appropriate builder methods (e.g., .original_text(None),
.encoding(Encoding::Ascii), .offset(0x1000), .rva(Some(0x2000)),
.section(Some(".rodata".to_string())), .length(11), .tags(vec![Tag::Url]),
.score(100), .section_weight(None), .semantic_boost(None), .noise_penalty(None),
.source(StringSource::SectionData), .confidence(0.85)) to produce the same test
values so the test exercises the constructor/builder API (adjust method names to
match the actual builder methods on FoundString).
- Around line 362-367: The with_confidence setter currently accepts any f32 but
the confidence field is documented as 0.0–1.0 and methods
is_high_confidence/is_low_confidence assume that range; update with_confidence
to enforce or validate the range — e.g. add a debug_assert!(0.0..=1.0 contains
&confidence) to catch invalid values in debug builds or clamp the value into
0.0..=1.0 before assigning — modify the pub fn with_confidence(mut self,
confidence: f32) -> Self to perform this validation/clamping and then set
self.confidence so other methods relying on the range remain correct.

Comment on lines +25 to +30
- [x] FoundString struct includes original_text field
- [x] FoundString struct includes optional breakdown fields (section_weight, semantic_boost, noise_penalty)
- [x] All fields properly serialize/deserialize with serde
- [x] Existing tests updated and passing
- [x] Documentation updated with field descriptions and usage examples
- [x] No breaking changes to existing code that creates FoundString instances
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clarify the "no breaking changes" acceptance item.

#[non_exhaustive] prevents downstream struct literals, so this is a breaking change for external callers. Please reword to reflect the required migration to FoundString::new and builder methods.

Proposed wording
-- [x] No breaking changes to existing code that creates FoundString instances
+- [x] Downstream code uses FoundString::new and builder methods; struct literal construction is no longer supported due to #[non_exhaustive]
🤖 Prompt for AI Agents
In
`@project_plan/tickets/Extend_FoundString_data_model_for_demangling_and_debug_support.md`
around lines 25 - 30, The acceptance checklist item claiming "No breaking
changes to existing code that creates FoundString instances" is incorrect
because adding #[non_exhaustive] makes struct literal construction a breaking
change for external users; update the wording to acknowledge this and instruct
downstream users to migrate to the provided constructor and builders by naming
the symbols FoundString::new and the builder methods (e.g., FoundString::builder
or similar) so callers switch from struct literals to the new API; also add a
short migration note explaining that existing direct field initialization must
be replaced with FoundString::new(...) or the builder methods to set optional
fields.

Comment on lines 31 to 46
//! let found_string = FoundString {
//! text: "C:\\Windows\\System32\\cmd.exe".to_string(),
//! original_text: None,
//! encoding: Encoding::Ascii,
//! offset: 0,
//! rva: None,
//! section: None,
//! length: 27,
//! tags: Vec::new(),
//! score: 0,
//! section_weight: None,
//! semantic_boost: None,
//! noise_penalty: None,
//! source: StringSource::SectionData,
//! confidence: 1.0,
//! };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Update doc example to use FoundString::new() builder pattern.

The PR introduces the constructor and builder methods as the preferred API. Documentation should demonstrate this pattern.

♻️ Suggested refactor
-//! let found_string = FoundString {
-//!     text: "C:\\Windows\\System32\\cmd.exe".to_string(),
-//!     original_text: None,
-//!     encoding: Encoding::Ascii,
-//!     offset: 0,
-//!     rva: None,
-//!     section: None,
-//!     length: 27,
-//!     tags: Vec::new(),
-//!     score: 0,
-//!     section_weight: None,
-//!     semantic_boost: None,
-//!     noise_penalty: None,
-//!     source: StringSource::SectionData,
-//!     confidence: 1.0,
-//! };
+//! let found_string = FoundString::new(
+//!     "C:\\Windows\\System32\\cmd.exe".to_string(),
+//!     Encoding::Ascii,
+//!     0,
+//!     27,
+//!     StringSource::SectionData,
+//! );
🤖 Prompt for AI Agents
In `@src/classification/mod.rs` around lines 31 - 46, The doc example currently
constructs FoundString via struct literal; update it to use the new constructor
and builder pattern by calling FoundString::new() and chaining the builder
methods (e.g., .text(...), .encoding(...), .offset(...), .length(...),
.source(...), .confidence(...), etc.) to set the same fields as the example so
the docs show the preferred API (refer to FoundString::new and the associated
builder method names in the implementation).

Comment on lines 21 to 36
//! let found_string = FoundString {
//! text: "https://example.com/api".to_string(),
//! original_text: None,
//! encoding: Encoding::Ascii,
//! offset: 0,
//! rva: None,
//! section: None,
//! length: 24,
//! tags: Vec::new(),
//! score: 0,
//! section_weight: None,
//! semantic_boost: None,
//! noise_penalty: None,
//! source: StringSource::SectionData,
//! confidence: 1.0,
//! };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Doc example uses struct literal instead of the new constructor/builder pattern.

Since FoundString is now #[non_exhaustive], external crate consumers cannot construct it via struct literals - they must use FoundString::new() and the builder methods. Doc examples should demonstrate the public API users will actually use.

Suggested fix
 //! let classifier = SemanticClassifier::new();
-//! let found_string = FoundString {
-//!     text: "https://example.com/api".to_string(),
-//!     original_text: None,
-//!     encoding: Encoding::Ascii,
-//!     offset: 0,
-//!     rva: None,
-//!     section: None,
-//!     length: 24,
-//!     tags: Vec::new(),
-//!     score: 0,
-//!     section_weight: None,
-//!     semantic_boost: None,
-//!     noise_penalty: None,
-//!     source: StringSource::SectionData,
-//!     confidence: 1.0,
-//! };
+//! let found_string = FoundString::new(
+//!     "https://example.com/api".to_string(),
+//!     Encoding::Ascii,
+//!     0,
+//!     24,
+//!     StringSource::SectionData,
+//! );
🤖 Prompt for AI Agents
In `@src/classification/semantic.rs` around lines 21 - 36, The doc example is
constructing FoundString via a struct literal but FoundString is now
#[non_exhaustive] and must be created with the public constructor/builder API;
update the example to call FoundString::new(...) and chain the available builder
methods (e.g., .original_text(...), .encoding(...), .offset(...), .length(...),
.tags(...), .source(...), .confidence(...), etc.) to set the same fields shown
in the literal so the example uses the public FoundString::new and builder
methods instead of a struct literal.

Comment on lines 353 to 368
/// let found_string = FoundString {
/// text: "https://example.com".to_string(),
/// original_text: None,
/// encoding: Encoding::Ascii,
/// offset: 0,
/// rva: None,
/// section: None,
/// length: 19,
/// tags: Vec::new(),
/// score: 0,
/// section_weight: None,
/// semantic_boost: None,
/// noise_penalty: None,
/// source: StringSource::SectionData,
/// confidence: 1.0,
/// };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Same issue: doc example should use constructor/builder pattern.

Apply the same fix as the module-level doc example for consistency and to accurately reflect the external API.

Suggested fix
 //! let classifier = SemanticClassifier::new();
-//! let found_string = FoundString {
-//!     text: "https://example.com".to_string(),
-//!     original_text: None,
-//!     encoding: Encoding::Ascii,
-//!     offset: 0,
-//!     rva: None,
-//!     section: None,
-//!     length: 19,
-//!     tags: Vec::new(),
-//!     score: 0,
-//!     section_weight: None,
-//!     semantic_boost: None,
-//!     noise_penalty: None,
-//!     source: StringSource::SectionData,
-//!     confidence: 1.0,
-//! };
+//! let found_string = FoundString::new(
+//!     "https://example.com".to_string(),
+//!     Encoding::Ascii,
+//!     0,
+//!     19,
+//!     StringSource::SectionData,
+//! );
🤖 Prompt for AI Agents
In `@src/classification/semantic.rs` around lines 353 - 368, The doc example
constructs a FoundString via direct struct literal; update it to use the public
constructor/builder API (the same pattern used in the module-level example) so
the example matches the external API surface. Locate the example that
instantiates FoundString and replace the literal initialization with calls to
FoundString::new(...) or the FoundStringBuilder (whichever the module-level docs
use), setting text, encoding, offset, length, source, confidence and any
optional fields via the builder methods to mirror the existing public
constructor/builder usage.

Comment on lines 231 to 246
strings.push(FoundString {
text,
original_text: None,
encoding: Encoding::Ascii,
offset: start as u64,
rva: None,
section: None,
length: len as u32,
tags: Vec::new(),
score: 0,
section_weight: None,
semantic_boost: None,
noise_penalty: None,
source: StringSource::SectionData,
confidence: 1.0,
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Prefer FoundString::new to avoid repeating defaults.

These three literals duplicate the default field values and will need updates every time FoundString evolves. Using the constructor keeps defaults in one place. Apply the same pattern at all three creation sites.

Proposed refactor
-                    strings.push(FoundString {
-                        text,
-                        original_text: None,
-                        encoding: Encoding::Ascii,
-                        offset: start as u64,
-                        rva: None,
-                        section: None,
-                        length: len as u32,
-                        tags: Vec::new(),
-                        score: 0,
-                        section_weight: None,
-                        semantic_boost: None,
-                        noise_penalty: None,
-                        source: StringSource::SectionData,
-                        confidence: 1.0,
-                    });
+                    strings.push(FoundString::new(
+                        text,
+                        Encoding::Ascii,
+                        start as u64,
+                        len as u32,
+                        StringSource::SectionData,
+                    ));

Also applies to: 265-280, 285-300

🤖 Prompt for AI Agents
In `@src/extraction/ascii.rs` around lines 231 - 246, Replace the repeated
FoundString struct literals in src/extraction/ascii.rs with calls to
FoundString::new so you don't duplicate default values; for each creation site
(the three places constructing FoundString with many None/Vec::new/0/1.0
literals) call FoundString::new and pass only the varying fields (e.g., text,
offset, length, Encoding::Ascii and StringSource::SectionData or whatever
differs) and let the constructor provide defaults for fields like original_text,
rva, section, tags, score, section_weight, semantic_boost, noise_penalty, and
confidence; update all three creation sites (the three struct literal blocks) to
use FoundString::new consistently.

Comment on lines 484 to 499
all_strings.push(FoundString {
text: import.name.clone(),
original_text: None,
encoding: Encoding::Utf8,
offset: 0,
rva: None,
section: None,
length,
tags: Vec::new(),
score: 0,
section_weight: None,
semantic_boost: None,
noise_penalty: None,
source: StringSource::ImportName,
confidence: 1.0,
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Use FoundString::new() for import name construction.

Struct literal used instead of the constructor introduced in this PR. This applies to all FoundString constructions in this file.

♻️ Suggested refactor
-                all_strings.push(FoundString {
-                    text: import.name.clone(),
-                    original_text: None,
-                    encoding: Encoding::Utf8,
-                    offset: 0,
-                    rva: None,
-                    section: None,
-                    length,
-                    tags: Vec::new(),
-                    score: 0,
-                    section_weight: None,
-                    semantic_boost: None,
-                    noise_penalty: None,
-                    source: StringSource::ImportName,
-                    confidence: 1.0,
-                });
+                all_strings.push(FoundString::new(
+                    import.name.clone(),
+                    Encoding::Utf8,
+                    0,
+                    length,
+                    StringSource::ImportName,
+                ));
🤖 Prompt for AI Agents
In `@src/extraction/mod.rs` around lines 484 - 499, Replace the inline struct
literal used when pushing import names into all_strings with the new constructor
FoundString::new(...): locate the push that currently builds FoundString { text:
import.name.clone(), original_text: None, encoding: Encoding::Utf8, offset: 0,
rva: None, section: None, length, tags: Vec::new(), score: 0, section_weight:
None, semantic_boost: None, noise_penalty: None, source:
StringSource::ImportName, confidence: 1.0 } and call FoundString::new passing
the equivalent values (text = import.name.clone(), original_text = None,
encoding = Encoding::Utf8, offset = 0, rva = None, section = None, length =
length, tags = Vec::new(), score = 0, section_weight = None, semantic_boost =
None, noise_penalty = None, source = StringSource::ImportName, confidence = 1.0)
so all FoundString constructions in this file use the new constructor.

Comment on lines 765 to 780
let found_string = FoundString {
text,
original_text: None,
encoding,
offset: absolute_offset,
rva,
section: Some(section.name.clone()),
length: length as u32,
tags: Vec::new(),
score: 0,
section_weight: None,
semantic_boost: None,
noise_penalty: None,
source: StringSource::SectionData,
confidence,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consistent: UTF-8 extraction also needs constructor refactor.

Same struct literal pattern in the UTF-8 extraction path.

🤖 Prompt for AI Agents
In `@src/extraction/mod.rs` around lines 765 - 780, The UTF-8 extraction path is
still creating FoundString with a struct literal; change it to use the
refactored FoundString constructor used in the other extraction path (e.g.,
FoundString::new or the project's equivalent) instead of the literal. Construct
the FoundString with the same values: text, original_text = None, encoding,
offset = absolute_offset, rva, section = section.name.clone(), length = length
as u32, tags = Vec::new(), score = 0, section_weight = None, semantic_boost =
None, noise_penalty = None, source = StringSource::SectionData, and confidence;
ensure the call matches the constructor signature used elsewhere so both UTF-8
and other extraction paths share the same creation code for consistency.

Comment on lines +257 to +280
/// Section weight contribution to the final score (debug only)
///
/// When debug mode is enabled, this field contains the weight assigned based on
/// the section where the string was found. Higher weights indicate sections more
/// likely to contain meaningful strings (e.g., .rodata vs .text). This is `None`
/// unless explicitly populated by the ranking system in debug mode.
#[serde(skip_serializing_if = "Option::is_none", default)]
pub section_weight: Option<i32>,
/// Semantic classification boost to the final score (debug only)
///
/// When debug mode is enabled, this field contains the score boost applied based on
/// semantic tags (URLs, file paths, GUIDs, etc.). Strings with valuable semantic
/// meaning receive positive boosts. This is `None` unless explicitly populated by
/// the ranking system in debug mode.
#[serde(skip_serializing_if = "Option::is_none", default)]
pub semantic_boost: Option<i32>,
/// Noise penalty applied to the final score (debug only)
///
/// When debug mode is enabled, this field contains the penalty applied for noise
/// characteristics (low confidence, repetitive patterns, etc.). Higher penalties
/// indicate strings more likely to be noise. This is `None` unless explicitly
/// populated by the ranking system in debug mode.
#[serde(skip_serializing_if = "Option::is_none", default)]
pub noise_penalty: Option<i32>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Field naming could be more precise to avoid confusion with SectionInfo.weight.

section_weight as Option<i32> stores the score contribution derived from the section's weight, while SectionInfo.weight is f32. The doc comments clarify this, but consider names like section_weight_contribution or section_score_component to make the distinction explicit. Low priority - current naming works with the documentation.

🤖 Prompt for AI Agents
In `@src/types.rs` around lines 257 - 280, Rename the field `section_weight` in
the struct (in src/types.rs) to a clearer name like
`section_weight_contribution` or `section_score_component` to avoid confusion
with `SectionInfo.weight` (which is an f32); update all references/usages of
`section_weight` across the codebase (serialization/deserialization, docs,
tests) and keep the same serde attributes and Option<i32> type and doc comment
content (adjust field name inside the docs to match), leaving `semantic_boost`
and `noise_penalty` unchanged.

Comment on lines +362 to +367
/// Sets the confidence
#[must_use]
pub fn with_confidence(mut self, confidence: f32) -> Self {
self.confidence = confidence;
self
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

with_confidence does not validate the documented 0.0-1.0 range.

The confidence field is documented as "(0.0-1.0)" and is_high_confidence/is_low_confidence methods assume this range. Consider adding a debug assertion or clamping to prevent invalid values.

Suggested fix with debug assertion
 /// Sets the confidence
 #[must_use]
 pub fn with_confidence(mut self, confidence: f32) -> Self {
+    debug_assert!(
+        (0.0..=1.0).contains(&confidence),
+        "confidence must be in range 0.0-1.0, got {confidence}"
+    );
     self.confidence = confidence;
     self
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// Sets the confidence
#[must_use]
pub fn with_confidence(mut self, confidence: f32) -> Self {
self.confidence = confidence;
self
}
/// Sets the confidence
#[must_use]
pub fn with_confidence(mut self, confidence: f32) -> Self {
debug_assert!(
(0.0..=1.0).contains(&confidence),
"confidence must be in range 0.0-1.0, got {confidence}"
);
self.confidence = confidence;
self
}
🤖 Prompt for AI Agents
In `@src/types.rs` around lines 362 - 367, The with_confidence setter currently
accepts any f32 but the confidence field is documented as 0.0–1.0 and methods
is_high_confidence/is_low_confidence assume that range; update with_confidence
to enforce or validate the range — e.g. add a debug_assert!(0.0..=1.0 contains
&confidence) to catch invalid values in debug builds or clamp the value into
0.0..=1.0 before assigning — modify the pub fn with_confidence(mut self,
confidence: f32) -> Self to perform this validation/clamping and then set
self.confidence so other methods relying on the range remain correct.

Comment on lines +455 to +473
/// Creates a test FoundString with all optional fields set to None
fn create_test_found_string() -> FoundString {
FoundString {
text: "test_string".to_string(),
original_text: None,
encoding: Encoding::Ascii,
offset: 0x1000,
rva: Some(0x2000),
section: Some(".rodata".to_string()),
length: 11,
tags: vec![Tag::Url],
score: 100,
section_weight: None,
semantic_boost: None,
noise_penalty: None,
source: StringSource::SectionData,
confidence: 0.85,
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Test helper uses struct literal instead of constructor.

For consistency with the PR's goal of promoting the constructor/builder pattern, consider using FoundString::new() with builder methods. This would also verify the API works correctly.

Suggested fix
 fn create_test_found_string() -> FoundString {
-    FoundString {
-        text: "test_string".to_string(),
-        original_text: None,
-        encoding: Encoding::Ascii,
-        offset: 0x1000,
-        rva: Some(0x2000),
-        section: Some(".rodata".to_string()),
-        length: 11,
-        tags: vec![Tag::Url],
-        score: 100,
-        section_weight: None,
-        semantic_boost: None,
-        noise_penalty: None,
-        source: StringSource::SectionData,
-        confidence: 0.85,
-    }
+    FoundString::new(
+        "test_string".to_string(),
+        Encoding::Ascii,
+        0x1000,
+        11,
+        StringSource::SectionData,
+    )
+    .with_rva(0x2000)
+    .with_section(".rodata".to_string())
+    .with_tags(vec![Tag::Url])
+    .with_score(100)
+    .with_confidence(0.85)
 }
🤖 Prompt for AI Agents
In `@src/types.rs` around lines 455 - 473, The test helper
create_test_found_string should construct the FoundString via its public
constructor/builder instead of a struct literal; replace the literal in
create_test_found_string() with FoundString::new(...) and chain the appropriate
builder methods (e.g., .original_text(None), .encoding(Encoding::Ascii),
.offset(0x1000), .rva(Some(0x2000)), .section(Some(".rodata".to_string())),
.length(11), .tags(vec![Tag::Url]), .score(100), .section_weight(None),
.semantic_boost(None), .noise_penalty(None), .source(StringSource::SectionData),
.confidence(0.85)) to produce the same test values so the test exercises the
constructor/builder API (adjust method names to match the actual builder methods
on FoundString).

@coderabbitai coderabbitai bot added the documentation Improvements or additions to documentation label Jan 18, 2026
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.github/workflows/ci.yml:
- Around line 185-206: The CI gate script currently checks results for quality,
msrv, test, test-cross-platform, and coverage but omits the changes job,
allowing failures there to be masked; update the Check job results step to also
evaluate needs.changes.result and fail the job if it equals "failure" or
"cancelled" (mirror the existing pattern used for quality and msrv), and emit a
clear error message like "::error::Changes job failed" when that condition is
met so the overall gate correctly fails on changes job errors.
- Around line 190-215: The workflow uses invalid dot-notation for a hyphenated
job id (needs.test-cross-platform.result); update all accesses of that job to
bracket notation (needs['test-cross-platform'].result), including the initial
echo and the subsequent if condition that checks its value; also scan for any
other hyphenated job IDs (e.g., any occurrence of needs.<hyphenated-id>.result)
and convert them to needs['<hyphenated-id>'].result to avoid runtime evaluation
errors.

Comment on lines +185 to +206
needs: [quality, msrv, test, test-cross-platform, coverage]
if: always()
steps:
- name: Check job results
run: |
echo "Quality: ${{ needs.quality.result }}"
echo "MSRV: ${{ needs.msrv.result }}"
echo "Test: ${{ needs.test.result }}"
echo "Test Cross-Platform: ${{ needs.test-cross-platform.result }}"
echo "Coverage: ${{ needs.coverage.result }}"

# Fail if any required job failed
# Jobs that were skipped (due to path filters) are OK
if [[ "${{ needs.quality.result }}" == "failure" || "${{ needs.quality.result }}" == "cancelled" ]]; then
echo "::error::Quality checks failed"
exit 1
fi

if [[ "${{ needs.msrv.result }}" == "failure" || "${{ needs.msrv.result }}" == "cancelled" ]]; then
echo "::error::MSRV check failed"
exit 1
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Include changes in the gate to avoid masked failures.
If the changes job fails, dependent jobs are skipped and the gate still passes because it never checks changes. This can hide CI breakage.

✅ Suggested fix
-    needs: [quality, msrv, test, test-cross-platform, coverage]
+    needs: [changes, quality, msrv, test, test-cross-platform, coverage]
...
-          echo "Quality: ${{ needs.quality.result }}"
+          echo "Changes: ${{ needs.changes.result }}"
+          echo "Quality: ${{ needs.quality.result }}"
...
+          if [[ "${{ needs.changes.result }}" == "failure" || "${{ needs.changes.result }}" == "cancelled" ]]; then
+            echo "::error::Changes filter failed"
+            exit 1
+          fi
🤖 Prompt for AI Agents
In @.github/workflows/ci.yml around lines 185 - 206, The CI gate script
currently checks results for quality, msrv, test, test-cross-platform, and
coverage but omits the changes job, allowing failures there to be masked; update
the Check job results step to also evaluate needs.changes.result and fail the
job if it equals "failure" or "cancelled" (mirror the existing pattern used for
quality and msrv), and emit a clear error message like "::error::Changes job
failed" when that condition is met so the overall gate correctly fails on
changes job errors.

Comment on lines +190 to +215
echo "Quality: ${{ needs.quality.result }}"
echo "MSRV: ${{ needs.msrv.result }}"
echo "Test: ${{ needs.test.result }}"
echo "Test Cross-Platform: ${{ needs.test-cross-platform.result }}"
echo "Coverage: ${{ needs.coverage.result }}"

# Fail if any required job failed
# Jobs that were skipped (due to path filters) are OK
if [[ "${{ needs.quality.result }}" == "failure" || "${{ needs.quality.result }}" == "cancelled" ]]; then
echo "::error::Quality checks failed"
exit 1
fi

if [[ "${{ needs.msrv.result }}" == "failure" || "${{ needs.msrv.result }}" == "cancelled" ]]; then
echo "::error::MSRV check failed"
exit 1
fi

if [[ "${{ needs.test.result }}" == "failure" || "${{ needs.test.result }}" == "cancelled" ]]; then
echo "::error::Tests failed"
exit 1
fi

if [[ "${{ needs.test-cross-platform.result }}" == "failure" || "${{ needs.test-cross-platform.result }}" == "cancelled" ]]; then
echo "::error::Cross-platform tests failed"
exit 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

GitHub Actions needs context job id hyphen dot notation bracket notation

💡 Result:

Short answer: use bracket notation when the identifier contains characters that can't be used in dot-notation (or when it begins with a number). Job IDs themselves must start with a letter or underscore and may contain letters, digits, hyphens and underscores; but when you reference a context/property whose name contains a hyphen/dot/other special char, use bracket/string indexing.

Key facts and examples:

  • Job-id rules (what you may name a job): must start with a letter or _ and contain only alphanumeric characters, -, or _. [1]
  • When an expression needs to access a property whose key contains a character not valid for dot notation, use bracket notation: example usages:
    • needs['my-job-with-hyphen'].outputs.some_output
    • job.services['my-service'].id
    • github['some-context-with-hyphen'] (general form)
      These follow GitHub Actions expression/object-indexing behavior described in the contexts/docs. [1][2]

Example:

  • jobs:
    build-and-test:
    runs-on: ubuntu-latest
    needs: []
  • Another job referencing output:
    deploy:
    needs: ['build-and-test']
    if: needs['build-and-test'].outputs.artifact_ready == 'true'

Sources:

  • Workflow syntax / jobs and job_id rules. [1]
  • Contexts and expression syntax (job/steps/services access and example of using object keys). [2]

🏁 Script executed:

head -n 220 .github/workflows/ci.yml | tail -n 35

Repository: EvilBit-Labs/Stringy

Length of output: 1569


Fix context access syntax for hyphenated job IDs.

needs.test-cross-platform.result uses invalid dot notation for a job ID containing a hyphen. GitHub Actions expressions require bracket notation to access properties with hyphens in their names. The workflow will fail at runtime when evaluating these expressions.

Suggested fix
-          echo "Test Cross-Platform: ${{ needs.test-cross-platform.result }}"
+          echo "Test Cross-Platform: ${{ needs['test-cross-platform'].result }}"
           
           # Fail if any required job failed
           # Jobs that were skipped (due to path filters) are OK
-          if [[ "${{ needs.test-cross-platform.result }}" == "failure" || "${{ needs.test-cross-platform.result }}" == "cancelled" ]]; then
+          if [[ "${{ needs['test-cross-platform'].result }}" == "failure" || "${{ needs['test-cross-platform'].result }}" == "cancelled" ]]; then
             echo "::error::Cross-platform tests failed"
             exit 1
           fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
echo "Quality: ${{ needs.quality.result }}"
echo "MSRV: ${{ needs.msrv.result }}"
echo "Test: ${{ needs.test.result }}"
echo "Test Cross-Platform: ${{ needs.test-cross-platform.result }}"
echo "Coverage: ${{ needs.coverage.result }}"
# Fail if any required job failed
# Jobs that were skipped (due to path filters) are OK
if [[ "${{ needs.quality.result }}" == "failure" || "${{ needs.quality.result }}" == "cancelled" ]]; then
echo "::error::Quality checks failed"
exit 1
fi
if [[ "${{ needs.msrv.result }}" == "failure" || "${{ needs.msrv.result }}" == "cancelled" ]]; then
echo "::error::MSRV check failed"
exit 1
fi
if [[ "${{ needs.test.result }}" == "failure" || "${{ needs.test.result }}" == "cancelled" ]]; then
echo "::error::Tests failed"
exit 1
fi
if [[ "${{ needs.test-cross-platform.result }}" == "failure" || "${{ needs.test-cross-platform.result }}" == "cancelled" ]]; then
echo "::error::Cross-platform tests failed"
exit 1
echo "Quality: ${{ needs.quality.result }}"
echo "MSRV: ${{ needs.msrv.result }}"
echo "Test: ${{ needs.test.result }}"
echo "Test Cross-Platform: ${{ needs['test-cross-platform'].result }}"
echo "Coverage: ${{ needs.coverage.result }}"
# Fail if any required job failed
# Jobs that were skipped (due to path filters) are OK
if [[ "${{ needs.quality.result }}" == "failure" || "${{ needs.quality.result }}" == "cancelled" ]]; then
echo "::error::Quality checks failed"
exit 1
fi
if [[ "${{ needs.msrv.result }}" == "failure" || "${{ needs.msrv.result }}" == "cancelled" ]]; then
echo "::error::MSRV check failed"
exit 1
fi
if [[ "${{ needs.test.result }}" == "failure" || "${{ needs.test.result }}" == "cancelled" ]]; then
echo "::error::Tests failed"
exit 1
fi
if [[ "${{ needs['test-cross-platform'].result }}" == "failure" || "${{ needs['test-cross-platform'].result }}" == "cancelled" ]]; then
echo "::error::Cross-platform tests failed"
exit 1
🤖 Prompt for AI Agents
In @.github/workflows/ci.yml around lines 190 - 215, The workflow uses invalid
dot-notation for a hyphenated job id (needs.test-cross-platform.result); update
all accesses of that job to bracket notation
(needs['test-cross-platform'].result), including the initial echo and the
subsequent if condition that checks its value; also scan for any other
hyphenated job IDs (e.g., any occurrence of needs.<hyphenated-id>.result) and
convert them to needs['<hyphenated-id>'].result to avoid runtime evaluation
errors.

@codecov
Copy link

codecov bot commented Jan 18, 2026

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment

Thanks for integrating Codecov - We've got you covered ☂️

@unclesp1d3r unclesp1d3r merged commit 0f16b92 into main Jan 18, 2026
20 of 21 checks passed
@unclesp1d3r unclesp1d3r deleted the enhance_foundstring_model branch January 18, 2026 04:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments