Skip to content

Conversation

@vinitkumar
Copy link
Owner

@vinitkumar vinitkumar commented Jan 28, 2026

Summary

Fixes linker error when building fuzz targets with Python 3.14, where PyUnicode_DATA and PyUnicode_KIND symbols are now inline macros instead of exported functions.

Changes

  • Make pyo3 dependency optional behind python feature flag (enabled by default)
  • Add rlib crate-type to support both cdylib (Python extension) and rlib (fuzzing)
  • Gate all Python-specific code with #[cfg(feature = "python")]
  • Make pure Rust utility functions public for fuzzing: escape_xml, wrap_cdata, is_valid_xml_name, make_valid_xml_name, make_attr_string
  • Add comprehensive unit tests for all XML utility functions
  • Add fuzz targets for testing core functions

Testing

# Verify Python extension still builds
cargo check

# Verify builds without Python (for fuzzing)
cargo check --no-default-features

# Run fuzz targets
cargo +nightly fuzz run fuzz_escape_xml -- -runs=100

Summary by Sourcery

Make the Rust json2xml library usable both as a Python extension and as a pure Rust crate by gating Python bindings behind a feature flag and exposing core XML utilities for broader use and testing.

New Features:

  • Expose core XML utility functions as public Rust APIs for use outside the Python extension.
  • Add a dedicated fuzzing crate with multiple fuzz targets for XML utility functions.

Bug Fixes:

  • Resolve Python 3.14 fuzz build/linking issues by making the PyO3 dependency optional and separating Python-specific code behind a feature flag.

Enhancements:

  • Gate all Python-facing code and PyO3 integration behind an optional python feature to support non-Python builds.

Build:

  • Add rlib to the crate types to support both Python cdylib builds and pure Rust consumers.
  • Introduce Cargo features with python enabled by default and make pyo3 an optional dependency.

Tests:

  • Add comprehensive unit test coverage for XML escaping, CDATA wrapping, XML name validation, XML name normalization, and attribute string generation.
  • Add fuzzing targets and corpora for exercising XML escaping, CDATA wrapping, XML name validation and normalization, and attribute string generation.

- Make pyo3 dependency optional behind 'python' feature flag
- Add rlib crate-type to support both cdylib (Python) and rlib (fuzzing)
- Gate all Python-specific code with #[cfg(feature = "python")]
- Make pure Rust functions (escape_xml, wrap_cdata, etc.) public for fuzzing
- Add comprehensive unit tests for XML utility functions

Fixes linker error with Python 3.14 where PyUnicode_DATA and
PyUnicode_KIND symbols are now inline macros, not exported functions.
Fuzz targets can now build without linking against Python.

Amp-Thread-ID: https://ampcode.com/threads/T-019c0425-ac62-76d8-9d59-4d6aba3edf45
Co-authored-by: Amp <amp@ampcode.com>
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Jan 28, 2026

Reviewer's Guide

Makes the Rust json2xml_rs crate usable both as a Python extension and as a pure Rust library for fuzzing by gating all PyO3 integration behind a python feature, exposing core XML helpers as public APIs, and adding tests and fuzz targets for these helpers.

Updated class diagram for json2xml_rs core XML utilities and Python-gated API

classDiagram
    class Json2xml_rs {
        +escape_xml(s: &str) String
        +wrap_cdata(s: &str) String
        +is_valid_xml_name(key: &str) bool
        +make_valid_xml_name(key: &str) (String, Option<(String, String)>)
        +make_attr_string(attrs: &[(String, String)]) String
    }

    class ConvertConfig {
        <<feature_python_only>>
        -attr_type: bool
        -cdata: bool
        -item_wrap: bool
        -list_headers: bool
    }

    class PythonApi {
        <<feature_python_only>>
        +dicttoxml(obj: &PyAny, root: bool, custom_root: &str, attr_type: bool, item_wrap: bool, cdata: bool, list_headers: bool) PyResult<PyBytes>
        +escape_xml_py(s: &str) String
        +wrap_cdata_py(s: &str) String
        +json2xml_rs(m: &PyModule) PyResult<()>
    }

    class Converters {
        <<feature_python_only>>
        +convert_value(py: Python, obj: &PyAny, key: &str, config: &ConvertConfig, indent: usize, level: usize) PyResult<String>
        +convert_string(key: &str, val: &str, config: &ConvertConfig) PyResult<String>
        +convert_number(key: &str, val: &str, config: &ConvertConfig) PyResult<String>
        +convert_bool(key: &str, val: bool, config: &ConvertConfig) PyResult<String>
        +convert_none(key: &str, config: &ConvertConfig) PyResult<String>
        +convert_dict(py: Python, dict: &PyDict, key: &str, config: &ConvertConfig, indent: usize, level: usize) PyResult<String>
        +convert_list(py: Python, list: &PyList, key: &str, config: &ConvertConfig, indent: usize, level: usize) PyResult<String>
    }

    Json2xml_rs <.. PythonApi : depends_on
    Json2xml_rs <.. Converters : uses
    Converters --> ConvertConfig
    PythonApi --> Converters
    PythonApi --> Json2xml_rs
Loading

File-Level Changes

Change Details Files
Gate all Python/PyO3 integration behind an optional python feature while keeping default behavior unchanged.
  • Wrap PyO3 imports, ConvertConfig, helper functions, and the json2xml_rs Python module definitions with #[cfg(feature = "python")] so they are only compiled when the feature is enabled.
  • Introduce a [features] section in Cargo.toml with default = ["python"] and python = ["pyo3/extension-module", "dep:pyo3"] so PyO3 is optional but enabled by default.
  • Change the pyo3 dependency to optional = true and move the extension-module feature into the python feature to avoid linking PyO3 when building without Python.
rust/src/lib.rs
rust/Cargo.toml
Allow the crate to be built both as a Python cdylib and as an rlib consumable by Rust fuzz targets.
  • Update the [lib] section to set crate-type = ["cdylib", "rlib"] so the same crate can produce a Python extension and a standard Rust library.
  • Ensure fuzz crate depends on json2xml_rs with default-features = false to get the rlib-only build without Python.
rust/Cargo.toml
rust/fuzz/Cargo.toml
Expose core XML helper functions as public Rust APIs for reuse in tests and fuzz targets.
  • Change escape_xml, wrap_cdata, is_valid_xml_name, make_valid_xml_name, and make_attr_string from private fn to pub fn so they can be imported from the fuzz targets and unit tests.
  • Keep these helpers free of PyO3-specific types so they compile even when the python feature is disabled.
rust/src/lib.rs
Add comprehensive unit tests for XML helper functions to lock in expected behavior.
  • Introduce a #[cfg(test)] mod tests in lib.rs with submodules for each helper (escape_xml, wrap_cdata, is_valid_xml_name, make_valid_xml_name, make_attr_string).
  • Cover normal, boundary, and edge cases such as empty strings, special characters, invalid XML names, numeric keys, name attributes, and attribute escaping.
rust/src/lib.rs
Introduce a dedicated fuzzing crate and fuzz targets for the core XML helpers.
  • Add rust/fuzz/Cargo.toml configured for cargo-fuzz with dependencies on libfuzzer-sys, arbitrary, and json2xml_rs (with default-features = false).
  • Create fuzz targets for the helper functions (fuzz_escape_xml, fuzz_wrap_cdata, fuzz_is_valid_xml_name, fuzz_make_valid_xml_name, fuzz_make_attr_string).
  • In fuzz_make_attr_string, define an AttrInput struct with Arbitrary derive and assert basic invariants on the generated attribute string to catch panics or structural regressions.
  • Add initial corpus entries under rust/fuzz/corpus/fuzz_escape_xml/ to seed fuzzing.
rust/fuzz/Cargo.toml
rust/fuzz/fuzz_targets/fuzz_escape_xml.rs
rust/fuzz/fuzz_targets/fuzz_wrap_cdata.rs
rust/fuzz/fuzz_targets/fuzz_is_valid_xml_name.rs
rust/fuzz/fuzz_targets/fuzz_make_valid_xml_name.rs
rust/fuzz/fuzz_targets/fuzz_make_attr_string.rs
rust/fuzz/corpus/fuzz_escape_xml/*

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • Exposing the XML helper functions as pub just for fuzzing broadens the public crate API; consider keeping them non-public and adding a dedicated fuzz feature or a pub(crate) + fuzz helper wrapper instead.
  • There are many repeated #[cfg(feature = "python")] annotations on Python-related items; consider gating a larger module or mod python; submodule instead to reduce repetition and make the separation between core Rust logic and Python bindings clearer.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Exposing the XML helper functions as `pub` just for fuzzing broadens the public crate API; consider keeping them non-public and adding a dedicated `fuzz` feature or a `pub(crate)` + fuzz helper wrapper instead.
- There are many repeated `#[cfg(feature = "python")]` annotations on Python-related items; consider gating a larger module or `mod python;` submodule instead to reduce repetition and make the separation between core Rust logic and Python bindings clearer.

## Individual Comments

### Comment 1
<location> `rust/fuzz/fuzz_targets/fuzz_make_attr_string.rs:22-25` </location>
<code_context>
+    assert!(result.starts_with(' '), "Attribute string should start with space");
+    
+    // 3. Each attribute should produce key="value" format
+    for (key, _value) in &input.attrs {
+        // Key should appear in the result
+        assert!(result.contains(key), "Key '{}' should appear in result", key);
+    }
+    
</code_context>

<issue_to_address>
**suggestion (testing):** The key-presence invariant is quite weak and may pass even if attribute formatting is malformed.

Since `contains` is substring-based, this can pass even if the `key="value"` structure is broken or keys overlap (e.g. `"a"` vs `"aa"`). To make the fuzz test more robust, consider matching a more specific pattern like `format!(" {}=\"", key)` or parsing/splitting the attribute string and explicitly validating `key="value"` pairs.

```suggestion
    // 2. Result should start with space (for XML formatting)
    assert!(result.starts_with(' '), "Attribute string should start with space");

    // 3. Each attribute should produce a ` key="value"`-like fragment.
    //    We check for the more specific pattern ` {key}="` to avoid
    //    passing on overlapping keys (e.g. "a" vs "aa") or malformed formatting.
    for (key, _value) in &input.attrs {
        let expected_fragment = format!(" {}=\"", key);
        assert!(
            result.contains(&expected_fragment),
            "Attribute fragment '{}' should appear in result '{}'",
            expected_fragment,
            result
        );
    }
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@codecov
Copy link

codecov bot commented Jan 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.89%. Comparing base (d346999) to head (29131ca).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #270   +/-   ##
=======================================
  Coverage   95.89%   95.89%           
=======================================
  Files           5        5           
  Lines         463      463           
=======================================
  Hits          444      444           
  Misses         19       19           
Flag Coverage Δ
unittests 95.89% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

vinitkumar and others added 2 commits January 28, 2026 16:07
Use more specific pattern matching (` key="`) instead of just checking
if key exists as substring. This avoids false positives with overlapping
keys (e.g. 'a' vs 'aa') or malformed attribute formatting.

Suggested-by: sourcery-ai
Amp-Thread-ID: https://ampcode.com/threads/T-019c0425-ac62-76d8-9d59-4d6aba3edf45
Co-authored-by: Amp <amp@ampcode.com>
@vinitkumar vinitkumar merged commit 94a60ab into master Jan 28, 2026
62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants