Merged
2 changes: 1 addition & 1 deletion .github/workflows/lint.yml
@@ -44,7 +44,7 @@ jobs:
- name: Install dependencies
run: |
uv venv
uv pip install -e .
uv pip install -e ".[dev]"
- name: Run ty
run: |
source .venv/bin/activate
4 changes: 4 additions & 0 deletions .github/workflows/rust-ci.yml
@@ -38,6 +38,10 @@ jobs:
working-directory: rust
run: cargo clippy --all-targets --all-features -- -D warnings

- name: Run Rust unit tests
working-directory: rust
run: cargo test --no-default-features

rust-test:
name: Build & Test (${{ matrix.os }}, Python ${{ matrix.python-version }})
runs-on: ${{ matrix.os }}
64 changes: 34 additions & 30 deletions BENCHMARKS.md
@@ -6,77 +6,81 @@ Comprehensive performance comparison between all json2xml implementations.

- **Machine**: Apple Silicon (M-series, aarch64)
- **OS**: macOS
- **Date**: January 16, 2026
- **Date**: January 28, 2026

### Implementations Tested

| Implementation | Type | Notes |
|----------------|------|-------|
| Python | Library | Pure Python (json2xml) |
| Rust | Library | Native extension via PyO3 (json2xml-rs) |
| Go | CLI | Standalone binary (json2xml-go) |
| Go | CLI | Standalone binary (json2xml-go v1.0.0) |
| Zig | CLI | Standalone binary (json2xml-zig) |

## Test Data

| Size | Description | Bytes |
|------|-------------|-------|
| Small | Simple object `{"name": "John", "age": 30, "city": "New York"}` | 47 |
| Medium | 10 generated records with nested structures | 3,212 |
| Medium | 10 generated records with nested structures | ~3,208 |
| bigexample.json | Real-world patent data | 2,018 |
| Large | 100 generated records with nested structures | 32,226 |
| Very Large | 1,000 generated records with nested structures | 323,126 |
| Large | 100 generated records with nested structures | ~32,205 |
| Very Large | 1,000 generated records with nested structures | ~323,119 |

## Results

### Performance Summary

| Test Case | Python | Rust | Go | Zig |
|-----------|--------|------|-----|-----|
| Small (47B) | 40.12µs | 1.45µs | 4.65ms | 3.74ms |
| Medium (3.2KB) | 2.14ms | 71.28µs | 4.07ms | 3.28ms |
| bigexample (2KB) | 819.46µs | 32.88µs | 4.02ms | 2.96ms |
| Large (32KB) | 21.08ms | 739.89µs | 4.05ms | 6.11ms |
| Very Large (323KB) | 212.61ms | 7.55ms | 4.38ms | 33.24ms |
| Small (47B) | 41.88µs | 1.66µs | 4.52ms | 2.80ms |
| Medium (3.2KB) | 2.19ms | 71.85µs | 4.33ms | 2.18ms |
| bigexample (2KB) | 854.38µs | 30.89µs | 4.28ms | 2.12ms |
| Large (32KB) | 21.57ms | 672.96µs | 4.47ms | 2.48ms |
| Very Large (323KB) | 216.52ms | 6.15ms | 4.44ms | 5.54ms |

### Speedup vs Pure Python

| Test Case | Rust | Go | Zig |
|-----------|------|-----|-----|
| Small (47B) | **27.6x** | 0.0x* | 0.0x* |
| Medium (3.2KB) | **30.0x** | 0.5x* | 0.7x* |
| bigexample (2KB) | **24.9x** | 0.2x* | 0.3x* |
| Large (32KB) | **28.5x** | 5.2x | 3.5x |
| Very Large (323KB) | **28.2x** | **48.5x** | 6.4x |
| Small (47B) | **25.2x** | 0.0x* | 0.0x* |
| Medium (3.2KB) | **30.5x** | 0.5x* | 1.0x* |
| bigexample (2KB) | **27.7x** | 0.2x* | 0.4x* |
| Large (32KB) | **32.1x** | 4.8x | **8.7x** |
| Very Large (323KB) | **35.2x** | **48.8x** | **39.1x** |

*CLI tools have process spawn overhead (~3-4ms) which dominates for small inputs
*CLI tools have process spawn overhead (~2-4ms) which dominates for small inputs
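
Each speedup figure is the Python time divided by the other implementation's time for the same test case. The sketch below reproduces the updated Very Large row from the summary table; the helper names `to_seconds` and `speedup` are illustrative and are not part of the benchmark scripts.

```python
import re

# Multipliers for the duration suffixes used in the summary table.
# Both the micro sign (U+00B5) and the Greek mu (U+03BC) are accepted.
_UNITS = {"\u00b5s": 1e-6, "\u03bcs": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(text: str) -> float:
    """Parse a table entry such as '216.52ms' or '1.66µs' into seconds."""
    match = re.fullmatch(r"([\d.]+)(\u00b5s|\u03bcs|ms|s)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def speedup(python_time: str, other_time: str) -> float:
    """Return how many times faster `other_time` is than `python_time`."""
    return round(to_seconds(python_time) / to_seconds(other_time), 1)


# Very Large (323KB) row, January 28 timings.
very_large = {"Rust": "6.15ms", "Go": "4.44ms", "Zig": "5.54ms"}
for name, duration in very_large.items():
    print(f"{name}: {speedup('216.52ms', duration)}x")
```

A ratio below 1.0 means the implementation was slower than pure Python for that input, which is what the starred entries show.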

## Key Observations

### 1. Rust Extension is the Best Choice for Python Users 🦀

The Rust extension (json2xml-rs) provides:
- **~28x faster** than pure Python consistently across all input sizes
- **~25-35x faster** than pure Python consistently across all input sizes
- **Zero process overhead** - called directly from Python
- **Automatic fallback** - pure Python used if Rust unavailable
- **Easy install**: `pip install json2xml[fast]`
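
The automatic fallback can be pictured as an import guard. This is a minimal sketch, assuming the extension is importable as `json2xml_rs` (the `[lib]` name in `rust/Cargo.toml`); the `HAVE_RUST` flag and `backend_name` helper are hypothetical and may not match how json2xml wires up its fallback.

```python
# Try the optional native extension first; fall back to pure Python.
try:
    import json2xml_rs  # noqa: F401  (assumed module name, see lead-in)
    HAVE_RUST = True
except ImportError:
    HAVE_RUST = False


def backend_name() -> str:
    """Report which implementation a caller would end up using."""
    return "rust" if HAVE_RUST else "python"


print(f"json2xml backend: {backend_name()}")
```

Because the guard catches only `ImportError`, a missing wheel degrades silently to the pure-Python path.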

### 2. Go Excels for Large CLI Workloads 🚀
### 2. Go Excels for Very Large CLI Workloads 🚀

For very large inputs (323KB+):
- **48.5x faster** than Python
- But ~3-4ms startup overhead hurts small file performance
- **48.8x faster** than Python
- But ~4ms startup overhead hurts small file performance
- Best for batch processing or large file conversions

### 3. Zig is Competitive but Has Trade-offs
### 3. Zig is Now Highly Competitive ⚡

- Consistent ~3ms startup overhead
- Good for medium-large files (3-6x faster than Python)
- Less optimized than Go for very large inputs
After recent optimizations:
- **39.1x faster** than Python for very large files
- **8.7x faster** for large files (32KB)
- Faster startup than Go (~2ms vs ~4ms)
- Best balance of startup time and throughput

### 4. Process Spawn Overhead Matters

CLI tools (Go, Zig) have ~3-4ms process spawn overhead:
CLI tools (Go, Zig) have process spawn overhead:
- Go: ~4ms startup overhead
- Zig: ~2ms startup overhead
- Dominates for small inputs (makes them appear slower than Python!)
- Negligible for large inputs where actual work dominates
- Rust extension avoids this entirely by being a native Python module
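
One way to estimate the spawn floor on a given machine is to time a process that does no work. The sketch below uses the running Python interpreter as a stand-in because it is always available; its startup is typically slower than that of a Go or Zig binary, so substitute the CLI under test (with a trivial input) to get comparable numbers. `median_spawn_seconds` is an illustrative helper, not part of the benchmark scripts.

```python
import subprocess
import sys
import time


def median_spawn_seconds(cmd: list[str], runs: int = 5) -> float:
    """Return the median wall-clock time to start `cmd` and wait for it to exit."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


# A no-op child process: everything measured here is startup and teardown.
noop = [sys.executable, "-c", "pass"]
print(f"spawn floor: {median_spawn_seconds(noop) * 1e3:.2f} ms")
```

Subtracting this floor from a CLI timing gives a rough idea of how much of the measurement is actual conversion work.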
@@ -85,9 +89,9 @@ CLI tools (Go, Zig) have ~3-4ms process spawn overhead:

| Use Case | Recommended | Why |
|----------|-------------|-----|
| Python library calls | **Rust** (`pip install json2xml[fast]`) | 28x faster, no overhead |
| Small files via CLI | **Rust** via Python | CLI overhead dominates |
| Large files via CLI | **Go** (json2xml-go) | 48x faster for 300KB+ |
| Python library calls | **Rust** (`pip install json2xml[fast]`) | 25-35x faster, no overhead |
| Small files via CLI | **Zig** (json2xml-zig) | Fastest startup (~2ms) |
| Large files via CLI | **Go** or **Zig** | Both excellent (Go slightly faster) |
| Batch processing | **Go** or **Rust** | Both excellent |
| Pure Python required | **Python** (json2xml) | Always available |

@@ -104,7 +108,7 @@ pip install json2xml[fast]
go install github.com/vinitkumar/json2xml-go@latest

# Zig CLI
# See: github.com/nicholasgriffintn/json2xml-zig
# See: github.com/vinitkumar/json2xml-zig
```

## Running the Benchmarks
@@ -130,4 +134,4 @@ python benchmark_multi_python.py
## Related Projects

- **Go version**: [github.com/vinitkumar/json2xml-go](https://github.com/vinitkumar/json2xml-go)
- **Zig version**: [github.com/nicholasgriffintn/json2xml-zig](https://github.com/nicholasgriffintn/json2xml-zig)
- **Zig version**: [github.com/vinitkumar/json2xml-zig](https://github.com/vinitkumar/json2xml-zig)
9 changes: 4 additions & 5 deletions pyproject.toml
@@ -31,10 +31,6 @@ dependencies = [
"defusedxml",
"urllib3",
"xmltodict>=0.12.0",
"pytest",
"pytest-cov",
"coverage",
"setuptools",
]

[project.urls]
@@ -47,8 +43,11 @@ json2xml-py = "json2xml.cli:main"
include = ["json2xml"]

[project.optional-dependencies]
test = [
dev = [
"pytest>=8.4.1",
"pytest-cov",
"coverage",
"setuptools",
]
fast = ["json2xml-rs>=0.1.0"]

8 changes: 6 additions & 2 deletions rust/Cargo.toml
@@ -7,10 +7,14 @@ license = "Apache-2.0"

[lib]
name = "json2xml_rs"
crate-type = ["cdylib"]
crate-type = ["cdylib", "rlib"]

[features]
default = ["python"]
python = ["pyo3/extension-module", "dep:pyo3"]

[dependencies]
pyo3 = { version = "0.27", features = ["extension-module"] }
pyo3 = { version = "0.27", optional = true }

[profile.release]
lto = true
51 changes: 51 additions & 0 deletions rust/fuzz/Cargo.toml
@@ -0,0 +1,51 @@
[package]
name = "json2xml_rs-fuzz"
version = "0.0.0"
publish = false
edition = "2021"

[package.metadata]
cargo-fuzz = true

[dependencies]
libfuzzer-sys = "0.4"
arbitrary = { version = "1", features = ["derive"] }

[dependencies.json2xml_rs]
path = ".."
default-features = false

[[bin]]
name = "fuzz_escape_xml"
path = "fuzz_targets/fuzz_escape_xml.rs"
test = false
doc = false
bench = false

[[bin]]
name = "fuzz_wrap_cdata"
path = "fuzz_targets/fuzz_wrap_cdata.rs"
test = false
doc = false
bench = false

[[bin]]
name = "fuzz_is_valid_xml_name"
path = "fuzz_targets/fuzz_is_valid_xml_name.rs"
test = false
doc = false
bench = false

[[bin]]
name = "fuzz_make_valid_xml_name"
path = "fuzz_targets/fuzz_make_valid_xml_name.rs"
test = false
doc = false
bench = false

[[bin]]
name = "fuzz_make_attr_string"
path = "fuzz_targets/fuzz_make_attr_string.rs"
test = false
doc = false
bench = false
@@ -0,0 +1,2 @@


@@ -0,0 +1 @@
ë
@@ -0,0 +1 @@
]I]]
@@ -0,0 +1 @@
£¶¢§
@@ -0,0 +1 @@
]]J(
@@ -0,0 +1 @@
]](
@@ -0,0 +1 @@
]J(
@@ -0,0 +1 @@
]
23 changes: 23 additions & 0 deletions rust/fuzz/fuzz_targets/fuzz_escape_xml.rs
@@ -0,0 +1,23 @@
#![no_main]

use libfuzzer_sys::fuzz_target;
use json2xml_rs::escape_xml;

fuzz_target!(|data: &str| {
    let result = escape_xml(data);

    // Verify invariants:
    // 1. Result should not contain unescaped special chars
    assert!(!result.contains('&') || result.contains("&amp;") || result.contains("&quot;")
        || result.contains("&apos;") || result.contains("&lt;") || result.contains("&gt;"));

    // 2. Result should be valid (no panics occurred)
    // 3. If input had no special chars, output equals input
    if !data.contains('&') && !data.contains('"') && !data.contains('\'')
        && !data.contains('<') && !data.contains('>') {
        assert_eq!(result, data);
    }

    // 4. Output length should be >= input length (escaping only adds chars)
    assert!(result.len() >= data.len());
});
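
The same invariants can be exercised quickly in Python against a reference model. This is a sketch only: `escape_xml_model` is a hypothetical stand-in written for illustration (it assumes the five standard XML entities) and is not the Rust `escape_xml` or the library's Python function. It also adds a stricter check than invariant 1 above, namely that no raw `<`, `>`, `"`, or `'` survives escaping.

```python
import random

# Assumed mapping: the five predefined XML entities.
_ESCAPES = {"&": "&amp;", '"': "&quot;", "'": "&apos;", "<": "&lt;", ">": "&gt;"}


def escape_xml_model(text: str) -> str:
    """Reference model: replace each special character with its entity."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def check_invariants(data: str) -> None:
    result = escape_xml_model(data)
    # Inputs without special characters pass through unchanged.
    if not any(ch in data for ch in _ESCAPES):
        assert result == data
    # Escaping only ever adds characters.
    assert len(result) >= len(data)
    # Stricter than the fuzz target: no raw < > " ' may remain.
    assert not any(ch in result for ch in "<>\"'")


# Seeded random inputs drawn from a small alphabet rich in special characters.
rng = random.Random(0)
alphabet = "ab&<>\"' ]\u00eb"
for _ in range(1000):
    length = rng.randint(0, 12)
    check_invariants("".join(rng.choice(alphabet) for _ in range(length)))
```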
33 changes: 33 additions & 0 deletions rust/fuzz/fuzz_targets/fuzz_is_valid_xml_name.rs
@@ -0,0 +1,33 @@
#![no_main]

use libfuzzer_sys::fuzz_target;
use json2xml_rs::is_valid_xml_name;

fuzz_target!(|data: &str| {
    let result = is_valid_xml_name(data);

    // Verify invariants:
    // 1. Empty string is always invalid
    if data.is_empty() {
        assert!(!result);
    }

    // 2. String starting with digit is invalid
    if let Some(first) = data.chars().next() {
        if first.is_ascii_digit() {
            assert!(!result);
        }
    }

    // 3. String starting with "xml" (case-insensitive) is invalid
    if data.to_lowercase().starts_with("xml") {
        assert!(!result);
    }

    // 4. String containing spaces is invalid
    if data.contains(' ') {
        assert!(!result);
    }

    // 5. Function should never panic - reaching here means it didn't
});
42 changes: 42 additions & 0 deletions rust/fuzz/fuzz_targets/fuzz_make_attr_string.rs
@@ -0,0 +1,42 @@
#![no_main]

use libfuzzer_sys::fuzz_target;
use arbitrary::Arbitrary;
use json2xml_rs::make_attr_string;

#[derive(Arbitrary, Debug)]
struct AttrInput {
    attrs: Vec<(String, String)>,
}

fuzz_target!(|input: AttrInput| {
    let result = make_attr_string(&input.attrs);

    // Verify invariants:
    // 1. Empty attrs should produce empty string
    if input.attrs.is_empty() {
        assert!(result.is_empty());
        return;
    }

    // 2. Result should start with space (for XML formatting)
    assert!(result.starts_with(' '), "Attribute string should start with space");

    // 3. Each attribute should produce a ` key="value"`-like fragment.
    //    We check for the more specific pattern ` {key}="` to avoid
    //    passing on overlapping keys (e.g. "a" vs "aa") or malformed formatting.
    for (key, _value) in &input.attrs {
        let expected_fragment = format!(" {}=\"", key);
        assert!(
            result.contains(&expected_fragment),
            "Attribute fragment '{}' should appear in result '{}'",
            expected_fragment,
            result
        );
    }

    // 4. Values should be escaped (no raw & < > " ' in values)
    //    The make_attr_string calls escape_xml on values

    // 5. Function should never panic - reaching here means it didn't
});