Collaboration proposal: Rust-powered QVD engine (qvdrs) — up to 350x faster #14

@bintocher

Hi Constantin,

First of all — great work on PyQvd! It's the most well-known Python library for QVD files, and the API design is clean and well-documented.

I'm the author of qvdrs — a QVD library with the core engine written in Rust and Python bindings via PyO3 + Arrow zero-copy bridge (pip install qvdrs). I'd like to explore the possibility of collaboration.

Performance comparison

Benchmarks on the same machine with real QVD files, PyQvd 2.3.2 vs qvdrs 0.5.0:

Read

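| File size | Rows | Cols | PyQvd | qvdrs | Speedup |
|---|---|---|---|---|---|
| 11 KB | 12 | 4 | 0.013s | 0.000s | 29x |
| 62 KB | 125 | 45 | 0.012s | 0.001s | 11x |
| 2.3 MB | 21,523 | 10 | 0.214s | 0.011s | 20x |
| 35 MB | 1,695,048 | 7 | 5.96s | 0.26s | 23x |
| 480 MB | 11,994,296 | 4 | 64.5s | 2.1s | 31x |
| 560 MB | 5,458,618 | 24 | 65.2s | 3.9s | 17x |
| 1.7 GB | 87,617,047 | 8 | >10 min (killed) | 23.4s | >25x |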

Write

| File size | Rows | Cols | PyQvd | qvdrs | Speedup |
|---|---|---|---|---|---|
| 35 MB | 1,695,048 | 7 | 7.8s | 0.022s | 351x |
| 480 MB | 11,994,296 | 4 | 50.9s | 0.61s | 83x |
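
The harness behind these numbers is not shown in this issue. As a rough sketch of how such a comparison could be reproduced, assuming best-of-N wall-clock timing and a speedup defined as baseline time divided by candidate time (the helper names `best_time` and `speedup` are made up for illustration):

```python
import time
from typing import Callable


def best_time(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the fastest wall-clock time (seconds) over `repeats` calls to fn."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate time must be positive")
    return baseline_s / candidate_s


# Timings for the 480 MB file, copied from the read and write results above.
read_480mb = speedup(64.5, 2.1)
write_480mb = speedup(50.9, 0.61)
print(f"read 480 MB: {read_480mb:.0f}x, write 480 MB: {write_480mb:.0f}x")
# prints "read 480 MB: 31x, write 480 MB: 83x"
```

To time real libraries, `best_time` would be called with a zero-argument callable that wraps each library's own read or write call.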

Features only in qvdrs

| Feature | qvdrs |
|---|---|
| Streaming EXISTS() filtered read (1.7 GB, 87.6M rows -> 20.4M rows x 3 cols) | 9.0s |
| EXISTS() + save to QVD | 13.3s |
| Parquet <-> QVD conversion | yes |
| DuckDB native integration (register QVD as SQL tables) | yes |
| DataFusion SQL queries on QVD | yes |
| Arrow RecordBatch zero-copy (pandas, Polars, DuckDB) | yes |
| CLI tool (convert, inspect, head, filter) | yes |
| Binary-identical output to Qlik Sense (MD5 verified) | yes |
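
For readers unfamiliar with EXISTS(): in Qlik load scripts it keeps only the rows whose field value is already present in a given set of values. The sketch below illustrates that semantics on chunked data in pure Python. It is not the qvdrs implementation (which is in Rust) and not its API. The function name `exists_filter` and the dict-per-row layout are assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Set

Row = Dict[str, object]


def exists_filter(
    chunks: Iterable[List[Row]],
    key: str,
    allowed: Set[object],
    columns: List[str],
) -> Iterator[List[Row]]:
    """Yield filtered chunks: keep rows whose `key` value is in `allowed`,
    projected down to `columns`. Only one chunk is held in memory at a time."""
    for chunk in chunks:
        kept = [
            {col: row[col] for col in columns}
            for row in chunk
            if row[key] in allowed
        ]
        if kept:  # skip chunks where nothing matched
            yield kept


# Two small chunks standing in for a large file read piece by piece.
chunks = [
    [{"id": 1, "city": "Oslo", "amt": 10}, {"id": 2, "city": "Rome", "amt": 20}],
    [{"id": 3, "city": "Oslo", "amt": 30}],
]
result = [
    row
    for chunk in exists_filter(chunks, "city", {"Oslo"}, ["id", "amt"])
    for row in chunk
]
print(result)  # [{'id': 1, 'amt': 10}, {'id': 3, 'amt': 30}]
```

Filtering and projecting while streaming is what lets a large input shrink to a much smaller result without loading the full table first.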

What each project brings

PyQvd: mature, clean API built around QvdTable (filter_by, join, sort, append, insert); 25 stars and an established user base; good documentation on Read the Docs; pure Python, so easy to understand and debug.

qvdrs: Rust core (11x to 351x faster in the benchmarks above), handles multi-GB files, streaming reader, EXISTS() filter (2.5x faster than Qlik Sense), Parquet/Arrow/DuckDB/DataFusion integration, binary-identical QVD output to Qlik Sense.

Development approach

The qvdrs codebase is developed with the help of Claude (Opus) — Anthropic's AI coding assistant. This significantly accelerates development: implementing new features, writing tests, debugging, and maintaining code quality. If you're open to collaboration, this is a powerful tool that can be used for joint development as well — it handles Rust, Python, and the QVD binary format equally well.

Proposal

A few possible directions:

  1. Contribute to qvdrs — your QVD format expertise and API design skills would be very valuable. We could adopt PyQvd's richer QvdTable API (filter_by, join, sort, etc.) for the Python bindings.
  2. qvdrs as an optional backend for PyQvd — keep PyQvd's API, but optionally use qvdrs for I/O (similar to how pandas uses pyarrow). Users get the familiar API with Rust performance.
  3. Joint development — combine efforts under one project.
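
To make option 2 more concrete, here is a minimal sketch of how an optional backend could be selected, loosely modelled on the `engine=` argument pandas offers for some readers. Everything here is hypothetical: the `pick_backend` helper and the backend names `"qvdrs"` and `"python"` are not part of either library today.

```python
import importlib.util
from typing import Callable, Optional


def module_available(name: str) -> bool:
    """True if the top-level module `name` can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


def pick_backend(
    requested: Optional[str] = None,
    is_available: Callable[[str], bool] = module_available,
) -> str:
    """Resolve which I/O backend to use.

    None      -> prefer the Rust backend if installed, else pure Python.
    "qvdrs"   -> require the Rust backend, fail loudly if it is missing.
    "python"  -> always use the pure-Python implementation.
    """
    if requested is None:
        return "qvdrs" if is_available("qvdrs") else "python"
    if requested == "python":
        return "python"
    if requested == "qvdrs":
        if not is_available("qvdrs"):
            raise ImportError("backend 'qvdrs' requested but qvdrs is not installed")
        return "qvdrs"
    raise ValueError(f"unknown backend: {requested!r}")


print(pick_backend())  # "qvdrs" if the package is installed, otherwise "python"
```

With this shape, existing PyQvd users would see no change unless qvdrs is installed, and an explicit request for the Rust backend fails with a clear error instead of silently falling back.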

No pressure at all — just reaching out since we're both solving the same problem. Happy to discuss.

Stanislav
https://github.com/bintocher/qvdrs

Labels: enhancement, question
