Use it for fast pre-implementation sizing work, such as:
- early architecture decisions;
- comparing vector dimensions;
- comparing engines;
- comparing index types;
- estimating metadata/payload impact;
- generating Markdown/CSV/JSON artifacts for architecture discussions.

What it does not do:
- No live database connections.
- No ingestion or load execution.
- No latency/recall benchmarking.
- No pricing calculations.
- No production guarantee.
Run directly from PyPI with uvx:
uvx vector-db-sizer --help
uvx vector-db-sizer list-engines

Describe a scenario in a YAML file, for example scenario.yaml:
name: qdrant_text_hnsw
dataset:
  source_type: text
  total_tokens: 50000000
  chunk_tokens: 512
  chunk_overlap: 64
embedding:
  kind: dense
  dimensions: 1536
  dtype: float32
database:
  engine: qdrant
  index_type: hnsw

Validate the scenario, then write an estimate report:
uvx vector-db-sizer validate scenario.yaml
uvx vector-db-sizer estimate scenario.yaml --format markdown --out report.md

Run the bundled examples from the repository with uv run:
uv run vector-db-sizer estimate examples/qdrant_text_hnsw.yaml --format markdown
uv run vector-db-sizer estimate examples/multi_scenario.yaml --format csv
uv run vector-db-sizer estimate examples/multi_scenario.yaml --format json

Output formats:
- json (machine-readable)
- markdown (human report)
- csv (comparison table)

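The estimate commands above could be scripted to produce all three formats for one scenario. This is a minimal sketch, not part of the tool: it assumes `uvx` is on `PATH` and that `--out` works with every format (the commands above show it only with `markdown`). The helper names and output file names are hypothetical.

```python
import shlex
import subprocess

# Format name -> file extension for the output artifact (extensions are an assumption).
FORMATS = {"markdown": "md", "csv": "csv", "json": "json"}


def build_estimate_command(scenario: str, fmt: str, out: str) -> list[str]:
    """Build the argv for one `estimate` run using only flags shown above."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return ["uvx", "vector-db-sizer", "estimate", scenario,
            "--format", fmt, "--out", out]


def run_all(scenario: str, stem: str = "report") -> None:
    """Run the estimate once per format, writing report.md, report.csv, report.json."""
    for fmt, ext in FORMATS.items():
        cmd = build_estimate_command(scenario, fmt, f"{stem}.{ext}")
        print("running:", shlex.join(cmd))
        # check=True raises CalledProcessError if the CLI exits non-zero.
        subprocess.run(cmd, check=True)
```

Calling `run_all("scenario.yaml")` would invoke the CLI three times. Keeping command construction separate from execution makes the argv easy to inspect or log before anything runs.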

Supported engine profiles:
- generic
- pgvector
- qdrant
- milvus
- elasticsearch
- opensearch
- weaviate
- pinecone

Each estimate reports:
- Raw vectors: uncompressed/base vector bytes.
- Quantized vectors: additional quantized representation when modeled.
- Record payload: IDs + metadata/text/provenance payload bytes.
- Index disk: index structure bytes on disk.
- Engine overhead: engine/profile-level overhead approximation.
- Final disk estimate: replicated storage plus WAL/snapshot/safety factors.
- Final RAM estimate: vectors + payload + index + overhead RAM approximation.
- Warnings: profile caveats and scenario assumptions to review.
- Confidence: per-component confidence levels for planning.

Confidence levels:
- high: formulaic or type-level estimate.
- medium: useful engineering approximation.
- low: heuristic and engine-dependent; validate with pilot load.

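The raw-vector component is the most formulaic of the outputs above, so it is easy to sanity-check by hand. The sketch below is an illustration under stated assumptions, not the tool's actual formula: it treats the corpus as one continuous token stream chunked with a sliding window and counts `dimensions × bytes per component` per vector. The function names and the `float16`/`int8` entries are hypothetical additions for comparison.

```python
# Bytes per vector component. float32 matches the example scenario; the other
# entries are common dtypes added for comparison (assumption).
DTYPE_BYTES = {"float32": 4, "float16": 2, "int8": 1}


def chunk_count(total_tokens: int, chunk_tokens: int, chunk_overlap: int) -> int:
    """Sliding-window chunk count over one continuous token stream (assumption)."""
    if not 0 <= chunk_overlap < chunk_tokens:
        raise ValueError("chunk_overlap must be in [0, chunk_tokens)")
    if total_tokens <= 0:
        return 0
    if total_tokens <= chunk_tokens:
        return 1
    stride = chunk_tokens - chunk_overlap
    # Integer ceiling of (total_tokens - chunk_overlap) / stride.
    return -(-(total_tokens - chunk_overlap) // stride)


def raw_vector_bytes(vectors: int, dimensions: int, dtype: str) -> int:
    """Uncompressed vector bytes: one component of `dtype` per dimension."""
    return vectors * dimensions * DTYPE_BYTES[dtype]


# Values from the example scenario: 50M tokens, 512-token chunks, 64 overlap.
vectors = chunk_count(50_000_000, 512, 64)
print(vectors)                                     # 111607
print(raw_vector_bytes(vectors, 1536, "float32"))  # 685713408

# Comparing vector dimensions at a fixed chunk count.
for dims in (384, 768, 1536, 3072):
    mib = raw_vector_bytes(vectors, dims, "float32") / 2**20
    print(f"{dims:>5} dims: {mib:,.1f} MiB")
```

Real chunkers usually split per document, which produces more (and shorter) chunks than a single continuous stream, so treat this count as a lower bound.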
The estimates are analytical and should be calibrated with a representative pilot load before production capacity planning.
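One simple way to apply a pilot measurement is to scale the full-size estimate by the ratio of measured to estimated bytes at pilot scale. The sketch below assumes storage scales roughly linearly between pilot and production size, which index structures do not always satisfy. The function names are hypothetical and the numbers are placeholders, not measurements.

```python
def calibration_ratio(pilot_measured_bytes: int, pilot_estimated_bytes: int) -> float:
    """Measured / estimated at pilot scale; above 1.0 means the estimate ran low."""
    if pilot_estimated_bytes <= 0:
        raise ValueError("pilot_estimated_bytes must be positive")
    return pilot_measured_bytes / pilot_estimated_bytes


def calibrated_bytes(full_estimate_bytes: int,
                     pilot_measured_bytes: int,
                     pilot_estimated_bytes: int) -> int:
    """Scale the full estimate by the pilot ratio, rounding up to stay conservative."""
    if pilot_estimated_bytes <= 0:
        raise ValueError("pilot_estimated_bytes must be positive")
    # Integer ceiling division avoids floating-point rounding on large byte counts.
    return -(-(full_estimate_bytes * pilot_measured_bytes) // pilot_estimated_bytes)


# Placeholder numbers: the pilot was estimated at 1.00 GB and measured at 1.15 GB.
ratio = calibration_ratio(1_150_000_000, 1_000_000_000)
full = calibrated_bytes(40_000_000_000, 1_150_000_000, 1_000_000_000)
print(f"ratio={ratio:.2f} calibrated={full:,} bytes")
```

Calibrating each component separately (vectors, payload, index) would likely be more informative than a single overall ratio, since the components carry different confidence levels.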

For development, install dependencies, then run the tests and linter:
uv sync
uv run pytest
uv run ruff check .

Limitations:
- Engine profiles are approximate.
- No vendor pricing model.
- No actual DB measurements from live systems.
- No latency/recall estimation.
- No automatic database selection.
