Offline semantic search engine for documentation.
Local embeddings, browser-ready indexes.
Run docmd-search instantly on any folder:

    npx docmd-search ./docs

That's it.
- Files are discovered and chunked automatically
- Embeddings are generated locally (no cloud API)
- Search is available in the terminal immediately

    # Install the CLI globally
    npm install -g docmd-search

    # Install ML dependencies (one-time)
    npm install -g @huggingface/transformers onnxruntime-node

    docmd-search ./docs          # index + interactive search
    docmd-search ./docs --ui     # index + web UI
    docmd-search --settings      # configure model

Designed to work offline, ship almost nothing to the browser, and stay out of your way.
- All embeddings generated locally with ONNX Runtime
- No data leaves your machine
- No cloud API keys needed
- Progressive indexing: search available from the first batch
- Incremental: only re-indexes changed files
- Resumable: interrupted indexing resumes from last checkpoint
- Browser runtime is <3KB gzipped
- No model weights in the browser
- Hybrid scoring: keyword matching + vector similarity
- Multi-batch index format with automatic compression
- Navigation tree generation for web UIs
- First-run setup wizard with model selection
- Interactive terminal search with live results
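
The hybrid scoring mentioned in the list above can be pictured with a minimal sketch. This is not the actual docmd-search implementation: the `IndexedChunk` shape, the `keywordScore` heuristic, and the `alpha` weight are all assumptions made for illustration, and the query vector is taken as given.

```typescript
// Hypothetical chunk shape; the real index format is not shown in this README.
interface IndexedChunk {
  text: string;
  vector: number[]; // embedding of the chunk
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Fraction of query terms that occur in the chunk text (case-insensitive).
function keywordScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((term) => haystack.includes(term)).length;
  return hits / terms.length;
}

// Weighted blend of both signals; alpha = 0.3 is an arbitrary illustrative weight.
function hybridScore(
  query: string,
  queryVector: number[],
  chunk: IndexedChunk,
  alpha = 0.3,
): number {
  return (
    alpha * keywordScore(query, chunk.text) +
    (1 - alpha) * cosine(queryVector, chunk.vector)
  );
}
```

Ranking then amounts to computing `hybridScore` for every chunk and sorting in descending order.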

    Build time (Node.js)             Search time (Browser, <3KB)
    ───────────────────              ──────────────────────────
    Crawl files                      Load manifest.json
      → Chunk by heading               → Load batch 000 (instant)
      → Embed via ONNX                 → Background-load rest
      → Quantize Float32 → Int8        → Keyword + cosine
      → Compress (ternary/PQ)          → Ranked results
      → Save multi-batch index

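
To make the "Quantize Float32 → Int8" step concrete, here is a minimal sketch of symmetric per-vector quantization. It assumes one scale factor per vector; the scheme docmd-search actually uses is not described in this README, and the function names are hypothetical.

```typescript
interface QuantizedVector {
  data: Int8Array; // components mapped into [-127, 127]
  scale: number;   // multiply by this to approximate the original floats
}

// Scale by the largest absolute component so every value fits in int8.
function quantize(vec: Float32Array): QuantizedVector {
  let maxAbs = 0;
  for (let i = 0; i < vec.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(vec[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const data = new Int8Array(vec.length);
  for (let i = 0; i < vec.length; i++) {
    data[i] = Math.round(vec[i] / scale);
  }
  return { data, scale };
}

// Approximate reconstruction of the original vector.
function dequantize(q: QuantizedVector): Float32Array {
  const out = new Float32Array(q.data.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = q.data[i] * q.scale;
  }
  return out;
}
```

Each component shrinks from 4 bytes to 1. Because cosine similarity is unaffected by a positive per-vector scale, ranking by cosine can run on the `Int8Array` data directly, at the cost of a small rounding error.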
First run prompts you to select an embedding model:
| Model | Dimensions | Size | Best for |
|---|---|---|---|
| MiniLM L6 v2 ★ | 384 | ~30 MB | Fast, general purpose |
| BGE Small (English) | 384 | ~45 MB | English-optimised |
| BGE Base (English) | 768 | ~110 MB | Higher quality |
| MPNet Base v2 | 768 | ~110 MB | Multilingual |
Change model later: docmd-search --settings
No configuration is required to get started.
Global (~/.docmd-search/config.json):

    {
      "model": "Xenova/all-MiniLM-L6-v2",
      "wizardCompleted": true
    }

Per-project (.docmd-search/config.json):

    {
      "model": "Xenova/bge-small-en-v1.5",
      "chunkSize": 512,
      "include": ["**/*.md"],
      "exclude": ["**/drafts/**"]
    }

Config resolution: defaults → global → project → CLI flags.
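
The resolution order can be illustrated with a small sketch. It assumes that later layers simply override earlier ones key by key (arrays are replaced, not merged); the `Config` interface, the `resolveConfig` helper, and the default values are hypothetical and not taken from the docmd-search source.

```typescript
// Hypothetical config shape based on the keys shown in the examples above.
interface Config {
  model?: string;
  chunkSize?: number;
  include?: string[];
  exclude?: string[];
}

// Later layers win: defaults → global → project → CLI flags.
function resolveConfig(...layers: Config[]): Config {
  return layers.reduce<Config>((acc, layer) => ({ ...acc, ...layer }), {});
}

// Illustrative values only; the real defaults are not documented here.
const defaults: Config = { model: "Xenova/all-MiniLM-L6-v2", chunkSize: 512 };
const globalConfig: Config = { model: "Xenova/all-MiniLM-L6-v2" };
const projectConfig: Config = {
  model: "Xenova/bge-small-en-v1.5",
  include: ["**/*.md"],
};
const cliFlags: Config = {};

const resolved = resolveConfig(defaults, globalConfig, projectConfig, cliFlags);
console.log(resolved.model); // the project-level model overrides the global one
```

Under this assumption a project can pin its own model without touching the global file, and a CLI flag can still override both for a single run.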
Use in scripts or CI pipelines:

    import { indexDirectory, loadAllBatches } from 'docmd-search';

    // Build (or update) the index for a docs folder
    const index = await indexDirectory({
      rootDir: './docs',
      outDir: '.docmd-search',
    });

Browser client:

    import { load, search } from 'docmd-search/client';

    // Load the index, then query it for the top 10 results
    await load('/path/to/.docmd-search');
    const results = search('deploy kubernetes', 10);

The source tree is kept flat and modular:

    src/
    ├── bin/docmd-search.ts    # CLI entry point
    ├── client/index.ts        # Browser runtime (<3KB)
    ├── config.ts              # Config + model profiles
    ├── index-io.ts            # Multi-batch format + compression
    ├── index.ts               # Barrel exports
    ├── indexer/
    │   ├── chunk.ts           # Heading-aware chunking
    │   ├── crawl.ts           # File discovery
    │   └── index.ts           # Progressive pipeline
    ├── model.ts               # ONNX embedding manager
    ├── tui.ts                 # Terminal UI
    ├── types.ts               # Core types
    └── ui/
        └── launcher.ts        # Web UI via docmd

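
As a rough illustration of what heading-aware chunking (the job of indexer/chunk.ts in the tree above) might involve, the sketch below splits a Markdown string into one section per heading. It is an assumption-laden toy rather than the real chunker: `Section` and `chunkByHeading` are hypothetical names, and it ignores `chunkSize` limits as well as `#` characters inside code blocks.

```typescript
interface Section {
  heading: string; // nearest Markdown heading, or "" before the first one
  text: string;    // body text under that heading
}

// Split Markdown into one section per ATX heading (# through ######).
function chunkByHeading(markdown: string): Section[] {
  const sections: Section[] = [];
  let heading = "";
  let body: string[] = [];

  // Emit the section collected so far, skipping a completely empty one.
  const flush = (): void => {
    const text = body.join("\n").trim();
    if (heading !== "" || text !== "") {
      sections.push({ heading, text });
    }
    body = [];
  };

  for (const line of markdown.split("\n")) {
    const match = /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = match[2].trim();
    } else {
      body.push(line);
    }
  }
  flush();
  return sections;
}
```

Keeping the heading alongside each section's text gives every search result a natural title and anchor to display.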
docmd-search works standalone with any documentation project. It also integrates with docmd as a semantic search plugin.
| Tool | What it does |
|---|---|
| docmd | Zero-config documentation generator |
| docmd-search | Offline semantic search engine |
- Contributions are welcome
- If you find it useful, consider sponsoring or starring the repo ⭐
MIT License. See LICENSE for details.
