docmd-io/docmd-search

Offline semantic search engine for documentation.
Local embeddings, browser-ready indexes.

Quick Start

Run docmd-search instantly on any folder:

npx docmd-search ./docs

That's it.

  • Files are discovered and chunked automatically
  • Embeddings are generated locally (no cloud API)
  • Search is available in the terminal immediately

Install for regular usage

npm install -g docmd-search
# Install ML dependencies (one-time)
npm install -g @huggingface/transformers onnxruntime-node
docmd-search ./docs          # index + interactive search
docmd-search ./docs --ui     # index + web UI
docmd-search --settings      # configure model

Features

Designed to work offline, keep the browser payload tiny, and stay out of your way.

Offline by default

  • All embeddings generated locally with ONNX Runtime
  • No data leaves your machine
  • No cloud API keys needed

Instant search

  • Progressive indexing: search available from the first batch
  • Incremental: only re-indexes changed files
  • Resumable: interrupted indexing resumes from last checkpoint
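
One common way to get incremental behaviour is to store a content hash per file and re-index only the files whose hash has changed. The sketch below illustrates that idea only; whether docmd-search uses hashes, timestamps, or something else is not stated here. The `HashManifest` type and both function names are hypothetical, and FNV-1a stands in for whatever hash a real indexer would pick.

```typescript
// Hypothetical shape: maps a file path to the hash of its last-indexed content.
type HashManifest = Record<string, string>;

// 32-bit FNV-1a over UTF-16 code units. A stand-in hash for illustration;
// a real indexer would more likely use a cryptographic hash such as SHA-256.
function hashContent(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Returns the paths that are new or whose content differs from the previous run.
function findChangedFiles(
  files: Record<string, string>, // path -> current content
  previous: HashManifest,
): string[] {
  return Object.keys(files).filter(
    (path) => previous[path] !== hashContent(files[path]),
  );
}
```

Only the returned paths would need to be re-chunked and re-embedded; everything else can be reused from the existing index.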

Tiny client

  • Browser runtime is <3KB gzipped
  • No model weights in the browser
  • Hybrid scoring: keyword matching + vector similarity
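
To make "keyword matching + vector similarity" concrete, here is a minimal sketch of one way to blend the two signals. It is not the library's actual scoring function: the term-overlap keyword score, the `alpha` weight, and all function names are assumptions made for illustration. The source also does not describe how the client obtains a query-side vector, so the sketch simply takes both vectors as inputs.

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Fraction of query terms found in the chunk text (case-insensitive substring match).
function keywordScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((term) => haystack.includes(term)).length;
  return hits / terms.length;
}

// Weighted blend of both signals; `alpha` is an assumed tuning knob, not a documented option.
function hybridScore(
  query: string,
  queryVec: number[],
  chunkText: string,
  chunkVec: number[],
  alpha = 0.5,
): number {
  return alpha * keywordScore(query, chunkText) + (1 - alpha) * cosine(queryVec, chunkVec);
}
```

With `alpha = 0.5`, a chunk that contains half of the query terms and has a perfectly aligned vector scores 0.75. Moving `alpha` toward 1 favours exact term matches; moving it toward 0 favours semantic similarity.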

Built-in capabilities

  • Multi-batch index format with automatic compression
  • Navigation tree generation for web UIs
  • First-run setup wizard with model selection
  • Interactive terminal search with live results

How It Works

Build time (Node.js)                    Search time (Browser, <3KB)
───────────────────                     ──────────────────────────
 Crawl files                             Load manifest.json
   → Chunk by heading                      → Load batch 000 (instant)
     → Embed via ONNX                        → Background-load rest
       → Quantize Float32 → Int8               → Keyword + cosine
         → Compress (ternary/PQ)                 → Ranked results
           → Save multi-batch index
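
The "Quantize Float32 → Int8" step can be illustrated with symmetric per-vector scaling, a common scheme that shrinks each embedding to a quarter of its size. This is a sketch under that assumption; the exact scheme docmd-search uses is not specified here, and `QuantizedVector`, `quantize`, and `dequantize` are hypothetical names.

```typescript
// Hypothetical container: Int8 values plus the scale needed to recover approximate floats.
interface QuantizedVector {
  values: Int8Array;
  scale: number;
}

// Map each component into [-127, 127] using the vector's largest absolute value.
function quantize(vector: Float32Array): QuantizedVector {
  let maxAbs = 0;
  for (let i = 0; i < vector.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(vector[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const values = new Int8Array(vector.length);
  for (let i = 0; i < vector.length; i++) {
    values[i] = Math.round(vector[i] / scale);
  }
  return { values, scale };
}

// Approximate reconstruction; each component is off by roughly half a scale step at most.
function dequantize(q: QuantizedVector): Float32Array {
  const out = new Float32Array(q.values.length);
  for (let i = 0; i < q.values.length; i++) {
    out[i] = q.values[i] * q.scale;
  }
  return out;
}
```

Cosine similarity does not depend on vector magnitude, so ranking can be done on the Int8 values directly; the scale only matters when approximate float values are needed.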

Models

First run prompts you to select an embedding model:

Model                 Dimensions   Size      Best for
───────────────────   ──────────   ───────   ─────────────────────
MiniLM L6 v2          384          ~30 MB    Fast, general purpose
BGE Small (English)   384          ~45 MB    English-optimised
BGE Base (English)    768          ~110 MB   Higher quality
MPNet Base v2         768          ~110 MB   Multilingual

To change the model later, run: docmd-search --settings

Configuration (optional)

No configuration is required to get started.

Global (~/.docmd-search/config.json):

{
  "model": "Xenova/all-MiniLM-L6-v2",
  "wizardCompleted": true
}

Per-project (.docmd-search/config.json):

{
  "model": "Xenova/bge-small-en-v1.5",
  "chunkSize": 512,
  "include": ["**/*.md"],
  "exclude": ["**/drafts/**"]
}

Config resolution: defaults → global → project → CLI flags.
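
The resolution order means later layers override earlier ones. A minimal sketch of that precedence, assuming a shallow key-by-key merge (the real merge strategy, for example whether `include` arrays are replaced or concatenated, is not specified here); `SearchConfig` and `resolveConfig` are hypothetical names built from the keys shown above:

```typescript
// Hypothetical config shape, using only the keys from the per-project example.
interface SearchConfig {
  model: string;
  chunkSize: number;
  include: string[];
  exclude: string[];
}

// Apply layers left to right: each layer's keys replace those of the layers before it.
function resolveConfig(
  defaults: SearchConfig,
  ...layers: Partial<SearchConfig>[]
): SearchConfig {
  return layers.reduce<SearchConfig>(
    (merged, layer) => ({ ...merged, ...layer }),
    defaults,
  );
}

// Illustrative values only; the tool's real defaults may differ.
const exampleDefaults: SearchConfig = {
  model: 'Xenova/all-MiniLM-L6-v2',
  chunkSize: 512,
  include: ['**/*.md'],
  exclude: [],
};

const exampleResolved = resolveConfig(
  exampleDefaults,
  { model: 'Xenova/all-MiniLM-L6-v2' },                               // global
  { model: 'Xenova/bge-small-en-v1.5', exclude: ['**/drafts/**'] },   // project
  { chunkSize: 256 },                                                 // CLI flags
);
```

Here the project layer wins on `model`, the CLI layer wins on `chunkSize`, and `include` falls through from the defaults because no later layer sets it.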

Programmatic Usage

Use in scripts or CI pipelines:

import { indexDirectory, loadAllBatches } from 'docmd-search';

const index = await indexDirectory({
  rootDir: './docs',
  outDir: '.docmd-search',
});

Browser client:

import { load, search } from 'docmd-search/client';

await load('/path/to/.docmd-search');
const results = search('deploy kubernetes', 10);

Project Structure

The codebase is kept flat and modular.

src/
├── bin/docmd-search.ts   # CLI entry point
├── client/index.ts       # Browser runtime (<3KB)
├── config.ts             # Config + model profiles
├── index-io.ts           # Multi-batch format + compression
├── index.ts              # Barrel exports
├── indexer/
│   ├── chunk.ts          # Heading-aware chunking
│   ├── crawl.ts          # File discovery
│   └── index.ts          # Progressive pipeline
├── model.ts              # ONNX embedding manager
├── tui.ts                # Terminal UI
├── types.ts              # Core types
└── ui/
    └── launcher.ts       # Web UI via docmd

Part of the docmd ecosystem

docmd-search works standalone with any documentation project. It also integrates with docmd as a semantic search plugin.

Tool           What it does
────────────   ───────────────────────────────────
docmd          Zero-config documentation generator
docmd-search   Offline semantic search engine

Community & Support

  • Contributions are welcome
  • If you find it useful, consider sponsoring or starring the repo ⭐

License

MIT License. See LICENSE for details.
