38 changes: 38 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,38 @@
name: CI

on:
  push:
    branches: ["main"]
  pull_request:

jobs:
  quality:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [20]

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Build
        run: npm run build

      - name: Lint
        run: npm run lint

      - name: Test
        run: npm test -- --runInBand

      - name: Coverage Threshold
        run: npm run test:coverage
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,21 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.

## [Unreleased]

### Added
- CI workflow for build, lint, test, and coverage.
- Persistence-focused test coverage for IndexedDB-backed index behavior.
- API reference and tuning guidance in README.

### Changed
- Lint gate now targets published library sources (`src/**`, excluding benchmark CLI code).
- README persistence example now loads from the same DB name it saved to.

### Fixed
- `HNSWWithDB.deleteIndex()` now awaits DB re-initialization.
- `HNSWWithDB` now surfaces initialization/load/delete errors instead of silently swallowing them.
3 changes: 2 additions & 1 deletion CONTRIBUTING.md
@@ -19,7 +19,8 @@ Thanks for taking the time to improve this HNSW implementation. The checklist be
 - Add or update tests under `tests/` that capture the bug fix or feature so regressions are caught automatically. The `tests/HNSW.test.ts` suite shows how to build deterministic indices for verification.
 - Before committing, run the full set of quality gates:
   - `npm test` – runs the Jest harness
-  - `npm run lint` – checks the TypeScript sources with TSLint
+  - `npm run lint` – checks published TypeScript sources with TSLint
+  - `npm run lint:bench` – optional lint pass for benchmark CLI sources
   - `npm run build` – ensures the TypeScript compiler can emit the distributable files

## 4. Commit with context
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 deepfates.com

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
53 changes: 51 additions & 2 deletions README.md
@@ -1,5 +1,8 @@
# HNSW

[![npm version](https://img.shields.io/npm/v/hnsw)](https://www.npmjs.com/package/hnsw)
[![license](https://img.shields.io/npm/l/hnsw)](./LICENSE)

This is a small TypeScript package that implements the Hierarchical Navigable Small Worlds algorithm for approximate nearest neighbor search.

I wrote this package because I wanted to do efficient vector search directly in the client browser. All the other implementations I found for TS were either bindings for libraries written in other languages, or dealt with WASM compilation complexity.
@@ -61,8 +64,8 @@ const data = [
 await index.buildIndex(data);
 await index.saveIndex();

-// Load the index
-const index2 = await HNSWWithDB.create(16, 200, 'my-index-2', 50);
+// Load the same index from disk
+const index2 = await HNSWWithDB.create(16, 200, 'my-index', 50);
 await index2.loadIndex();

 // Search for nearest neighbors
@@ -77,6 +80,52 @@ Notes:
- The `metric` determines how scores are computed: `cosine` uses cosine similarity and `euclidean` uses an inverse-distance similarity (higher is better in both cases).
- `efSearch` controls query-time exploration and should be at least `k` for best recall.
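
The scoring note above can be made concrete with a small sketch. Cosine similarity is standard, but the README only says that `euclidean` uses "an inverse-distance similarity", so the `1 / (1 + distance)` form below is an assumption for illustration and may not match the package's exact formula. The function names `cosineScore` and `inverseDistanceScore` are hypothetical.

```typescript
// Hypothetical helpers for illustration; not the package's exported API.
function cosineScore(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; report 0 instead of dividing by zero.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumed form: 1 / (1 + Euclidean distance), so identical vectors score 1
// and the score shrinks toward 0 as vectors move apart.
function inverseDistanceScore(a: number[], b: number[]): number {
  let sumSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumSquares += diff * diff;
  }
  return 1 / (1 + Math.sqrt(sumSquares));
}
```

Under both functions a larger score means a closer match, which is the property the note relies on.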

## API Reference

### `new HNSW(M, efConstruction, d?, metric?, efSearch?)`

- `M`: Max neighbors stored per node and layer. Higher values usually improve recall at the cost of more memory.
- `efConstruction`: Build-time exploration depth. Higher values improve index quality at the cost of longer build times.
- `d`: Vector dimension. If omitted, it is inferred from the first inserted vector.
- `metric`: `cosine` or `euclidean`.
- `efSearch`: Query-time exploration depth. Higher values improve recall at the cost of higher query latency.

### `buildIndex(data, options?)`

- `data`: Array of `{ id, vector }`.
- `options.onProgress(current, total)`: Optional progress callback.
- `options.progressInterval`: Callback cadence (default `10000`).

### `searchKNN(query, k, options?)`

- Returns up to `k` results with shape `{ id, score }`.
- `options.efSearch`: Per-query override. Effective search breadth is `max(k, efSearch)`.

### `toJSON()` / `HNSW.fromJSON(json)`

- Serialize and restore in-memory indices for transport or persistence.

### `HNSWWithDB.create(M, efConstruction, dbName, efSearch?)`

- Creates an IndexedDB-backed index (browser/runtime with IndexedDB support).
- `saveIndex()`: Persist current graph.
- `loadIndex()`: Load previously persisted graph (no-op if missing).
- `deleteIndex()`: Delete persisted graph and reinitialize DB.
- `close()`: Close the active IndexedDB connection.

## Tuning Guide

- Start with `M=16`, `efConstruction=200`, `efSearch=50`.
- Increase `efSearch` first when recall is too low.
- Increase `M` for tougher datasets when memory budget allows.
- Keep `efSearch >= k` for better recall consistency.
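
The `efSearch >= k` advice follows from the documented rule that the effective search breadth is `max(k, efSearch)`. A minimal sketch of that rule, assuming a per-query override simply replaces the index default; `resolveSearchBreadth` is a hypothetical helper, not part of the package:

```typescript
// Hypothetical helper; names are illustrative, not the package's API.
function resolveSearchBreadth(k: number, indexEfSearch: number, overrideEfSearch?: number): number {
  // Assumption: a per-query override, when given, replaces the index default.
  const efSearch = overrideEfSearch ?? indexEfSearch;
  // Documented rule: never explore fewer candidates than the k results requested.
  return Math.max(k, efSearch);
}

console.log(resolveSearchBreadth(10, 50)); // 50
console.log(resolveSearchBreadth(100, 50)); // 100
console.log(resolveSearchBreadth(10, 50, 200)); // 200
```

When `k` exceeds `efSearch`, the breadth is already clamped up to `k`, so raising `efSearch` only changes results once it passes `k`.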

## Limitations

- This implementation prioritizes simplicity over peak throughput and memory efficiency.
- IndexedDB support depends on environment support for IndexedDB APIs.
- Benchmark tools under `src/bench` are maintained as CLI utilities and are not part of the runtime API surface.

## Benchmarks

A lightweight benchmark harness is available to validate recall/latency tradeoffs and the impact of parameters like `efSearch`, `M`, and `efConstruction`.
20 changes: 14 additions & 6 deletions jestconfig.json
@@ -1,7 +1,15 @@
 {
-  "transform": {
-    "^.+\\.(t|j)sx?$": "ts-jest"
-  },
-  "testRegex": "(/tests/.*|(\\.|/)(test|spec))\\.(jsx?|tsx?)$",
-  "moduleFileExtensions": ["ts", "tsx", "js", "jsx", "json", "node"]
-}
+  "transform": {
+    "^.+\\.(t|j)sx?$": "ts-jest"
+  },
+  "testRegex": "(/tests/.*|(\\.|/)(test|spec))\\.(jsx?|tsx?)$",
+  "moduleFileExtensions": ["ts", "tsx", "js", "jsx", "json", "node"],
+  "coverageThreshold": {
+    "global": {
+      "branches": 65,
+      "functions": 80,
+      "lines": 80,
+      "statements": 80
+    }
+  }
+}
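
With `coverageThreshold.global` set, Jest fails the run when any global coverage percentage falls below its configured minimum. The comparison can be sketched as below. The `CoverageSummary` type and `failingMetrics` helper are hypothetical and model only positive percentage thresholds like the ones configured here; this is not Jest's internal code.

```typescript
type Metric = "branches" | "functions" | "lines" | "statements";
type CoverageSummary = Record<Metric, number>;

// Values copied from the coverageThreshold.global block in jestconfig.json.
const thresholds: CoverageSummary = {
  branches: 65,
  functions: 80,
  lines: 80,
  statements: 80,
};

// Hypothetical helper: returns the metrics whose measured percentage
// is below the required minimum.
function failingMetrics(actual: CoverageSummary, required: CoverageSummary): Metric[] {
  const metrics: Metric[] = ["branches", "functions", "lines", "statements"];
  return metrics.filter((metric) => actual[metric] < required[metric]);
}
```

A run measuring 79.9% function coverage would be flagged on `functions` only, even if every other metric clears its bar.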
31 changes: 27 additions & 4 deletions package-lock.json


15 changes: 13 additions & 2 deletions package.json
@@ -3,6 +3,9 @@
   "version": "1.0.3",
   "description": "A TypeScript implementation of HNSW (Hierarchical Navigable Small World) algorithm for approximate nearest neighbor search",
   "homepage": "https://github.com/deepfates/hnsw#readme",
+  "bugs": {
+    "url": "https://github.com/deepfates/hnsw/issues"
+  },
   "repository": {
     "type": "git",
     "url": "https://github.com/deepfates/hnsw.git"
@@ -11,13 +14,15 @@
   "types": "dist/index.d.ts",
   "scripts": {
     "test": "jest --config jestconfig.json",
+    "test:coverage": "jest --config jestconfig.json --coverage --runInBand",
     "build": "tsc",
     "bench": "node dist/bench/run.js",
     "bench:download": "node dist/bench/download.js",
     "bench:report": "node dist/bench/report.js",
     "bench:compare": "node dist/bench/compare.js",
     "format": "prettier --write \"src/**/*.ts\"",
-    "lint": "tslint -p tsconfig.json",
+    "lint": "tslint -p tsconfig.lint.json",
+    "lint:bench": "tslint -p tsconfig.json src/bench/**/*.ts",
     "prepare": "npm run build",
     "prepublishOnly": "npm test && npm run lint",
     "preversion": "npm run lint",
@@ -26,7 +31,12 @@
   },
   "keywords": [
     "nearest neighbor",
-    "vector search"
+    "vector search",
+    "hnsw",
+    "ann",
+    "similarity search",
+    "indexeddb",
+    "browser"
   ],
   "author": "deepfates.com",
   "license": "MIT",
@@ -36,6 +46,7 @@
   "devDependencies": {
     "@types/jest": "^29.5.1",
     "@types/node": "^20.11.30",
+    "fake-indexeddb": "^6.2.5",
     "jest": "^29.5.0",
     "prettier": "^2.8.8",
     "ts-jest": "^29.1.0",
51 changes: 35 additions & 16 deletions src/db.ts
@@ -1,10 +1,13 @@
 import { HNSW } from './main';
 import { openDB, deleteDB, DBSchema, IDBPDatabase } from 'idb';
+import { cosineSimilarity, euclideanSimilarity } from './similarity';
+
+type SerializedIndex = ReturnType<HNSW['toJSON']>;

 interface HNSWDB extends DBSchema {
   'hnsw-index': {
     key: string;
-    value: any;
+    value: SerializedIndex;
   };
 }

@@ -17,6 +20,9 @@ export class HNSWWithDB extends HNSW {
     this.dbName = dbName;
   }

+  /**
+   * Creates an IndexedDB-backed HNSW instance.
+   */
   static async create(M: number, efConstruction: number, dbName: string, efSearch = 50) {
     const instance = new HNSWWithDB(M, efConstruction, dbName, efSearch);
     await instance.initDB();
@@ -31,25 +37,39 @@ export class HNSWWithDB extends HNSW {
     });
   }

-  async saveIndex() {
+  /**
+   * Closes the current IndexedDB connection if open.
+   */
+  close() {
     if (!this.db) {
-      // console.error('Database is not initialized');
       return;
     }
+    this.db.close();
+    this.db = null;
+  }
+
+  /**
+   * Persists the current graph to IndexedDB.
+   */
+  async saveIndex() {
+    if (!this.db) {
+      throw new Error('Database is not initialized');
+    }

     await this.db.put('hnsw-index', this.toJSON(), 'hnsw');
   }

+  /**
+   * Loads a persisted graph from IndexedDB if present.
+   */
   async loadIndex() {
     if (!this.db) {
-      // console.error('Database is not initialized');
-      return;
+      throw new Error('Database is not initialized');
     }

-    const loadedHNSW: HNSW | undefined = await this.db.get('hnsw-index', 'hnsw');
+    const loadedHNSW = await this.db.get('hnsw-index', 'hnsw');

     if (!loadedHNSW) {
-      // console.error('No saved HNSW index found');
       return;
     }

@@ -60,23 +80,22 @@ export class HNSWWithDB extends HNSW {
     this.efSearch = hnsw.efSearch;
     this.metric = hnsw.metric;
     this.d = hnsw.d;
-    this.similarityFunction = (this as any).getMetric(hnsw.metric);
+    this.similarityFunction = hnsw.metric === 'cosine' ? cosineSimilarity : euclideanSimilarity;
     this.levelMax = hnsw.levelMax;
     this.entryPointId = hnsw.entryPointId;
     this.nodes = hnsw.nodes;
   }

+  /**
+   * Deletes persisted graph data and re-initializes the backing DB.
+   */
   async deleteIndex() {
     if (!this.db) {
-      // console.error('Database is not initialized');
-      return;
+      throw new Error('Database is not initialized');
     }

-    try {
-      await deleteDB(this.dbName);
-      this.initDB();
-    } catch (error) {
-      // console.error('Failed to delete index:', error);
-    }
+    this.close();
+    await deleteDB(this.dbName);
+    await this.initDB();
   }
 }
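
The `deleteIndex()` change hinges on awaiting the re-initialization before returning. The sketch below is a simplified model of that ordering problem, not the package's code: `FakeStore` is a hypothetical in-memory stand-in for the IndexedDB handle, and `setTimeout` stands in for the asynchronous `openDB()` call.

```typescript
// Hypothetical stand-in for an asynchronously initialized store; no IndexedDB involved.
class FakeStore {
  private handle: Map<string, string> | null = null;

  async init(): Promise<void> {
    // Simulate an asynchronous open that resolves on a later tick.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
    this.handle = new Map();
  }

  close(): void {
    this.handle = null;
  }

  save(key: string, value: string): void {
    if (!this.handle) {
      throw new Error("Database is not initialized");
    }
    this.handle.set(key, value);
  }

  // Unawaited shape: re-initialization is started but the caller does not wait for it.
  async resetWithoutAwait(): Promise<void> {
    this.close();
    void this.init();
  }

  // Awaited shape: the caller resumes only once the store is usable again.
  async resetWithAwait(): Promise<void> {
    this.close();
    await this.init();
  }
}

async function demo(): Promise<{ unawaited: string; awaited: string }> {
  const first = new FakeStore();
  await first.init();
  await first.resetWithoutAwait();
  let unawaited = "ok";
  try {
    first.save("hnsw", "{}"); // runs before init() has finished
  } catch (error) {
    unawaited = (error as Error).message;
  }

  const second = new FakeStore();
  await second.init();
  await second.resetWithAwait();
  second.save("hnsw", "{}"); // safe: init() completed before we got here
  return { unawaited, awaited: "ok" };
}
```

In the unawaited shape, a save issued right after the reset races the pending open and fails, which is the class of bug the awaited version rules out.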