embeddings-eval

A Node.js application for evaluating content embeddings using Vectra, with OpenAI's text-embedding-3-small as the default model.

Overview

This application supports multiple datasets with different embedding models:

default: Default project with 10 sample items
courses-de: German course content (10 items)
intranet: Intranet content (200 items)
Any dataset with valid content.json and eval.json files

The application:

Reads content from {dataset}/content.json (array of objects with title and description properties)
Uses Vectra with configurable embedding models (OpenAI, Google AI, SiliconFlow) to compute embeddings for title + " " + description
Loads evaluation data from {dataset}/eval.json (array of objects with search property)
Stores embeddings in {dataset}/embeddings/ folder with standard index structure
Searches for each search term and returns the top 3 most relevant results
Filters results based on minSimilarity threshold from model configuration
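The search and filter steps above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the application delegates indexing and similarity search to Vectra, while cosineSimilarity and topMatches below are hypothetical helpers, and the two-dimensional vectors are made up for the example.

```javascript
// Hypothetical helpers for illustration; the app itself relies on Vectra for this.

// Cosine similarity between two equal-length, non-zero numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every item against the query, keep scores >= minSimilarity,
// sort best-first, and return at most topK matches.
function topMatches(queryVector, items, minSimilarity, topK = 3) {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(queryVector, item.vector) }))
    .filter((match) => match.score >= minSimilarity)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Made-up vectors; real embeddings have hundreds or thousands of dimensions.
const items = [
  { id: 1, vector: [1, 0] },
  { id: 2, vector: [0, 1] },
  { id: 3, vector: [1, 1] },
];
console.log(topMatches([1, 0], items, 0.4));
```

With a threshold of 0.4, item 2 (similarity 0) is dropped and items 1 and 3 are returned in descending score order.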

Model Configuration

Each model file contains embedding model settings and search behavior configuration:

default-model.json: OpenAI text-embedding-3-small model (minSimilarity: 0.4)
oa3large-model.json: OpenAI text-embedding-3-large model (minSimilarity: 0.5)
google-model.json: Google text-embedding-004 model with task types (minSimilarity: 0.4)
sf-model.json: SiliconFlow BAAI/bge-large-en-v1.5 model (minSimilarity: 0.4)

The minSimilarity threshold filters search results so that only matches with a similarity score >= the threshold value are returned. The threshold is configured directly in each model file.

For Google models, additional parameters are supported:

generate_task_type: Task type for content embedding generation (e.g., "RETRIEVAL_DOCUMENT")
query_task_type: Task type for search query embedding (e.g., "RETRIEVAL_QUERY")
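As a rough sketch of how these two parameters might be applied, the helper below picks the task type for the current operation. Only minSimilarity, generate_task_type, and query_task_type come from this README; the function name taskTypeFor and the rest of the config shape are assumptions for illustration.

```javascript
// Hypothetical helper: choose the task type for the current operation.
// The config object below contains only fields named in this README;
// a real model file will contain more settings.
function taskTypeFor(modelConfig, operation) {
  const key = operation === 'generate' ? 'generate_task_type' : 'query_task_type';
  // Models that define no task types yield undefined.
  return modelConfig[key];
}

const googleConfig = {
  minSimilarity: 0.4,
  generate_task_type: 'RETRIEVAL_DOCUMENT',
  query_task_type: 'RETRIEVAL_QUERY',
};

console.log(taskTypeFor(googleConfig, 'generate'));
console.log(taskTypeFor(googleConfig, 'query'));
```

Content embeddings would use the first value and search queries the second, so documents and queries are embedded with the task type intended for each.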

Setup

Install dependencies:

npm install

Create a .env file with your API key(s):

# For OpenAI models
OPENAI_API_KEY=your_openai_api_key_here

# For Google AI models  
GOOGLE_API_KEY=your_google_ai_api_key_here

# For SiliconFlow models
SF_API_KEY=your_siliconflow_api_key_here
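A quick pre-flight check for the required key could look like the sketch below. The variable names come from the .env example above; the mapping from model name to variable and the helper missingApiKey are assumptions, and the sketch presumes the variables have already been loaded into process.env.

```javascript
// Assumed mapping from model name to the environment variable it needs.
const KEY_BY_MODEL = {
  default: 'OPENAI_API_KEY',
  oa3large: 'OPENAI_API_KEY',
  google: 'GOOGLE_API_KEY',
  sf: 'SF_API_KEY',
};

// Returns the name of the missing variable, or null if nothing is missing.
// Unknown model names are not checked in this sketch.
function missingApiKey(model, env = process.env) {
  const name = KEY_BY_MODEL[model];
  if (!name) return null;
  return env[name] ? null : name;
}

const missing = missingApiKey('google');
if (missing) {
  console.error(`Set ${missing} in your .env file before using this model.`);
}
```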

Usage

Dataset and Model Selection

All commands support the --dataset parameter to specify which dataset to work with and the --model parameter to specify which embedding model to use:

Datasets:

--dataset default (default): Use default content
--dataset courses-de: Use German content
--dataset intranet: Use intranet content
Any folder with valid content.json and eval.json files

Models:

--model default (default): OpenAI text-embedding-3-small model (minSimilarity: 0.4)
--model oa3large: OpenAI text-embedding-3-large model (minSimilarity: 0.5)
--model google: Google text-embedding-004 model with task types (minSimilarity: 0.4)
--model sf: SiliconFlow BAAI/bge-large-en-v1.5 model (minSimilarity: 0.4)

Two-Step Process (Recommended)

Generate embeddings and store vectors:

npm run generate -- --dataset default --model default
# or
npm run generate -- --dataset courses-de --model oa3large
# or  
npm run generate -- --dataset intranet --model google

Run evaluation for search terms:

npm run evaluate -- --dataset default --model default
# or  
npm run evaluate -- --dataset courses-de --model oa3large
# or
npm run evaluate -- --dataset intranet --model google

Run interactive query mode:

npm run query -- --dataset intranet --model voyageai

Alternative Commands

Run the complete pipeline (generate + evaluate):

npm start -- --dataset default --model default
# or
npm start -- --dataset courses-de --model oa3large
# or
npm start -- --dataset intranet --model google

Command Details

npm run generate -- --dataset {name} --model {model}: Creates embeddings for all content in {dataset}/content.json and stores them in {dataset}/embeddings/ folder using the specified model.
npm run evaluate -- --dataset {name} --model {model}: Runs search evaluation using queries from {dataset}/eval.json against the existing vector index for the specified model.
npm run query -- --dataset {name} --model {model}: Interactive query mode that prompts for a search term and displays results in the same format as evaluate mode.
npm start -- --dataset {name} --model {model}: Runs the full pipeline (equivalent to running generate then evaluate)
npm run validate: Validates that all project JSON files exist and shows item counts
npm run validate-project {name}: Validates a specific project

Note: The -- separator is required when passing parameters through npm scripts. Alternatively, you can run the commands directly:

node index.js generate --dataset courses-de --model oa3large
node index.js evaluate --dataset default --model google
node index.js query --dataset intranet --model voyageai
node index.js --dataset default --model default  # full pipeline
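For readers curious how such a command line can be interpreted, here is a minimal sketch of an argument parser with the documented defaults. It is not taken from index.js, which may parse arguments differently; parseCliArgs is a hypothetical name.

```javascript
// Hypothetical parser: an optional command followed by --dataset and --model,
// both of which fall back to 'default' when omitted.
function parseCliArgs(argv) {
  const options = { command: null, dataset: 'default', model: 'default' };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '--dataset' || arg === '--model') {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith('--')) {
        throw new Error(`Missing value for ${arg}`);
      }
      options[arg.slice(2)] = value; // '--dataset' -> 'dataset'
      i++; // skip the value that was just consumed
    } else if (!arg.startsWith('--') && options.command === null) {
      options.command = arg; // e.g. generate, evaluate, query
    }
  }
  return options;
}

// In a real script this would be parseCliArgs(process.argv.slice(2)).
console.log(parseCliArgs(['generate', '--dataset', 'courses-de', '--model', 'oa3large']));
console.log(parseCliArgs([])); // no command: treated here as the full pipeline
```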

Development in GitHub Codespaces

This repository is configured for GitHub Codespaces with automatic setup:

Click "Code" → "Open with Codespaces" → "New codespace"
Dependencies will be installed automatically
Add your OpenAI API key to the codespace secrets as OPENAI_API_KEY
Run the commands above

Project Structure

├── default/             # Default project
│   ├── content.json     # Default sample content
│   └── eval.json        # Default search queries
├── courses-de/          # German project  
│   ├── content.json     # German course content (translated)
│   └── eval.json        # German search queries (translated)
├── index.js             # Main application
└── validate.js          # Validation script

Data Files

{project}/content.json

Contains an array of content objects with id, title, and description:

[
  {
    "id": 1,
    "title": "Introduction to Machine Learning",
    "description": "A comprehensive guide to understanding the fundamentals..."
  }
]
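The text sent to the embedding model is title + " " + description, so each item needs both fields. A minimal sketch of that step with a basic check follows; toEmbeddingText is a hypothetical helper and the validation rule is an assumption, not the project's actual validation logic.

```javascript
// Hypothetical helper: verify the required fields and build the embedding input.
function toEmbeddingText(item) {
  for (const field of ['title', 'description']) {
    if (typeof item[field] !== 'string' || item[field].trim() === '') {
      throw new Error(`Item ${item.id}: missing or empty "${field}"`);
    }
  }
  // Same concatenation the README describes: title + " " + description.
  return item.title + ' ' + item.description;
}

const sample = {
  id: 1,
  title: 'Introduction to Machine Learning',
  description: 'A comprehensive guide',
};
console.log(toEmbeddingText(sample));
```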

{project}/eval.json

Contains an array of search queries with optional expected results:

[
  {
    "search": "python programming",
    "expected": [6]
  }
]
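The expected IDs can be compared against the IDs a search returns. The sketch below assumes a query passes when every expected ID appears among the returned results, which is inferred from the validation message in the example output; validateExpected is a hypothetical helper.

```javascript
// Hypothetical check: which expected IDs are absent from the found IDs?
// Queries without an "expected" list are reported as not checked.
function validateExpected(expected, found) {
  if (!expected || expected.length === 0) {
    return { checked: false, missing: [] };
  }
  const foundSet = new Set(found);
  const missing = expected.filter((id) => !foundSet.has(id));
  return { checked: true, missing };
}

// Values mirror the eval.json example: expected [6], search returned [6, 1, 4].
console.log(validateExpected([6], [6, 1, 4]));
```

An empty missing list with checked set to true corresponds to a fully matching query.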

Output

Generate Command (`npm run generate -- --dataset {name} --model {model}`)

The application will:

Build a dataset-specific vector index from the content in {dataset}/content.json
Generate embeddings using the specified model (e.g., OpenAI's text-embedding-3-small)
Store the vectors in {dataset}/embeddings/ with model-specific catalog files ({model}.json)
Display progress and completion status

Evaluate Command (`npm run evaluate -- --dataset {name} --model {model}`)

The application will:

Load the existing dataset and model-specific vector index from {dataset}/embeddings/
Search for each query in the {dataset}/eval.json file
Display the top 3 results with similarity scores for each query, filtered by the model's minSimilarity threshold
Save detailed results to evaluation-results-{model}.json

Example Output

🚀 Command: Generate embeddings and store vectors for dataset 'courses-de' using model 'oa3large'

Loading content from dataset 'courses-de' and building index...
Processing item 1/10: Einführung in Machine Learning
Processing item 2/10: JavaScript für Anfänger
...

Search: "python programmierung"
Expected: [6]
Found: [6, 1, 4]
Validation: ✅ All 1 expected results match
Top 3 results:
  1. [ID: 6, Score: 0.8542] Data Science mit Python
     Erkunden Sie Datenanalyse, Visualisierung und statistische Modellierung mit Python-Bibliotheken...
