
feat(embeddings): add native Ollama provider for local embeddings #73

Draft
rothnic wants to merge 4 commits into PatrickSys:master from rothnic:feature/openai-base-url-support

Conversation


@rothnic rothnic commented Mar 11, 2026

Summary

This PR adds native Ollama support for codebase-context, enabling privacy-first local or self-hosted embedding generation as an alternative to OpenAI cloud embeddings. It addresses Issue #70 for custom OpenAI-compatible API endpoints.

What Changed

New Features

  • Native Ollama Provider (src/embeddings/ollama.ts): Full integration with Ollama's /api/embeddings endpoint
  • Multi-model support: nomic-embed-text, embeddinggemma, mxbai-embed-large, all-minilm
  • Custom model support: EMBEDDING_DIMENSIONS env var for models not in lookup tables
  • Environment variables:
    • EMBEDDING_PROVIDER=ollama
    • OLLAMA_HOST=http://localhost:11434 (or remote server)
    • EMBEDDING_MODEL=nomic-embed-text
    • EMBEDDING_DIMENSIONS=768 (optional override)
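
As a rough illustration of how these variables might combine, here is a sketch of env-var resolution; the EmbeddingConfig shape below is an assumption for illustration, not the package's actual interface:

```typescript
// Hypothetical sketch: mapping the documented env vars onto a config object.
// The EmbeddingConfig interface is assumed, not the package's real type.
interface EmbeddingConfig {
  provider: string;
  model: string;
  apiEndpoint?: string;
  dimensions?: number;
}

function configFromEnv(env: Record<string, string | undefined>): EmbeddingConfig {
  const provider = env.EMBEDDING_PROVIDER ?? 'transformers';
  return {
    provider,
    model: env.EMBEDDING_MODEL ?? 'nomic-embed-text',
    // The Ollama provider reads OLLAMA_HOST; OpenAI-compatible providers
    // would read OPENAI_BASE_URL instead
    apiEndpoint: provider === 'ollama'
      ? (env.OLLAMA_HOST ?? 'http://localhost:11434')
      : env.OPENAI_BASE_URL,
    // Optional override for models missing from the lookup tables
    dimensions: env.EMBEDDING_DIMENSIONS
      ? Number(env.EMBEDDING_DIMENSIONS)
      : undefined,
  };
}
```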

Bug Fixes

  • Fixed eager transformers loading: Removed static re-export that caused hangs
  • Fixed OLLAMA_HOST bypass: Properly respects env var in programmatic usage

Configuration Examples

# Local Ollama
EMBEDDING_PROVIDER=ollama EMBEDDING_MODEL=nomic-embed-text npx codebase-context reindex

# Remote Ollama server
EMBEDDING_PROVIDER=ollama OLLAMA_HOST=http://server:11434 EMBEDDING_MODEL=nomic-embed-text npx codebase-context reindex

# Custom model with explicit dimensions
EMBEDDING_PROVIDER=ollama EMBEDDING_MODEL=my-model EMBEDDING_DIMENSIONS=1024 npx codebase-context reindex

Files Changed

  • src/embeddings/ollama.ts (new)
  • src/embeddings/index.ts (lazy loading fix, dimension lookup)
  • src/embeddings/types.ts (OLLAMA_HOST support)
  • README.md (documentation)
  • CHANGELOG.md (feature entry)

Testing

All tests pass. Provider tested with nomic-embed-text and embeddinggemma models on remote Ollama server.

Closes #70

rothnic added 2 commits March 11, 2026 10:16
Add support for custom OpenAI-compatible API endpoints via OPENAI_BASE_URL
environment variable. This enables using:
- Ollama for local LLM inference
- LiteLLM Proxy for unified model access
- Groq, OpenRouter, and other OpenAI-compatible providers
- Self-hosted models (vLLM, text-generation-inference)

Changes:
- Read OPENAI_BASE_URL from environment in DEFAULT_EMBEDDING_CONFIG
- Update README.md with configuration documentation
- Update CHANGELOG.md with feature entry

Fixes PatrickSys#70
Add full support for Ollama as an embedding provider, enabling local
embeddings without cloud dependencies.

New Features:
- New OllamaEmbeddingProvider class (src/embeddings/ollama.ts)
- EMBEDDING_PROVIDER=ollama option
- OLLAMA_HOST environment variable (default: http://localhost:11434)
- Automatic dimension detection for common Ollama models:
  - nomic-embed-text: 768 dimensions (default)
  - mxbai-embed-large: 1024 dimensions
  - all-minilm: 384 dimensions
- Also adds OPENAI_BASE_URL for custom OpenAI-compatible endpoints

Files Changed:
- src/embeddings/ollama.ts: New Ollama provider implementation
- src/embeddings/index.ts: Add Ollama provider integration
- src/embeddings/types.ts: Add OLLAMA_HOST support, dynamic apiEndpoint
- README.md: Document Ollama configuration options
- CHANGELOG.md: Update with feature details

Tested with nomic-embed-text generating 768-dimensional embeddings.

Closes PatrickSys#70
Related to PatrickSys#68

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR adds a native Ollama embedding provider to codebase-context, enabling privacy-first local or self-hosted embedding generation as an alternative to OpenAI cloud embeddings. It also fixes a module-hang bug caused by eager loading of the heavy transformers.js module, and extends the EmbeddingConfig with OLLAMA_HOST and OPENAI_BASE_URL support.

Key changes and issues:

  • New src/embeddings/ollama.ts: Clean implementation of the EmbeddingProvider interface using Ollama's /api/embeddings endpoint, with sequential batch processing and text truncation to respect model context windows.
  • src/embeddings/index.ts — lazy loading partially broken: The eager-loading fix removes the old export * from './transformers.js' but then re-adds export { TransformersEmbeddingProvider, MODEL_CONFIGS } from './transformers.js' at line 98. Static re-exports are resolved at module load time in ES modules, so this line will still cause transformers.js to be loaded eagerly whenever index.ts is imported, defeating the stated fix.
  • src/embeddings/index.ts — OLLAMA_HOST bypassed programmatically: The apiEndpoint getter on DEFAULT_EMBEDDING_CONFIG is evaluated during object spread, using the default provider (usually 'transformers'). If getEmbeddingProvider({ provider: 'ollama' }) is called programmatically while EMBEDDING_PROVIDER is not set to 'ollama', OLLAMA_HOST is silently ignored and the hardcoded fallback 'http://localhost:11434' is used instead.
  • embeddinggemma missing from model maps: Despite being the primary tested model, embeddinggemma is absent from the dimension and context-window lookup tables in both ollama.ts and index.ts. It currently works by falling through to the 768-dimension default, but this is fragile.
  • OLLAMA_TEST_RESULTS.md appears to be a developer scratch file; consider whether it belongs in the repository long-term.

Confidence Score: 2/5

  • Not safe to merge as-is — the lazy loading fix is broken by a static re-export, and OLLAMA_HOST can be silently ignored in programmatic usage.
  • Two logic-level issues need resolution: (1) the re-added static re-export at index.ts line 98 effectively reverts the core bug fix this PR claims to make, and (2) the OLLAMA_HOST env var is bypassed when the provider is specified programmatically. These are not edge-case concerns — the first affects every consumer of the package when using the Ollama provider, and the second affects any programmatic API user. The provider implementation itself (ollama.ts) is clean and well-structured, which keeps the score above 1.
  • Pay close attention to src/embeddings/index.ts — specifically line 98 (static re-export defeating lazy loading) and lines 74-84 (OLLAMA_HOST fallback logic).

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/embeddings/ollama.ts | New Ollama embedding provider with sequential batch processing, text truncation, and model dimension detection. Missing embeddinggemma from model lookup tables despite being the primary tested model. |
| src/embeddings/index.ts | Lazy loading fix is undermined by the static re-export of TransformersEmbeddingProvider/MODEL_CONFIGS at line 98; also, OLLAMA_HOST env var is bypassed when provider is passed programmatically without EMBEDDING_PROVIDER being set. |
| src/embeddings/types.ts | Adds OLLAMA_HOST and OPENAI_BASE_URL support via a getter on DEFAULT_EMBEDDING_CONFIG; getter-based approach has a subtle spread-evaluation pitfall that contributes to the OLLAMA_HOST bypass issue. |
| README.md | Documentation updated with new EMBEDDING_PROVIDER options, OPENAI_BASE_URL, and OLLAMA_HOST env vars — accurate and complete. |
| CHANGELOG.md | Standard changelog entry for the new Ollama provider feature under [Unreleased]. |
| OLLAMA_TEST_RESULTS.md | New test results document; useful internal reference but likely not intended for permanent inclusion in the repository. |

Sequence Diagram

sequenceDiagram
    participant Caller
    participant index.ts
    participant types.ts
    participant ollama.ts
    participant OllamaServer

    Caller->>index.ts: getEmbeddingProvider({ provider: 'ollama' })
    index.ts->>types.ts: spread DEFAULT_EMBEDDING_CONFIG (getter evaluates apiEndpoint)
    types.ts-->>index.ts: apiEndpoint (may be undefined if EMBEDDING_PROVIDER≠ollama)
    index.ts->>ollama.ts: dynamic import OllamaEmbeddingProvider
    index.ts->>ollama.ts: new OllamaEmbeddingProvider(model, apiEndpoint || 'localhost:11434')
    ollama.ts-->>index.ts: provider instance
    index.ts->>ollama.ts: provider.initialize()
    ollama.ts-->>index.ts: ready

    Caller->>index.ts: provider.embedBatch(texts)
    loop for each text (sequential)
        index.ts->>ollama.ts: truncateText(text)
        ollama.ts->>OllamaServer: POST /api/embeddings { model, prompt }
        OllamaServer-->>ollama.ts: { embedding: number[] }
        ollama.ts-->>index.ts: embeddings.push(embedding)
    end
    index.ts-->>Caller: number[][]

Last reviewed commit: 0986581


// Re-export TransformersEmbeddingProvider and MODEL_CONFIGS for consumers who need them
// These will trigger transformers loading, but only when explicitly imported
export { TransformersEmbeddingProvider, MODEL_CONFIGS } from './transformers.js';

Static re-export defeats the lazy loading fix

The stated purpose of this PR's index.ts change is to prevent transformers.js from being eagerly loaded when using non-transformers providers. The old export * from './transformers.js' was removed for exactly this reason. However, line 98 introduces the same problem via a named re-export:

export { TransformersEmbeddingProvider, MODEL_CONFIGS } from './transformers.js';

In ES modules, re-exports are resolved statically at module load time — not lazily. Any code that does import { ... } from './index.js' (which includes virtually everything that uses this package) will trigger transformers.js to be loaded immediately, regardless of whether getEmbeddingProvider is ever called with a transformers config. This means the hang observed when using the Ollama provider will still occur for any consumer that imports from this index module.

The getEmbeddingProvider function already correctly uses await import('./transformers.js') for lazy loading. If TransformersEmbeddingProvider and MODEL_CONFIGS must remain re-exported for downstream consumers, those consumers should import directly from './transformers.js' rather than via the index, or this line should be removed and documented as a breaking change for consumers relying on the re-export.
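A minimal sketch of the lazy pattern the review recommends (names assumed, not the PR's actual code): expose an async factory instead of a static re-export, so the heavy module is only evaluated on first use. Here a boolean stands in for the side effect of evaluating transformers.js:

```typescript
// Sketch: lazy factory vs. static re-export. In real code the factory body
// would be `return await import('./transformers.js')`; here a flag stands in
// for the module-evaluation side effect so the lazy behavior is observable.
let loaded = false;

async function loadTransformers() {
  loaded = true; // real code: return await import('./transformers.js');
  return { TransformersEmbeddingProvider: class {} };
}

// Consumers call this instead of importing the symbol from index.ts.
// Merely importing the module containing this function does NOT set `loaded`;
// a static `export { X } from './transformers.js'` would.
async function getTransformersExports() {
  return loadTransformers();
}
```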

Comment on lines 74 to 84
if (mergedConfig.provider === 'ollama') {
  console.warn('Ollama provider not yet implemented, falling back to Transformers.js');
  const { OllamaEmbeddingProvider } = await import('./ollama.js');
  const provider = new OllamaEmbeddingProvider(
    mergedConfig.model || 'nomic-embed-text',
    mergedConfig.apiEndpoint || 'http://localhost:11434'
  );
  await provider.initialize();
  cachedProvider = provider;
  cachedProviderType = providerKey;
  return provider;
}

OLLAMA_HOST ignored when provider is passed programmatically

DEFAULT_EMBEDDING_CONFIG defines apiEndpoint as a getter that calls getDefaultApiEndpoint(this.provider). When getEmbeddingProvider spreads this config:

const mergedConfig = { ...DEFAULT_EMBEDDING_CONFIG, ...config };

The spread operator evaluates the getter at spread time, with this bound to DEFAULT_EMBEDDING_CONFIG. So this.provider equals DEFAULT_EMBEDDING_CONFIG.provider, which is derived from process.env.EMBEDDING_PROVIDER ?? 'transformers'.

If EMBEDDING_PROVIDER is not set (defaults to 'transformers'), the getter returns undefined for apiEndpoint. This means calling getEmbeddingProvider({ provider: 'ollama' }) programmatically will always fall back to the hardcoded 'http://localhost:11434' on line 78, silently ignoring the OLLAMA_HOST environment variable.

A straightforward fix is to read OLLAMA_HOST directly within the provider branch:

const endpoint =
  mergedConfig.apiEndpoint ||
  process.env.OLLAMA_HOST ||
  'http://localhost:11434';
const provider = new OllamaEmbeddingProvider(
  mergedConfig.model || 'nomic-embed-text',
  endpoint
);
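The fallback chain above can be isolated into a pure helper, which makes the precedence (explicit config, then OLLAMA_HOST, then the localhost default) easy to unit-test. The helper name is assumed for illustration:

```typescript
// Hypothetical helper capturing the suggested fallback chain:
// explicit config endpoint > OLLAMA_HOST env var > localhost default.
// Taking `env` as a parameter keeps the function pure and testable.
function resolveOllamaEndpoint(
  configEndpoint: string | undefined,
  env: Record<string, string | undefined>
): string {
  return configEndpoint || env.OLLAMA_HOST || 'http://localhost:11434';
}
```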

Comment on lines +8 to +45
const MODEL_CONTEXT_WINDOWS: Record<string, number> = {
  'nomic-embed-text': 2048,
  'nomic-embed-text:latest': 2048,
  'mxbai-embed-large': 512,
  'mxbai-embed-large:latest': 512,
  'all-minilm': 512,
  'all-minilm:latest': 512
};

// Conservative character limit (approx 2 chars per token for code)
// Code has more tokens per character due to punctuation and symbols
function getMaxChars(modelName: string): number {
  const tokens = MODEL_CONTEXT_WINDOWS[modelName] || 2048;
  return tokens * 2; // Very conservative: 2 chars per token
}

/**
 * Ollama Embedding Provider
 * Supports local embedding models via Ollama API.
 * API endpoint: POST /api/embeddings
 */
export class OllamaEmbeddingProvider implements EmbeddingProvider {
  readonly name = 'ollama';
  private maxChars: number;

  // Default dimensions for nomic-embed-text (768)
  // Override via EMBEDDING_MODEL env var for other models
  get dimensions(): number {
    // Common Ollama embedding model dimensions
    const modelDimensions: Record<string, number> = {
      'nomic-embed-text': 768,
      'nomic-embed-text:latest': 768,
      'mxbai-embed-large': 1024,
      'mxbai-embed-large:latest': 1024,
      'all-minilm': 384,
      'all-minilm:latest': 384
    };
    return modelDimensions[this.modelName] || 768;

embeddinggemma missing from model lookup tables

The PR description and OLLAMA_TEST_RESULTS.md both highlight embeddinggemma as a first-class supported and tested model. However, it is absent from both MODEL_CONTEXT_WINDOWS and the modelDimensions map in the dimensions getter. Unknown models silently fall back to 768 dimensions and 2048 token context. If embeddinggemma's actual values differ from these defaults in a future Ollama version, users will get silent LanceDB schema mismatches during re-indexing.

The same gap exists in getConfiguredDimensions in index.ts (line 34–42). Consider adding an explicit entry:

const MODEL_CONTEXT_WINDOWS: Record<string, number> = {
  'nomic-embed-text': 2048,
  'nomic-embed-text:latest': 2048,
  'embeddinggemma': 2048,       // add
  'embeddinggemma:latest': 2048, // add
  'mxbai-embed-large': 512,
  ...
};
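Combining that suggestion with the PR's conservative ~2 chars/token heuristic, truncation might look like the sketch below. The embeddinggemma window of 2048 tokens is an assumption that should be verified against the model card:

```typescript
// Sketch of context-window-aware truncation (names assumed).
// The embeddinggemma entries are assumptions to verify against the model card.
const MODEL_CONTEXT_WINDOWS: Record<string, number> = {
  'nomic-embed-text': 2048,
  'nomic-embed-text:latest': 2048,
  'embeddinggemma': 2048,
  'embeddinggemma:latest': 2048,
  'mxbai-embed-large': 512,
  'mxbai-embed-large:latest': 512,
  'all-minilm': 512,
  'all-minilm:latest': 512,
};

function truncateForModel(text: string, model: string): string {
  const tokens = MODEL_CONTEXT_WINDOWS[model] ?? 2048; // default window
  const maxChars = tokens * 2; // conservative ~2 chars per token for code
  return text.length > maxChars ? text.slice(0, maxChars) : text;
}
```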

@rothnic rothnic marked this pull request as draft March 11, 2026 22:34

rothnic commented Mar 11, 2026

Going to clean this up a bit still and remove the testing doc, etc.

…provider

- Add context window-aware text truncation to prevent API errors
- Implement conservative 2 chars/token ratio for code truncation
- Fix eager transformers loading that caused hangs with Ollama
- Move MODEL_CONFIGS inline to avoid importing heavy transformers module
- Add support for model-specific context windows (nomic-embed-text, mxbai, etc.)
@rothnic rothnic force-pushed the feature/openai-base-url-support branch from 0986581 to 170758f on March 11, 2026 at 22:42

rothnic commented Mar 11, 2026

Test Results - Ollama server with P40 video card

Tested both embedding models on the same project (60 files, 188 chunks):

Performance Comparison

| Model | Indexing Time | Throughput | Notes |
| --- | --- | --- | --- |
| nomic-embed-text | 19 seconds | ~9.8 chunks/sec | Fast, optimized for embeddings |
| embeddinggemma | 199 seconds | ~0.9 chunks/sec | 10x slower, general-purpose |

Search Quality Examples

nomic-embed-text results:

  • "scrape website" → Found Firecrawl scraping components (confidence: 0.75)
  • "fetch data" → Found API testing code (confidence: 0.99)
  • "error handling" → Found try/catch blocks (confidence: 1.00)
  • "authentication" → Found auth components (confidence: 0.98)

Both models produce good results, but nomic-embed-text is significantly faster on the same hardware. This aligns with its design as a dedicated embedding model vs embeddinggemma's general-purpose architecture.

Configuration Used

EMBEDDING_PROVIDER=ollama
OLLAMA_HOST=http://<ollama host>:11434
EMBEDDING_MODEL=nomic-embed-text  # or embeddinggemma

The OLLAMA_HOST fix from the code review is working correctly - the environment variable is properly respected when set.

@rothnic rothnic force-pushed the feature/openai-base-url-support branch from 170758f to 8ecc514 on March 11, 2026 at 22:54

rothnic commented Mar 11, 2026

Update: EMBEDDING_DIMENSIONS Support

Added support for EMBEDDING_DIMENSIONS environment variable to handle custom models not in the hardcoded lookup tables.

Usage

# Use a custom model with explicit dimensions
EMBEDDING_PROVIDER=ollama \
EMBEDDING_MODEL=my-custom-model \
EMBEDDING_DIMENSIONS=1024 \
npx codebase-context reindex

This addresses the review feedback about unknown models falling back to 768 dimensions silently. Now users can:

  1. Use models not explicitly listed in the code
  2. Override dimensions if Ollama updates a model's output size
  3. Prevent LanceDB schema mismatches during re-indexing

The env var is checked in both:

  • OllamaEmbeddingProvider.dimensions getter (for the provider instance)
  • getConfiguredDimensions() (for LanceDB validation)

Both locations check process.env.EMBEDDING_DIMENSIONS first before falling back to lookup tables or defaults.
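That resolution order (env override first, then lookup table, then the 768 default) can be sketched as a small pure helper. The function name is assumed; the dimension values come from the PR's lookup tables:

```typescript
// Sketch of the EMBEDDING_DIMENSIONS resolution order (helper name assumed):
// 1. explicit EMBEDDING_DIMENSIONS env override, if it parses to a positive int
// 2. the known-model lookup table
// 3. the 768 default (nomic-embed-text)
const MODEL_DIMENSIONS: Record<string, number> = {
  'nomic-embed-text': 768,
  'mxbai-embed-large': 1024,
  'all-minilm': 384,
};

function resolveDimensions(
  model: string,
  env: Record<string, string | undefined>
): number {
  const override = env.EMBEDDING_DIMENSIONS
    ? parseInt(env.EMBEDDING_DIMENSIONS, 10)
    : NaN;
  if (!Number.isNaN(override) && override > 0) return override;
  return MODEL_DIMENSIONS[model] ?? 768;
}
```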


Development

Successfully merging this pull request may close these issues.

Feature Request: Support custom OpenAI-compatible API endpoints (OPENAI_BASE_URL)