Fix: Add caching for _ZvecClient._load_collection_entries by mikewaters · Pull Request #70 · mikewaters/Substrate

mikewaters · 2026-02-16T00:04:35Z

The _load_collection_entries method read and JSON-parsed the full index file from disk on every invocation. During _PayloadEmbeddingIdentityStrategy.query, get_embedding_identities() triggers one read, and each subsequent per-identity _query_zvec() call triggers another. With K stored identities, this resulted in K+1 full file reads and parses per search operation.

Added mtime-based caching to avoid redundant I/O and JSON parsing. The parsed entries are reused until the index file's modification time changes, at which point the cache is invalidated and the file is re-read, so updates to the file are still picked up.
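For illustration, the caching approach described above might look roughly like the following. This is a minimal sketch, not the code in this PR: the class name `CachedIndexReader`, the `parse_count` counter, the `demo()` helper, and the assumption that the index file is a JSON list are all hypothetical and chosen only to make the example self-contained.

```python
from __future__ import annotations

import json
import os
import tempfile
from pathlib import Path
from typing import Any, Optional


class CachedIndexReader:
    """Hypothetical stand-in for the client; not the actual _ZvecClient."""

    def __init__(self, index_path: Path) -> None:
        self._index_path = index_path
        self._cached_mtime_ns: Optional[int] = None
        self._cached_entries: list[dict[str, Any]] = []
        self.parse_count = 0  # demo-only counter of real disk reads

    def load_collection_entries(self) -> list[dict[str, Any]]:
        # A stat() call is much cheaper than reading and parsing the file.
        mtime_ns = os.stat(self._index_path).st_mtime_ns
        if mtime_ns != self._cached_mtime_ns:
            # First call, or the file changed since the last parse.
            with open(self._index_path, encoding="utf-8") as fh:
                self._cached_entries = json.load(fh)
            self._cached_mtime_ns = mtime_ns
            self.parse_count += 1
        return self._cached_entries


def demo() -> tuple[int, int]:
    """Return the parse counts before and after the file is modified."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "index.json"
        path.write_text(json.dumps([{"id": "a"}]), encoding="utf-8")
        reader = CachedIndexReader(path)

        # Simulate K + 1 lookups within one search (K = 4 here).
        for _ in range(5):
            reader.load_collection_entries()
        parses_before = reader.parse_count

        # Rewrite the file and push its mtime forward explicitly, so the
        # demo does not depend on the filesystem's timestamp resolution.
        path.write_text(json.dumps([{"id": "a"}, {"id": "b"}]), encoding="utf-8")
        st = os.stat(path)
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 10_000_000_000))

        entries = reader.load_collection_entries()
        assert len(entries) == 2  # the updated content is picked up
        return parses_before, reader.parse_count


if __name__ == "__main__":
    print(demo())
```

One caveat with this kind of sketch: a write that lands within the filesystem's timestamp resolution of the previous one may leave the mtime unchanged, in which case the stale entries would be served until the next change.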

mikewaters wants to merge 1 commit into master from cursor/zvec-index-file-caching-3914
