[RAG v2] Phase 1.2: LLM Result Caching #36

@mikewaters

Description

Feature: LLM Result Caching

Phase: 1 - Infrastructure & Core Pipeline
Module: catalog.store.llm_cache
Estimated Effort: Small (2-3 hours)

Business Value

Eliminates redundant LLM calls for repeated queries, reducing latency by 10-100x for cached results and cutting API costs. Essential for interactive search where users refine queries.

Technical Specification

New SQLAlchemy model and cache class:

from datetime import datetime, timezone

from sqlalchemy import Column, DateTime, String, Text
from sqlalchemy.orm import Session

class LLMCacheEntry(Base):
    __tablename__ = "llm_cache_v2"
    cache_key = Column(String, primary_key=True)   # SHA-256 hex digest
    cache_type = Column(String, nullable=False)    # 'expansion' | 'rerank'
    result_json = Column(Text, nullable=False)
    # timezone-aware default (datetime.utcnow is deprecated in Python 3.12+)
    created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))

class LLMCache:
    def __init__(self, session: Session, ttl_hours: int = 168): ...  # default TTL: 7 days
    def get_expansion(self, query: str, model: str) -> dict | None: ...
    def set_expansion(self, query: str, model: str, result: dict) -> None: ...
    def get_rerank(self, query: str, doc_hash: str, model: str) -> float | None: ...
    def set_rerank(self, query: str, doc_hash: str, model: str, score: float) -> None: ...
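
The interface above leaves the cache-key scheme implicit. One way to satisfy the SHA-256 collision-resistance criterion is to hash a canonical JSON encoding of the cache type, model, and inputs. A minimal sketch; the `make_cache_key` helper name is an illustration, not part of the spec:

```python
import hashlib
import json

def make_cache_key(cache_type: str, model: str, **inputs: str) -> str:
    """Derive a deterministic SHA-256 cache key (hypothetical helper).

    Canonical JSON (sorted keys, fixed separators) ensures logically
    identical calls hash to the same key, while any change to the type,
    model, or inputs yields a different key.
    """
    payload = json.dumps(
        {"type": cache_type, "model": model, "inputs": inputs},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Expansion entries key on (query, model); rerank adds the document hash.
k1 = make_cache_key("expansion", "gpt-4o-mini", query="vector search")
k2 = make_cache_key("rerank", "gpt-4o-mini", query="vector search", doc_hash="abc123")
```

Because the key encodes the model name, switching models never serves stale results from a different model.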

Acceptance Criteria

  • LLMCacheEntry SQLAlchemy model
  • LLMCache class with get/set methods for expansion and rerank
  • TTL-based expiration
  • Cache key collision resistance (SHA-256)
  • Unit tests for cache operations
  • Integration test with database
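
TTL-based expiration can be enforced at read time by comparing `created_at` against the configured window. A pure-`datetime` sketch of the check the `get_*` methods would apply; the `is_expired` helper name is illustrative:

```python
from datetime import datetime, timedelta, timezone

def is_expired(created_at: datetime, ttl_hours: int = 168) -> bool:
    """Return True if a cache entry is older than the TTL window.

    Illustrative helper: the DB-backed get_* methods would treat an
    expired row as a cache miss (and could delete it lazily).
    """
    return datetime.now(timezone.utc) - created_at > timedelta(hours=ttl_hours)

fresh = datetime.now(timezone.utc) - timedelta(hours=1)
stale = datetime.now(timezone.utc) - timedelta(hours=200)
```

Checking lazily on read avoids a background sweeper; expired rows cost storage but never surface as hits.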

Dependencies

  • catalog.store.database (existing)

Reference

See docs/features/rag-v2/phase-1-infrastructure.md for full specification.
