Objective
Develop a flexible, configurable Retrieval Augmented Generation (RAG) pipeline that can be easily customized for different projects through a JSON or YAML configuration file.
Description
We need to refactor our current RAG system into a more generalized, configuration-driven pipeline. This will allow users to easily set up and customize RAG chatbots for different knowledge bases and use cases without modifying the core code.
Configuration Structure
The configuration file (in JSON or YAML) should include the following fields:
project_id: str # Unique identifier for the project
endpoint_slug: str # Unique slug for the chat endpoint
system_prompt: str # Prompt defining chatbot behavior
knowledge_sources: # List of document sources
- type: str # e.g., 'file', 'url'
path: str # File path or URL
metadata:
label: str
tags: list[str]
keep_chat_memory: bool
collection_name: str # Vector DB collection name
embedding_model: str # Embedding model identifier
Tasks
-
Configuration Management:
- Implement a configuration parser for either JSON or YAML format
- Create a validation script to check configuration validity
- Develop a system to store and retrieve configurations (e.g., SQLite with ORM)
-
Document Ingestion Pipeline:
- Create a script to ingest documents from various sources (local files, URLs)
- Implement document parsing for different file types (PDF, TXT, HTML, etc.)
- use existing libraries where possible to avoid complexities of document parsing
- Develop a vectorization process using the specified embedding model
- Implement upsert functionality to avoid duplicates in the vector store
-
Vector Store Integration:
- Enhance vector store integration to support multiple collections
- Implement metadata storage for embedding model information
- Create a migration system for updating existing collections when configurations change
-
Chatbot Endpoint Generation:
- Develop a dynamic endpoint generation system based on
endpoint_slug
- Implement configuration-based chat processing (system prompt, chat memory, etc.)
-
Runtime Processing:
- Implement dynamic retrieval based on the configured collection and embedding model
- Develop a flexible message construction system based on configuration
-
ORM and Database Management:
- Set up an ORM (e.g., SQLAlchemy) for configuration and metadata storage
- Implement a migration system (e.g., Alembic) for database schema changes
-
Configuration Change Management:
- Develop a system to detect configuration changes (possible just re-running the validation script(s))
- Implement processes to apply changes (e.g., re-vectorizing documents, creating new collections)
-
Admin Interface (optional and probably a separate issue):
- Create a basic admin interface for managing configurations
- Implement functionality to view and edit configurations
- Add features to monitor the status of document ingestion and vectorization
Technical Considerations
- Use a robust ORM like SQLAlchemy for database interactions
- Implement async operations where possible for better performance
- Ensure proper error handling and logging throughout the pipeline
- Consider using a task queue (e.g., Celery) for long-running operations like document ingestion
- Implement proper security measures, especially for the admin interface
Acceptance Criteria
- Users can create and modify RAG chatbots through configuration files
- The system correctly ingests and vectorizes documents from various sources
- Chatbot endpoints are dynamically created based on configuration
- Configuration changes are detected and applied correctly
- The admin interface provides a clear overview of all configured chatbots and their statuses
- The system handles errors gracefully and provides clear feedback
- Performance remains acceptable even with multiple configured chatbots
Additional Notes
- Consider implementing a versioning system for configurations to allow rollbacks
- Plan for future extensibility, such as supporting multiple LLM providers or vector stores
- Ensure thorough documentation of the configuration format and system capabilities
- Consider developing a simple web interface for non-technical users to create and manage configurations
Future Enhancements
- Support for real-time document updates and incremental vectorization
- Integration with popular document management systems or cloud storage providers
- Advanced analytics and usage tracking for each configured chatbot
- A/B testing capabilities for different configurations
Objective
Develop a flexible, configurable Retrieval Augmented Generation (RAG) pipeline that can be easily customized for different projects through a JSON or YAML configuration file.
Description
We need to refactor our current RAG system into a more generalized, configuration-driven pipeline. This will allow users to easily set up and customize RAG chatbots for different knowledge bases and use cases without modifying the core code.
Configuration Structure
The configuration file (in JSON or YAML) should include the following fields:
Tasks
Configuration Management:
Document Ingestion Pipeline:
Vector Store Integration:
Chatbot Endpoint Generation:
endpoint_slugRuntime Processing:
ORM and Database Management:
Configuration Change Management:
Admin Interface (optional and probably a separate issue):
Technical Considerations
Acceptance Criteria
Additional Notes
Future Enhancements