refactor(phase 14): Model Endpoints Registry and Per-Partition Presets#444
refactor(phase 14): Model Endpoints Registry and Per-Partition Presets#444hedhoud wants to merge 20 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds Phase 14 admin APIs for model endpoint registry, pipeline presets, and partition config: new Pydantic schemas, admin-only routers delegating to DI services, optional DI getters with 503 fallback, main app mounts (/model-endpoints, /presets), updated ports/persistence stubs, migration/schema updates, and unit tests. ChangesPhase 14 Admin APIs: Model Endpoints, Presets, and Partition Configuration
Sequence Diagram(s)sequenceDiagram
participant Client
participant ModelEndpointsRouter
participant DIGetter as get_model_endpoint_service
participant ModelEndpointService
Client->>ModelEndpointsRouter: POST /model-endpoints (CreateModelEndpointRequest)
ModelEndpointsRouter->>DIGetter: resolve service via DI
DIGetter->>ModelEndpointService: create_model_endpoint(payload)
ModelEndpointService-->>ModelEndpointsRouter: ModelEndpointResponse
ModelEndpointsRouter-->>Client: 201 Created
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
openrag/api/schemas/admin/preset_schemas.py (1)
52-56: ⚡ Quick winUse field name from validator context for clearer error messages.
Line 56 passes the hardcoded string
"name"to_reject_explicit_null, which works correctly for this validator since it only applies to thenamefield. However, this pattern is brittle—if someone later adds more fields to the validator decorator, the error message will remain hardcoded as"name cannot be null". Usinginfo.field_namemakes the validator more maintainable and consistent with the pattern used elsewhere.♻️ Proposed refactor to use field name from context
- `@field_validator`("name") + `@field_validator`("name", mode="before") `@classmethod` - def validate_name(cls, value: str | None) -> str | None: + def validate_name(cls, value: str | None, info) -> str | None: """Normalize the optional replacement name and reject null.""" - return _normalize_name(_reject_explicit_null("name", value)) + return _normalize_name(_reject_explicit_null(info.field_name, value))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/api/schemas/admin/preset_schemas.py` around lines 52 - 56, The validator validate_name uses a hardcoded "name" when calling _reject_explicit_null which is brittle; change the validator to accept the pydantic FieldValidationInfo (e.g., add an info parameter) and pass info.field_name into _reject_explicit_null so the error message uses the actual field name (update validate_name signature to include info and call _normalize_name(_reject_explicit_null(info.field_name, value))). Ensure imports/types for the info param match the project’s pydantic version and keep the `@field_validator`("name") decorator as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openrag/api/schemas/admin/partition_schemas.py`:
- Around line 58-62: The validator validate_non_empty_name currently passes the
hardcoded string "name" to _reject_explicit_null, causing incorrect error
messages; update validate_non_empty_name to accept the validator info (e.g.,
change signature to def validate_non_empty_name(cls, value: str | None, info) ->
str | None) and call _reject_explicit_null(info.field_name, value) before
passing the result into _normalize_name so the error message identifies the
actual field (embedder, indexation_preset, or retrieval_preset).
---
Nitpick comments:
In `@openrag/api/schemas/admin/preset_schemas.py`:
- Around line 52-56: The validator validate_name uses a hardcoded "name" when
calling _reject_explicit_null which is brittle; change the validator to accept
the pydantic FieldValidationInfo (e.g., add an info parameter) and pass
info.field_name into _reject_explicit_null so the error message uses the actual
field name (update validate_name signature to include info and call
_normalize_name(_reject_explicit_null(info.field_name, value))). Ensure
imports/types for the info param match the project’s pydantic version and keep
the `@field_validator`("name") decorator as-is.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f7a5e4be-26cd-41c4-a7d4-741aa2584645
📒 Files selected for processing (15)
openrag/api/main.pyopenrag/api/routers/admin/model_endpoints.pyopenrag/api/routers/admin/partitions.pyopenrag/api/routers/admin/presets.pyopenrag/api/schemas/admin/__init__.pyopenrag/api/schemas/admin/model_endpoint_schemas.pyopenrag/api/schemas/admin/partition_schemas.pyopenrag/api/schemas/admin/preset_schemas.pyopenrag/di/providers.pytests/unit/api/routers/admin/test_phase14_admin_routers.pytests/unit/api/routers/admin/test_phase14_partition_routes.pytests/unit/api/routers/test_router_imports.pytests/unit/api/schemas/admin/test_phase14_schemas.pytests/unit/api/test_main_proxy_headers.pytests/unit/di/test_container.py
e596cac to
81fe53c
Compare
81fe53c to
c8f8b61
Compare
- `core/config/model_endpoints.py`: ModelEndpointConfig, ModelsConfig (frozen ConfigMixin with mutable dicts), ModelEndpointRow - `core/ports/model_endpoint_repo.py`: redesign port with 7 explicit methods (create, get, list_all, update, rename, delete, set_default); drop upsert - `services/persistence/model_endpoint_repo.py`: update stub to match new port - `tests/unit/core/config/test_model_endpoints.py`: frozen-field + mutable-dict invariant tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- `core/config/indexation_pipeline.py`: IndexationPipelineConfig (chunking, VLM, contextualization, entity extraction, topic tagging) - `core/config/retrieval_pipeline.py`: RetrievalPipelineConfig (flat per-partition, named endpoint refs) - `core/models/preset.py`: PresetRow, PartitionRow, PartitionConfig (runtime-resolved) - `core/config/presets.py`: PresetsConfig (frozen ConfigMixin with mutable dicts) - `core/config/root.py`: add presets + partitions fields to Settings - `tests/unit/core/config/test_pipeline_configs.py`: frozen-field + mutable-dict invariant tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Caution Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted. Error details |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
openrag/core/config/model_endpoints.py (2)
10-10: ⚡ Quick winSwitch this relative import to a root-based absolute import.
Line 10 should follow the same import style used across this PR (
core.*from repo root).Proposed change
-from .base import ConfigMixin +from core.config.base import ConfigMixinAs per coding guidelines:
**/*.py: Use absolute imports from theopenrag/directory as the Python path root, avoiding relative imports across packages.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/core/config/model_endpoints.py` at line 10, Replace the relative import in model_endpoints.py with a root-based absolute import: change the statement that currently imports ConfigMixin via ".base" to import it from "core.config.base" so the file uses the same absolute import style as the rest of the project (look for the import of ConfigMixin in model_endpoints.py and update it to reference core.config.base.ConfigMixin).
48-48: ⚡ Quick winHarden
ModelEndpointRowvalidation to match the repository/admin contract.
ModelEndpointRowuses unconstrainedmodel_type: strand defaultbatch_size/timeoutwithoutgt=0, unlikeModelEndpointConfigand the admin request schemas (which enforceModelEndpointTypeandgt=0). Even thoughPgModelEndpointRepositoryis currently stubbed, tighteningModelEndpointRowprevents contract drift when persistence is implemented.Proposed change
-from typing import Any +from typing import Any, Literal @@ class ModelEndpointRow(BaseModel): @@ - model_type: str # embedder | reranker | llm | vlm + model_type: Literal["embedder", "reranker", "llm", "vlm"] @@ - batch_size: int = 32 - timeout: float = 30.0 + batch_size: int = Field(default=32, gt=0) + timeout: float = Field(default=30.0, gt=0)Also applies to: 51-52
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/core/config/model_endpoints.py` at line 48, ModelEndpointRow currently declares model_type: str and unconstrained batch_size/timeout which diverges from the admin/repository contract; change ModelEndpointRow to use the ModelEndpointType enum for model_type (import ModelEndpointType) and tighten batch_size and timeout with Pydantic Field validators (gt=0) and the same defaults as ModelEndpointConfig so the row schema matches the admin request/ModelEndpointConfig contract and prevents future drift.openrag/core/config/indexation_pipeline.py (1)
12-12: ⚡ Quick winUse package-root absolute import instead of a relative import.
Line 12 should use a root-based import to match the repository import convention.
Proposed change
-from .chunking import ChunkerConfig +from core.config.chunking import ChunkerConfigAs per coding guidelines:
**/*.py: Use absolute imports from theopenrag/directory as the Python path root, avoiding relative imports across packages.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/core/config/indexation_pipeline.py` at line 12, Replace the relative import in indexation_pipeline by using a package-root absolute import: change the statement importing ChunkerConfig (currently "from .chunking import ChunkerConfig") to import from the repository root module path (e.g. "from openrag.core.config.chunking import ChunkerConfig") so it follows the project's absolute-import convention and matches other modules.tests/unit/core/config/test_model_endpoints.py (1)
10-29: ⚡ Quick winAdd contract tests for
ModelEndpointRowvalidation boundaries.This file currently validates
ModelsConfigmutability/frozen behavior only. Please add negative tests for invalidmodel_typeand non-positivebatch_size/timeoutso the new row contract is regression-protected.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/core/config/test_model_endpoints.py` around lines 10 - 29, Add tests to validate ModelEndpointRow contract boundaries by creating three negative test functions in the same test module: (1) test_model_endpoint_row_invalid_model_type: instantiate ModelEndpointRow with an invalid model_type (e.g., "not-a-model") and assert it raises ValidationError via pytest.raises; (2) test_model_endpoint_row_non_positive_batch_size: create ModelEndpointRow with batch_size set to 0 and another with a negative value and assert each raises ValidationError; (3) test_model_endpoint_row_non_positive_timeout: create ModelEndpointRow with timeout 0 and with a negative value and assert each raises ValidationError. Use the ModelEndpointRow constructor directly and import ValidationError/pytest as in the file, ensuring each test targets the model_type, batch_size, and timeout fields respectively.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openrag/core/config/indexation_pipeline.py`:
- Line 21: IndexationPipelineConfig currently types parsing_strategy and
contextualization_mode as plain str which allows any value; change their
annotations to typing.Literal with the supported options (e.g.,
parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker" and
contextualization_mode: Literal[...supported values...] = <current_default>) and
import Literal from typing (or typing_extensions for older Python) so Pydantic
validation will reject invalid config values; keep the existing default values
and update any tests or callers expecting str if necessary.
In `@openrag/core/config/retrieval_pipeline.py`:
- Around line 18-26: The schema currently allows any string for the retrieval
mode and an unconstrained similarity_threshold; update the model so "type" is a
Literal restricted to "single", "multiQuery", or "hyde" (import typing.Literal
and replace the current type: str) and tighten similarity_threshold to a
validated Field with bounds (ge=0, le=1) so out-of-range values are rejected;
keep existing Field constraints for top_k/top_n and leave reranker/llm as
optional strings.
In `@openrag/core/models/preset.py`:
- Around line 17-18: Add invariant validation for persisted partition/preset
fields so invalid states cannot be loaded from the DB: enforce that preset_type
(in the Preset model) is one of the allowed values ("indexation" or
"retrieval"), that numeric config fields like chat_history_depth are
non-negative, and that dimension (or any vector dimension fields) is a positive
integer; implement these checks in the model validation/constructor for Preset
and the related Partition/Config classes referenced around the same file (lines
near the current preset_type/config definitions and the blocks at the other
noted ranges) so any invalid persisted record raises a clear validation error
instead of leaking into runtime.
---
Nitpick comments:
In `@openrag/core/config/indexation_pipeline.py`:
- Line 12: Replace the relative import in indexation_pipeline by using a
package-root absolute import: change the statement importing ChunkerConfig
(currently "from .chunking import ChunkerConfig") to import from the repository
root module path (e.g. "from openrag.core.config.chunking import ChunkerConfig")
so it follows the project's absolute-import convention and matches other
modules.
In `@openrag/core/config/model_endpoints.py`:
- Line 10: Replace the relative import in model_endpoints.py with a root-based
absolute import: change the statement that currently imports ConfigMixin via
".base" to import it from "core.config.base" so the file uses the same absolute
import style as the rest of the project (look for the import of ConfigMixin in
model_endpoints.py and update it to reference core.config.base.ConfigMixin).
- Line 48: ModelEndpointRow currently declares model_type: str and unconstrained
batch_size/timeout which diverges from the admin/repository contract; change
ModelEndpointRow to use the ModelEndpointType enum for model_type (import
ModelEndpointType) and tighten batch_size and timeout with Pydantic Field
validators (gt=0) and the same defaults as ModelEndpointConfig so the row schema
matches the admin request/ModelEndpointConfig contract and prevents future
drift.
In `@tests/unit/core/config/test_model_endpoints.py`:
- Around line 10-29: Add tests to validate ModelEndpointRow contract boundaries
by creating three negative test functions in the same test module: (1)
test_model_endpoint_row_invalid_model_type: instantiate ModelEndpointRow with an
invalid model_type (e.g., "not-a-model") and assert it raises ValidationError
via pytest.raises; (2) test_model_endpoint_row_non_positive_batch_size: create
ModelEndpointRow with batch_size set to 0 and another with a negative value and
assert each raises ValidationError; (3)
test_model_endpoint_row_non_positive_timeout: create ModelEndpointRow with
timeout 0 and with a negative value and assert each raises ValidationError. Use
the ModelEndpointRow constructor directly and import ValidationError/pytest as
in the file, ensuring each test targets the model_type, batch_size, and timeout
fields respectively.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3b346e97-366e-408c-97c8-a03236147482
📒 Files selected for processing (10)
openrag/core/config/indexation_pipeline.pyopenrag/core/config/model_endpoints.pyopenrag/core/config/presets.pyopenrag/core/config/retrieval_pipeline.pyopenrag/core/config/root.pyopenrag/core/models/preset.pyopenrag/core/ports/model_endpoint_repo.pyopenrag/services/persistence/model_endpoint_repo.pytests/unit/core/config/test_model_endpoints.pytests/unit/core/config/test_pipeline_configs.py
✅ Files skipped from review due to trivial changes (1)
- tests/unit/core/config/test_pipeline_configs.py
| model_config = ConfigDict(extra="ignore") | ||
|
|
||
| chunking: ChunkerConfig = Field(default_factory=ChunkerConfig) | ||
| parsing_strategy: str = "marker" # "pymupdf" | "marker" | "docling" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
ls
rg -n "class IndexationPipelineConfig|parsing_strategy|contextualization_mode" openrag/core/config/indexation_pipeline.py
nl -ba openrag/core/config/indexation_pipeline.py | sed -n '1,120p'Repository: linagora/openrag
Length of output: 614
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import pathlib
p = pathlib.Path("openrag/core/config/indexation_pipeline.py")
text = p.read_text(encoding="utf-8").splitlines()
for i in [15,16,17,18,19,20,21,22,23,24,25,29,30,31,32,33,34,35]:
if 1 <= i <= len(text):
print(f"{i:4d}: {text[i-1]}")
PY
rg -n "parsing_strategy|contextualization_mode" openrag | head -n 200Repository: linagora/openrag
Length of output: 1089
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show rest of the config file (bounded)
python3 - <<'PY'
import pathlib
p = pathlib.Path("openrag/core/config/indexation_pipeline.py")
lines = p.read_text(encoding="utf-8").splitlines()
for start in [1,1,40,60,80]:
end = min(len(lines), start+120)
PY
# Print the whole file if it's small enough; otherwise print key sections
python3 - <<'PY'
import pathlib
p = pathlib.Path("openrag/core/config/indexation_pipeline.py")
lines = p.read_text(encoding="utf-8").splitlines()
print("lines:", len(lines))
PY
# Show sections that mention these fields
rg -n "parsing_strategy|contextualization_mode" openrag/core/config/indexation_pipeline.py
echo "---- context around parsing_strategy ----"
rg -n "parsing_strategy" -C 3 openrag/core/config/indexation_pipeline.py
echo "---- context around contextualization_mode ----"
rg -n "contextualization_mode" -C 3 openrag/core/config/indexation_pipeline.py
echo "---- find usages across openrag ----"
rg -n "parsing_strategy" openrag | head -n 200
rg -n "contextualization_mode" openrag | head -n 200Repository: linagora/openrag
Length of output: 1215
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "IndexationPipelineConfig" openrag | head -n 200
echo "---- search for parsing strategy keywords ----"
rg -n "\"pymupdf\"|\"marker\"|\"docling\"" openrag | head -n 200
rg -n "\bdocling\b|\bpymupdf\b|\bmarker\b" openrag | head -n 200
echo "---- search for contextualization mode keywords ----"
rg -n "\"none\"|\"simple\"|\"structured\"" openrag | head -n 200
rg -n "\bcontextualization_mode\b|\benable_contextualization\b|\bcontextualization\b" openrag | head -n 200Repository: linagora/openrag
Length of output: 7459
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "indexation\.parsing_strategy|parsing_strategy\s*=" openrag | head -n 200
rg -n "indexation\.contextualization_mode|contextualization_mode\s*=" openrag | head -n 200
rg -n "\.indexation\." openrag | head -n 200
rg -n "IndexationPipelineConfig\(" -S openrag | head -n 200
rg -n "preset\.indexation|indexation\s*:" openrag/core | head -n 200Repository: linagora/openrag
Length of output: 42
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find any mention of the config keys anywhere
rg -n "parsing_strategy" . | head -n 200
rg -n "contextualization_mode" . | head -n 200
# Look for dynamic/dict-style access patterns
rg -n "get\(['\"]parsing_strategy['\"]\)|\[['\"]parsing_strategy['\"]\]" . | head -n 200
rg -n "get\(['\"]contextualization_mode['\"]\)|\[['\"]contextualization_mode['\"]\]" . | head -n 200
# Look for any selection/dispatch based on these values by scanning for the value literals
rg -n "\"marker\"|\"pymupdf\"|\"docling\"" openrag | head -n 200
rg -n "\"none\"|\"simple\"|\"structured\"" openrag/core/indexing openrag/core/config openrag/core/prompts openrag/core/chunking 2>/dev/null | head -n 200Repository: linagora/openrag
Length of output: 731
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Search for direct attribute usage patterns
rg -n "\.parsing_strategy\b|parsing_strategy\s*\]" openrag | head -n 200
rg -n "\.contextualization_mode\b|contextualization_mode\s*\]" openrag | head -n 200
# 2) Search for dict-style access patterns in code
rg -n "(\.get\(|\[\s*['\"])" openrag | rg -n "parsing_strategy|contextualization_mode" | head -n 200
# 3) Search for dispatch based on parser registry / parser names
rg -n "parser_registry|register\(" openrag/core/indexing openrag/services openrag/core | head -n 200
rg -n "select.*parser|get_parser|create_.*parser|parser\s*=\s*|parser_name" openrag/core openrag/services | head -n 200
# 4) Search for the config object usage in the indexing pipeline code
rg -n "IndexationPipelineConfig|preset\.indexation|enable_contextualization" openrag | head -n 200Repository: linagora/openrag
Length of output: 42
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate runtime entry points for the indexation pipeline config
rg -n "IndexationPipelineConfig|indexation\.|preset\.indexation|enable_contextualization" openrag | head -n 200
# Show preset model definition
python3 - <<'PY'
import pathlib
p = pathlib.Path("openrag/core/models/preset.py")
lines = p.read_text(encoding="utf-8").splitlines()
for i,l in enumerate(lines, start=1):
if i<=120:
pass
print("lines:", len(lines))
for i in range(1,120):
if i<=len(lines):
print(f"{i:4d}: {lines[i-1]}")
PY
# Show config model again for context
python3 - <<'PY'
import pathlib
p = pathlib.Path("openrag/core/config/indexation_pipeline.py")
lines = p.read_text(encoding="utf-8").splitlines()
print("lines:", len(lines))
for i in range(1,len(lines)+1):
print(f"{i:4d}: {lines[i-1]}")
PY
# Find where presets are turned into an actual indexing job/pipeline
rg -n "indexation|indexing" openrag/core | head -n 200Repository: linagora/openrag
Length of output: 7196
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Print key modules that likely consume IndexationPipelineConfig
python3 - <<'PY'
import pathlib
paths = [
"openrag/core/indexing/dispatcher.py",
"openrag/core/indexing/validators.py",
"openrag/core/config/presets.py",
"openrag/core/config/root.py",
"openrag/core/config/indexation.py",
]
for path in paths:
p = pathlib.Path(path)
if not p.exists():
print(f"Missing: {path}")
continue
print("\n====", path, "====")
lines = p.read_text(encoding="utf-8").splitlines()
for i, l in enumerate(lines[:220], start=1):
print(f"{i:4d}: {l}")
if len(lines) > 220:
print(f"... ({len(lines)} lines total; truncated)")
PY
# Broader searches for camelCase / dynamic JSON keys that might be used by workers/clients
rg -n "parsingStrategy|contextualizationMode|contextualization_mode|parsing_strategy" openrag | head -n 200
rg -n "docling|pymupdf|marker" openrag/core/indexing openrag/services/workers | head -n 200
# Look for config keys being forwarded as dict/JSON
rg -n "model_dump\\(|dict\\(|json\\.|config\\s*\\[|get\\(|\\\"parsing_strategy\\\"|\\\"contextualization_mode\\\"" openrag/core openrag/services | head -n 200Repository: linagora/openrag
Length of output: 30339
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
from pathlib import Path
p = Path("tests/unit/core/config/test_pipeline_configs.py")
lines = p.read_text(encoding="utf-8").splitlines()
for i in range(1,80):
if i<=len(lines):
print(f"{i:4d}: {lines[i-1]}")
PY
echo "---- literal key usage ----"
rg -n "\"parsing_strategy\"|'parsing_strategy'|\"contextualization_mode\"|'contextualization_mode'" openrag tests docs | head -n 200
echo "---- search for contextualization dispatch ----"
rg -n "contextualize|contextualization" openrag/core/indexing openrag/core/chunking openrag/core/prompts | head -n 200
rg -n "structured|simple|none" openrag/core/indexing/contextualize.py openrag/core | head -n 200
echo "---- search for parser dispatch ----"
rg -n "parser_registry|register\(|Parser|parse\(" openrag/core/indexing/parsers | head -n 200
rg -n "pymupdf|marker|docling" openrag/core/indexing | head -n 200Repository: linagora/openrag
Length of output: 22956
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
from pathlib import Path
p = Path("openrag/core/config/indexation_pipeline.py")
for i,l in enumerate(p.read_text(encoding="utf-8").splitlines()[:40], start=1):
print(f"{i:4d}: {l}")
PYRepository: linagora/openrag
Length of output: 1651
Constrain enum-like indexation preset fields to supported values
IndexationPipelineConfig.parsing_strategy and IndexationPipelineConfig.contextualization_mode are currently typed as str, so Pydantic will accept any string during config deserialization. Constrain them with Literal[...] to fail fast on invalid config values.
Proposed change
+from typing import Literal
+
class IndexationPipelineConfig(BaseModel):
@@
- parsing_strategy: str = "marker" # "pymupdf" | "marker" | "docling"
+ parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker"
@@
- contextualization_mode: str = "none" # "none" | "simple" | "structured"
+ contextualization_mode: Literal["none", "simple", "structured"] = "none"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| parsing_strategy: str = "marker" # "pymupdf" | "marker" | "docling" | |
| from typing import Literal | |
| class IndexationPipelineConfig(BaseModel): | |
| parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker" | |
| contextualization_mode: Literal["none", "simple", "structured"] = "none" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@openrag/core/config/indexation_pipeline.py` at line 21,
IndexationPipelineConfig currently types parsing_strategy and
contextualization_mode as plain str which allows any value; change their
annotations to typing.Literal with the supported options (e.g.,
parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker" and
contextualization_mode: Literal[...supported values...] = <current_default>) and
import Literal from typing (or typing_extensions for older Python) so Pydantic
validation will reject invalid config values; keep the existing default values
and update any tests or callers expecting str if necessary.
| type: str = "single" # "single" | "multiQuery" | "hyde" | ||
| reranker: str | None = None # endpoint name; None = use default | ||
| llm: str | None = None # endpoint name for multiQuery / hyde expansion | ||
|
|
||
| top_k: int = Field(default=50, gt=0, le=1000) # vector candidates fetched | ||
| top_n: int = Field(default=10, gt=0, le=1000) # final results after reranking | ||
| enable_reranker: bool = True | ||
| similarity_threshold: float = 0.6 | ||
| include_related: bool = True |
There was a problem hiding this comment.
Constrain retrieval mode and threshold at schema level.
type is documented as a closed set but currently accepts any string, and similarity_threshold accepts out-of-range values. This lets invalid configs pass and fail later at runtime.
Suggested fix
+from typing import Literal
from pydantic import BaseModel, ConfigDict, Field
@@
- type: str = "single" # "single" | "multiQuery" | "hyde"
+ type: Literal["single", "multiQuery", "hyde"] = "single"
@@
- similarity_threshold: float = 0.6
+ similarity_threshold: float = Field(default=0.6, ge=0.0, le=1.0)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@openrag/core/config/retrieval_pipeline.py` around lines 18 - 26, The schema
currently allows any string for the retrieval mode and an unconstrained
similarity_threshold; update the model so "type" is a Literal restricted to
"single", "multiQuery", or "hyde" (import typing.Literal and replace the current
type: str) and tighten similarity_threshold to a validated Field with bounds
(ge=0, le=1) so out-of-range values are rejected; keep existing Field
constraints for top_k/top_n and leave reranker/llm as optional strings.
| preset_type: str # "indexation" | "retrieval" | ||
| config: dict[str, Any] |
There was a problem hiding this comment.
Add invariant validation on persisted partition/preset fields.
These models currently accept invalid states (e.g., unknown preset_type, negative chat_history_depth, non-positive dimension) that can leak from DB into runtime resolution.
Suggested fix
from datetime import datetime
-from typing import Any
+from typing import Any, Literal
@@
-from pydantic import BaseModel
+from pydantic import BaseModel, Field
@@
- preset_type: str # "indexation" | "retrieval"
+ preset_type: Literal["indexation", "retrieval"]
@@
- dimension: int = 1024
+ dimension: int = Field(default=1024, gt=0)
@@
- chat_history_depth: int = 0
+ chat_history_depth: int = Field(default=0, ge=0)
@@
- chat_history_depth: int = 0
+ chat_history_depth: int = Field(default=0, ge=0)Also applies to: 32-35, 53-54
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@openrag/core/models/preset.py` around lines 17 - 18, Add invariant validation
for persisted partition/preset fields so invalid states cannot be loaded from
the DB: enforce that preset_type (in the Preset model) is one of the allowed
values ("indexation" or "retrieval"), that numeric config fields like
chat_history_depth are non-negative, and that dimension (or any vector dimension
fields) is a positive integer; implement these checks in the model
validation/constructor for Preset and the related Partition/Config classes
referenced around the same file (lines near the current preset_type/config
definitions and the blocks at the other noted ranges) so any invalid persisted
record raises a clear validation error instead of leaking into runtime.
On low-resource machines the Milvus 2.6 gRPC handshake can exceed the default 10-second channel-ready wait, causing pymilvus to throw MilvusException (code=2, "Fail connecting to server … illegal connection params or server unavailable"). The underlying TCP port is open and the HTTP health endpoint responds OK — the channel just needs more time to complete the gRPC negotiation. - Pass `timeout=60` to both `MilvusClient` and `AsyncMilvusClient` at construction time in `MilvusVectorStore.__init__`. - Store the value in `self._timeout` so any future callers have a single source of truth. Root-cause confirmed by reproducing the error with the default timeout and a successful connect with timeout=60 against the same Milvus 2.6.11 container.
… migration - openrag/services/persistence/schema.py: add model_endpoints and pipeline_presets tables, extend partitions with 9 new columns, add indexation_config to files - openrag/services/persistence/migrations/alembic/versions/06dd2101ea3a_add_endpoints_presets_phase14.py: idempotent migration with composite PKs, JSONB defaults, check constraints Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…handler logs - Fix migration 06dd2101ea3a: `index_exists()` called with 1 arg instead of the required 2 (`table`, `index`), crashing Alembic at startup and causing all endpoints to return 500 (container degraded to None) - Add try/except in AuthMiddleware.dispatch() so a None container returns JSONResponse(503) instead of propagating AttributeError through ServerErrorMiddleware to a 500 - Add `message`, `method`, and `path` fields to both error handler log calls so failing requests are diagnosable without a full traceback Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t/partition repos - PgModelEndpointRepository: 7 methods (create, get, list_all, update, rename, delete, set_default) with atomic transaction for set_default - PgPresetRepository: 5 methods (get, list_all, upsert, delete, count_partitions_using) with ON CONFLICT upsert and safe column dispatch - PgPartitionRepository: 3 new methods (get_partition_row, list_partition_rows, update_partition) + _row_to_full_dict helper - Extend PartitionRepository port with get_partition_row, list_partition_rows, update_partition abstract methods - Extend PresetRepository port with count_partitions_using abstract method - 18 new unit tests covering all new methods with fake pool/conn pattern Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
de28be1 to
80e28e7
Compare
- seed_defaults(): reads Settings + env-var overrides (EMBEDDER_ENDPOINT,
LLM_ENDPOINT, VLM_ENDPOINT, RERANKER_ENDPOINT and model counterparts),
inserts one is_default=True row per type when the DB is empty for that type
- load_all(): fetches all rows, rebuilds config.models.{embedder,llm,vlm,
reranker} dicts atomically via clear()+update(); injects virtual "default"
alias pointing to the is_default=True row per type
- create_model_endpoint(): validates model_type, raises 409 on duplicate
- update_model_endpoint(): updates fields and/or renames, evicts stale
client-cache entry, reloads config.models
- delete_model_endpoint(): guards last-endpoint-of-type, evicts cache
- set_default(): swaps is_default flag, evicts "default" cache alias
- validate_endpoint(): async HTTP probe of {url}/models, returns reachable +
model_found + models_served dict
- 17 unit tests covering all methods
- PresetService with seed_defaults(), load_all(), create_preset(), update_preset(), delete_preset(), get_preset(), list_presets() - 6 hardcoded default seeds: 3 indexation (default, legal, finance) and 3 retrieval (default, multiquery, hyde) - Pydantic validation via IndexationPipelineConfig / RetrievalPipelineConfig - Atomic in-place swap of config.presets.indexation / .retrieval dicts - Back-reference to PartitionService.load_partitions() on preset update - Referential integrity guard: raises 422 if partition(s) use the preset - 17 unit tests, all passing
- add resolve_partition_row() to map preset names to validated PartitionConfig - add load_partitions() with atomic clear()+update() swap into config.partitions - add seed_default_partition() for idempotent default-partition bootstrap - extend create_partition() with preset/config columns + pre-write validation - add update_partition() with merged-row validation and cache re-resolution - accept optional config in constructor; wire settings in DI container
feat(14G) injected config into the container's PartitionService, which made create_partition validate presets against config.presets at runtime. But nothing seeds that registry at startup yet (that is 14K), so every partition create returned [CONFIG_ERROR]: preset 'default' not found. Drop the container wiring so PartitionService runs with config=None — the legacy passthrough — until 14K seeds the registries on boot and 14J's per-partition retrieval path is complete. The resolution methods and their unit tests stay; they are exercised directly with a config.
f732c1f to
ade9c1d
Compare
Complete the Phase 14 composition-root wiring so the endpoint and preset registries are available during application startup. The container now seeds and loads model endpoints before presets, then resolves partition configs after both registries are populated. This closes the runtime gap where the Phase 14 admin API surface existed but service-backed routes could not use the real services. The partition service is also built with Settings again, so 14J can consume resolved per-partition configs once startup has loaded them.
f711db7 to
44d5c16
Compare
Context: Phase 14H defines the admin API contract for model endpoints, presets, and partition preset assignment while the service, repository, migration, and runtime-resolution work lands separately. This PR stays intentionally API-focused so reviewers can validate the route contracts before the data layer is complete.
Supersedes #442 and keeps the work on the upstream linagora/openrag branch.
Commit structure:
14H.0 - Model endpoint API
Adds the model endpoint request/response contract, admin router, and DI provider entry point needed by that router. This gives operators the future API surface for registering endpoints, validating them, and choosing defaults without tying the router to a concrete storage implementation yet.
14H.1 - Preset API schemas
Adds the preset request/response contract for indexation and retrieval presets. This keeps validation rules close to the API boundary before the preset service starts persisting and resolving those configs.
14H.2 - Preset admin router
Adds the preset management routes and focused router tests. This makes the preset API shape reviewable now, while real behavior remains delegated to the future service layer.
14H.3 - Partition preset schemas
Extends the admin schema contract so partitions can carry embedder, indexation preset, retrieval preset, chat depth, and chat LLM references. This also keeps the explicit-null validation behavior field-specific.
14H.4 - Partition config routes
Adds partition config update/read routes and tests. Missing service methods return explicit not-implemented responses until Person A's partition resolution work is merged.
14H.5 - Router registration
Registers the Phase 14 admin routers in the FastAPI application, adds OpenAPI tags, and verifies the stable mount points.
Validation:
Local validation passed after the six-commit rewrite: ruff, layer import guard, and the full unit suite. GitHub checks are expected to re-run after the force-with-lease push.
Remaining Phase 14 dependencies:
The database migration, repositories, endpoint service, preset service, partition resolution, default seeding, and full E2E behavior are still outside this PR and depend on the Person A/shared Phase 14 work.
Summary by CodeRabbit
New Features
Chores
Tests