refactor(phase 14): Model Endpoints Registry and Per-Partition Presets by hedhoud · Pull Request #444 · linagora/openrag

hedhoud · 2026-06-04T07:43:15Z

Context: Phase 14H defines the admin API contract for model endpoints, presets, and partition preset assignment while the service, repository, migration, and runtime-resolution work lands separately. This PR stays intentionally API-focused so reviewers can validate the route contracts before the data layer is complete.

Supersedes #442 and keeps the work on the upstream linagora/openrag branch.

Commit structure:

14H.0 - Model endpoint API
Adds the model endpoint request/response contract, admin router, and DI provider entry point needed by that router. This gives operators the future API surface for registering endpoints, validating them, and choosing defaults without tying the router to a concrete storage implementation yet.

14H.1 - Preset API schemas
Adds the preset request/response contract for indexation and retrieval presets. This keeps validation rules close to the API boundary before the preset service starts persisting and resolving those configs.

14H.2 - Preset admin router
Adds the preset management routes and focused router tests. This makes the preset API shape reviewable now, while real behavior remains delegated to the future service layer.

14H.3 - Partition preset schemas
Extends the admin schema contract so partitions can carry embedder, indexation preset, retrieval preset, chat depth, and chat LLM references. This also keeps the explicit-null validation behavior field-specific.

14H.4 - Partition config routes
Adds partition config update/read routes and tests. Missing service methods return explicit not-implemented responses until Person A's partition resolution work is merged.

14H.5 - Router registration
Registers the Phase 14 admin routers in the FastAPI application, adds OpenAPI tags, and verifies the stable mount points.

Validation:

Local validation passed after the six-commit rewrite: ruff, layer import guard, and the full unit suite. GitHub checks are expected to re-run after the force-with-lease push.

Remaining Phase 14 dependencies:

The database migration, repositories, endpoint service, preset service, partition resolution, default seeding, and full E2E behavior are still outside this PR and depend on the Person A/shared Phase 14 work.

Summary by CodeRabbit

New Features
- Admin UI/API: manage model endpoints (create/list/get/update/delete/validate/set-default), pipeline presets (CRUD + options), and partition configuration (get/update).
- OpenAPI: added "Model Endpoints" and "Presets" tags for clearer API docs.
- Configs: new per-partition indexation/retrieval and global models/presets/partition settings.
Chores
- Database migration adding tables for endpoints and presets; storage client timeout set.
Tests
- New unit tests for admin routers, schemas, partition routes, DI providers, configs, and mount assertions.

coderabbitai · 2026-06-04T07:43:27Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds Phase 14 admin APIs for model endpoint registry, pipeline presets, and partition config: new Pydantic schemas, admin-only routers delegating to DI services, optional DI getters with 503 fallback, main app mounts (/model-endpoints, /presets), updated ports/persistence stubs, migration/schema updates, and unit tests.

Changes

Phase 14 Admin APIs: Model Endpoints, Presets, and Partition Configuration

Layer / File(s)	Summary
Phase 14 Schema Definitions `openrag/api/schemas/admin/model_endpoint_schemas.py`, `openrag/api/schemas/admin/preset_schemas.py`, `openrag/api/schemas/admin/partition_schemas.py`, `openrag/api/schemas/admin/__init__.py`	Defines `ModelEndpointType` and `PresetType`, plus request/response DTOs and validators for model endpoints, presets, and partitions (normalization, non-empty checks, explicit null rejection, update-payload presence checks).
Admin Router Endpoints `openrag/api/routers/admin/model_endpoints.py`, `openrag/api/routers/admin/presets.py`, `openrag/api/routers/admin/partitions.py`	Adds admin-protected routers: model endpoints (create, list, get, update, delete, set-default, validate), presets (options, create, list, get, update, delete), and partition config (PATCH update, GET resolved config). Endpoints delegate to DI services and forward only provided update fields via `exclude_unset=True`.
DI Service Integration `openrag/di/providers.py`	Adds `_get_optional_service` helper and exported getters `get_model_endpoint_service` and `get_preset_service` that resolve from the initialized container or raise HTTP 503 when absent.
API App Wiring and OpenAPI Tags `openrag/api/main.py`	Mounts `model_endpoints_router` at `/model-endpoints` and `presets_router` at `/presets`, extends `Tags` enum with `MODEL_ENDPOINTS` and `PRESETS`, and updates docstrings for OpenAPI helpers and the Ray Serve wrapper.
Ports and Persistence Stubs `openrag/core/ports/model_endpoint_repo.py`, `openrag/services/persistence/model_endpoint_repo.py`	Revises model-endpoint repository interface to use `ModelEndpointRow` types and replaces `upsert` with explicit `create`, `update`, `rename`, and `set_default` method signatures; updates persistence stub signatures accordingly.
Core Configs and Domain Models `openrag/core/config/indexation_pipeline.py`, `openrag/core/config/retrieval_pipeline.py`, `openrag/core/config/model_endpoints.py`, `openrag/core/config/presets.py`, `openrag/core/models/preset.py`, `openrag/core/config/root.py`	Adds Indexation/Retrieval pipeline configs, ModelsConfig/ModelEndpointRow, PresetsConfig, Preset/Partition domain models, and composes them into root Settings with `models`, `presets`, and `partitions`.
Schema/DB Migration and Catalog `openrag/services/persistence/migrations/alembic/versions/06dd2101ea3a_add_endpoints_presets_phase14.py`, `openrag/services/persistence/schema.py`	Adds Alembic migration for `model_endpoints` and `pipeline_presets` tables, adds `files.indexation_config` and partition columns, introduces SQLAlchemy table objects `model_endpoints` and `pipeline_presets`, and consolidates metadata usage.
Comprehensive Test Coverage `tests/unit/api/schemas/admin/test_phase14_schemas.py`, `tests/unit/api/routers/admin/test_phase14_admin_routers.py`, `tests/unit/api/routers/admin/test_phase14_partition_routes.py`, `tests/unit/api/routers/test_router_imports.py`, `tests/unit/api/test_main_proxy_headers.py`, `tests/unit/di/test_container.py`, `tests/unit/core/config/*`	Adds tests for schema normalization/validation, router behavior (payload normalization, update-field forwarding, validation), partition config 501 fallback, DI optional-provider resolution/503 behavior, static mounting assertions, and frozen-field + mutable-dict invariants for configs. Test helpers and fake services included.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ModelEndpointsRouter
  participant DIGetter as get_model_endpoint_service
  participant ModelEndpointService
  Client->>ModelEndpointsRouter: POST /model-endpoints (CreateModelEndpointRequest)
  ModelEndpointsRouter->>DIGetter: resolve service via DI
  DIGetter->>ModelEndpointService: create_model_endpoint(payload)
  ModelEndpointService-->>ModelEndpointsRouter: ModelEndpointResponse
  ModelEndpointsRouter-->>Client: 201 Created

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

linagora/openrag#333: Related work on port interfaces used by the new model-endpoint APIs.
linagora/openrag#431: DI provider-bridge changes connected to new optional service getters.
linagora/openrag#396: Earlier DI/provider wiring changes related to service accessors.

Suggested reviewers

Ahmath-Gadji

Poem

🐰 In admin fields the endpoints bloom,
Presets and partitions find their room,
Schemas trim and routers start to sing,
DI wires hum and tests take wing,
A rabbit cheers: "Phase 14 — let's groom!"

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	Docstring coverage is 87.22% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The pull request title clearly identifies the main changes: it's about Phase 14 refactoring that adds a Model Endpoints Registry and Per-Partition Presets functionality. This accurately reflects the primary objectives of the changeset.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch refactor/phase14-B

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

openrag/api/schemas/admin/preset_schemas.py (1)
52-56: ⚡ Quick win

Use field name from validator context for clearer error messages.

Line 56 passes the hardcoded string "name" to _reject_explicit_null, which works correctly for this validator since it only applies to the name field. However, this pattern is brittle—if someone later adds more fields to the validator decorator, the error message will remain hardcoded as "name cannot be null". Using info.field_name makes the validator more maintainable and consistent with the pattern used elsewhere.
♻️ Proposed refactor to use field name from context
-    `@field_validator`("name")
+    `@field_validator`("name", mode="before")
     `@classmethod`
-    def validate_name(cls, value: str | None) -> str | None:
+    def validate_name(cls, value: str | None, info) -> str | None:
         """Normalize the optional replacement name and reject null."""
-        return _normalize_name(_reject_explicit_null("name", value))
+        return _normalize_name(_reject_explicit_null(info.field_name, value))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openrag/api/schemas/admin/preset_schemas.py` around lines 52 - 56, The
validator validate_name uses a hardcoded "name" when calling
_reject_explicit_null which is brittle; change the validator to accept the
pydantic FieldValidationInfo (e.g., add an info parameter) and pass
info.field_name into _reject_explicit_null so the error message uses the actual
field name (update validate_name signature to include info and call
_normalize_name(_reject_explicit_null(info.field_name, value))). Ensure
imports/types for the info param match the project’s pydantic version and keep
the `@field_validator`("name") decorator as-is.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openrag/api/schemas/admin/partition_schemas.py`:
- Around line 58-62: The validator validate_non_empty_name currently passes the
hardcoded string "name" to _reject_explicit_null, causing incorrect error
messages; update validate_non_empty_name to accept the validator info (e.g.,
change signature to def validate_non_empty_name(cls, value: str | None, info) ->
str | None) and call _reject_explicit_null(info.field_name, value) before
passing the result into _normalize_name so the error message identifies the
actual field (embedder, indexation_preset, or retrieval_preset).

---

Nitpick comments:
In `@openrag/api/schemas/admin/preset_schemas.py`:
- Around line 52-56: The validator validate_name uses a hardcoded "name" when
calling _reject_explicit_null which is brittle; change the validator to accept
the pydantic FieldValidationInfo (e.g., add an info parameter) and pass
info.field_name into _reject_explicit_null so the error message uses the actual
field name (update validate_name signature to include info and call
_normalize_name(_reject_explicit_null(info.field_name, value))). Ensure
imports/types for the info param match the project’s pydantic version and keep
the `@field_validator`("name") decorator as-is.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f7a5e4be-26cd-41c4-a7d4-741aa2584645

📥 Commits

Reviewing files that changed from the base of the PR and between e207d64 and e596cac.

📒 Files selected for processing (15)

openrag/api/main.py
openrag/api/routers/admin/model_endpoints.py
openrag/api/routers/admin/partitions.py
openrag/api/routers/admin/presets.py
openrag/api/schemas/admin/__init__.py
openrag/api/schemas/admin/model_endpoint_schemas.py
openrag/api/schemas/admin/partition_schemas.py
openrag/api/schemas/admin/preset_schemas.py
openrag/di/providers.py
tests/unit/api/routers/admin/test_phase14_admin_routers.py
tests/unit/api/routers/admin/test_phase14_partition_routes.py
tests/unit/api/routers/test_router_imports.py
tests/unit/api/schemas/admin/test_phase14_schemas.py
tests/unit/api/test_main_proxy_headers.py
tests/unit/di/test_container.py

- `core/config/model_endpoints.py`: ModelEndpointConfig, ModelsConfig (frozen ConfigMixin with mutable dicts), ModelEndpointRow - `core/ports/model_endpoint_repo.py`: redesign port with 7 explicit methods (create, get, list_all, update, rename, delete, set_default); drop upsert - `services/persistence/model_endpoint_repo.py`: update stub to match new port - `tests/unit/core/config/test_model_endpoints.py`: frozen-field + mutable-dict invariant tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- `core/config/indexation_pipeline.py`: IndexationPipelineConfig (chunking, VLM, contextualization, entity extraction, topic tagging) - `core/config/retrieval_pipeline.py`: RetrievalPipelineConfig (flat per-partition, named endpoint refs) - `core/models/preset.py`: PresetRow, PartitionRow, PartitionConfig (runtime-resolved) - `core/config/presets.py`: PresetsConfig (frozen ConfigMixin with mutable dicts) - `core/config/root.py`: add presets + partitions fields to Settings - `tests/unit/core/config/test_pipeline_configs.py`: frozen-field + mutable-dict invariant tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-06-04T10:16:29Z

Caution

Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted.

Error details

{}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (4)

openrag/core/config/model_endpoints.py (2)
10-10: ⚡ Quick win

Switch this relative import to a root-based absolute import.

Line 10 should follow the same import style used across this PR (core.* from repo root).
Proposed change
-from .base import ConfigMixin
+from core.config.base import ConfigMixin
As per coding guidelines: **/*.py: Use absolute imports from the openrag/ directory as the Python path root, avoiding relative imports across packages.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openrag/core/config/model_endpoints.py` at line 10, Replace the relative
import in model_endpoints.py with a root-based absolute import: change the
statement that currently imports ConfigMixin via ".base" to import it from
"core.config.base" so the file uses the same absolute import style as the rest
of the project (look for the import of ConfigMixin in model_endpoints.py and
update it to reference core.config.base.ConfigMixin).
48-48: ⚡ Quick win

Harden ModelEndpointRow validation to match the repository/admin contract.

ModelEndpointRow uses unconstrained model_type: str and default batch_size/timeout without gt=0, unlike ModelEndpointConfig and the admin request schemas (which enforce ModelEndpointType and gt=0). Even though PgModelEndpointRepository is currently stubbed, tightening ModelEndpointRow prevents contract drift when persistence is implemented.
Proposed change
-from typing import Any
+from typing import Any, Literal
@@
 class ModelEndpointRow(BaseModel):
@@
-    model_type: str  # embedder | reranker | llm | vlm
+    model_type: Literal["embedder", "reranker", "llm", "vlm"]
@@
-    batch_size: int = 32
-    timeout: float = 30.0
+    batch_size: int = Field(default=32, gt=0)
+    timeout: float = Field(default=30.0, gt=0)
Also applies to: 51-52
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openrag/core/config/model_endpoints.py` at line 48, ModelEndpointRow
currently declares model_type: str and unconstrained batch_size/timeout which
diverges from the admin/repository contract; change ModelEndpointRow to use the
ModelEndpointType enum for model_type (import ModelEndpointType) and tighten
batch_size and timeout with Pydantic Field validators (gt=0) and the same
defaults as ModelEndpointConfig so the row schema matches the admin
request/ModelEndpointConfig contract and prevents future drift.
openrag/core/config/indexation_pipeline.py (1)
12-12: ⚡ Quick win

Use package-root absolute import instead of a relative import.

Line 12 should use a root-based import to match the repository import convention.
Proposed change
-from .chunking import ChunkerConfig
+from core.config.chunking import ChunkerConfig
As per coding guidelines: **/*.py: Use absolute imports from the openrag/ directory as the Python path root, avoiding relative imports across packages.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openrag/core/config/indexation_pipeline.py` at line 12, Replace the relative
import in indexation_pipeline by using a package-root absolute import: change
the statement importing ChunkerConfig (currently "from .chunking import
ChunkerConfig") to import from the repository root module path (e.g. "from
openrag.core.config.chunking import ChunkerConfig") so it follows the project's
absolute-import convention and matches other modules.
tests/unit/core/config/test_model_endpoints.py (1)
10-29: ⚡ Quick win

Add contract tests for ModelEndpointRow validation boundaries.

This file currently validates ModelsConfig mutability/frozen behavior only. Please add negative tests for invalid model_type and non-positive batch_size/timeout so the new row contract is regression-protected.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/core/config/test_model_endpoints.py` around lines 10 - 29, Add
tests to validate ModelEndpointRow contract boundaries by creating three
negative test functions in the same test module: (1)
test_model_endpoint_row_invalid_model_type: instantiate ModelEndpointRow with an
invalid model_type (e.g., "not-a-model") and assert it raises ValidationError
via pytest.raises; (2) test_model_endpoint_row_non_positive_batch_size: create
ModelEndpointRow with batch_size set to 0 and another with a negative value and
assert each raises ValidationError; (3)
test_model_endpoint_row_non_positive_timeout: create ModelEndpointRow with
timeout 0 and with a negative value and assert each raises ValidationError. Use
the ModelEndpointRow constructor directly and import ValidationError/pytest as
in the file, ensuring each test targets the model_type, batch_size, and timeout
fields respectively.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openrag/core/config/indexation_pipeline.py`:
- Line 21: IndexationPipelineConfig currently types parsing_strategy and
contextualization_mode as plain str which allows any value; change their
annotations to typing.Literal with the supported options (e.g.,
parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker" and
contextualization_mode: Literal[...supported values...] = <current_default>) and
import Literal from typing (or typing_extensions for older Python) so Pydantic
validation will reject invalid config values; keep the existing default values
and update any tests or callers expecting str if necessary.

In `@openrag/core/config/retrieval_pipeline.py`:
- Around line 18-26: The schema currently allows any string for the retrieval
mode and an unconstrained similarity_threshold; update the model so "type" is a
Literal restricted to "single", "multiQuery", or "hyde" (import typing.Literal
and replace the current type: str) and tighten similarity_threshold to a
validated Field with bounds (ge=0, le=1) so out-of-range values are rejected;
keep existing Field constraints for top_k/top_n and leave reranker/llm as
optional strings.

In `@openrag/core/models/preset.py`:
- Around line 17-18: Add invariant validation for persisted partition/preset
fields so invalid states cannot be loaded from the DB: enforce that preset_type
(in the Preset model) is one of the allowed values ("indexation" or
"retrieval"), that numeric config fields like chat_history_depth are
non-negative, and that dimension (or any vector dimension fields) is a positive
integer; implement these checks in the model validation/constructor for Preset
and the related Partition/Config classes referenced around the same file (lines
near the current preset_type/config definitions and the blocks at the other
noted ranges) so any invalid persisted record raises a clear validation error
instead of leaking into runtime.

---

Nitpick comments:
In `@openrag/core/config/indexation_pipeline.py`:
- Line 12: Replace the relative import in indexation_pipeline by using a
package-root absolute import: change the statement importing ChunkerConfig
(currently "from .chunking import ChunkerConfig") to import from the repository
root module path (e.g. "from openrag.core.config.chunking import ChunkerConfig")
so it follows the project's absolute-import convention and matches other
modules.

In `@openrag/core/config/model_endpoints.py`:
- Line 10: Replace the relative import in model_endpoints.py with a root-based
absolute import: change the statement that currently imports ConfigMixin via
".base" to import it from "core.config.base" so the file uses the same absolute
import style as the rest of the project (look for the import of ConfigMixin in
model_endpoints.py and update it to reference core.config.base.ConfigMixin).
- Line 48: ModelEndpointRow currently declares model_type: str and unconstrained
batch_size/timeout which diverges from the admin/repository contract; change
ModelEndpointRow to use the ModelEndpointType enum for model_type (import
ModelEndpointType) and tighten batch_size and timeout with Pydantic Field
validators (gt=0) and the same defaults as ModelEndpointConfig so the row schema
matches the admin request/ModelEndpointConfig contract and prevents future
drift.

In `@tests/unit/core/config/test_model_endpoints.py`:
- Around line 10-29: Add tests to validate ModelEndpointRow contract boundaries
by creating three negative test functions in the same test module: (1)
test_model_endpoint_row_invalid_model_type: instantiate ModelEndpointRow with an
invalid model_type (e.g., "not-a-model") and assert it raises ValidationError
via pytest.raises; (2) test_model_endpoint_row_non_positive_batch_size: create
ModelEndpointRow with batch_size set to 0 and another with a negative value and
assert each raises ValidationError; (3)
test_model_endpoint_row_non_positive_timeout: create ModelEndpointRow with
timeout 0 and with a negative value and assert each raises ValidationError. Use
the ModelEndpointRow constructor directly and import ValidationError/pytest as
in the file, ensuring each test targets the model_type, batch_size, and timeout
fields respectively.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3b346e97-366e-408c-97c8-a03236147482

📥 Commits

Reviewing files that changed from the base of the PR and between c8f8b61 and 7ad7b10.

📒 Files selected for processing (10)

openrag/core/config/indexation_pipeline.py
openrag/core/config/model_endpoints.py
openrag/core/config/presets.py
openrag/core/config/retrieval_pipeline.py
openrag/core/config/root.py
openrag/core/models/preset.py
openrag/core/ports/model_endpoint_repo.py
openrag/services/persistence/model_endpoint_repo.py
tests/unit/core/config/test_model_endpoints.py
tests/unit/core/config/test_pipeline_configs.py

✅ Files skipped from review due to trivial changes (1)

tests/unit/core/config/test_pipeline_configs.py

coderabbitai · 2026-06-04T10:22:33Z

+    model_config = ConfigDict(extra="ignore")
+
+    chunking: ChunkerConfig = Field(default_factory=ChunkerConfig)
+    parsing_strategy: str = "marker"  # "pymupdf" | "marker" | "docling"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail ls rg -n "class IndexationPipelineConfig|parsing_strategy|contextualization_mode" openrag/core/config/indexation_pipeline.py nl -ba openrag/core/config/indexation_pipeline.py | sed -n '1,120p'

Repository: linagora/openrag

Length of output: 614

🏁 Script executed:

#!/bin/bash set -euo pipefail python3 - <<'PY' import pathlib p = pathlib.Path("openrag/core/config/indexation_pipeline.py") text = p.read_text(encoding="utf-8").splitlines() for i in [15,16,17,18,19,20,21,22,23,24,25,29,30,31,32,33,34,35]: if 1 <= i <= len(text): print(f"{i:4d}: {text[i-1]}") PY rg -n "parsing_strategy|contextualization_mode" openrag | head -n 200

Repository: linagora/openrag

Length of output: 1089

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show rest of the config file (bounded) python3 - <<'PY' import pathlib p = pathlib.Path("openrag/core/config/indexation_pipeline.py") lines = p.read_text(encoding="utf-8").splitlines() for start in [1,1,40,60,80]: end = min(len(lines), start+120) PY # Print the whole file if it's small enough; otherwise print key sections python3 - <<'PY' import pathlib p = pathlib.Path("openrag/core/config/indexation_pipeline.py") lines = p.read_text(encoding="utf-8").splitlines() print("lines:", len(lines)) PY # Show sections that mention these fields rg -n "parsing_strategy|contextualization_mode" openrag/core/config/indexation_pipeline.py echo "---- context around parsing_strategy ----" rg -n "parsing_strategy" -C 3 openrag/core/config/indexation_pipeline.py echo "---- context around contextualization_mode ----" rg -n "contextualization_mode" -C 3 openrag/core/config/indexation_pipeline.py echo "---- find usages across openrag ----" rg -n "parsing_strategy" openrag | head -n 200 rg -n "contextualization_mode" openrag | head -n 200

Repository: linagora/openrag

Length of output: 1215

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "IndexationPipelineConfig" openrag | head -n 200 echo "---- search for parsing strategy keywords ----" rg -n "\"pymupdf\"|\"marker\"|\"docling\"" openrag | head -n 200 rg -n "\bdocling\b|\bpymupdf\b|\bmarker\b" openrag | head -n 200 echo "---- search for contextualization mode keywords ----" rg -n "\"none\"|\"simple\"|\"structured\"" openrag | head -n 200 rg -n "\bcontextualization_mode\b|\benable_contextualization\b|\bcontextualization\b" openrag | head -n 200

Repository: linagora/openrag

Length of output: 7459

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "indexation\.parsing_strategy|parsing_strategy\s*=" openrag | head -n 200 rg -n "indexation\.contextualization_mode|contextualization_mode\s*=" openrag | head -n 200 rg -n "\.indexation\." openrag | head -n 200 rg -n "IndexationPipelineConfig\(" -S openrag | head -n 200 rg -n "preset\.indexation|indexation\s*:" openrag/core | head -n 200

Repository: linagora/openrag

Length of output: 42

🏁 Script executed:

#!/bin/bash set -euo pipefail # Find any mention of the config keys anywhere rg -n "parsing_strategy" . | head -n 200 rg -n "contextualization_mode" . | head -n 200 # Look for dynamic/dict-style access patterns rg -n "get\(['\"]parsing_strategy['\"]\)|\[['\"]parsing_strategy['\"]\]" . | head -n 200 rg -n "get\(['\"]contextualization_mode['\"]\)|\[['\"]contextualization_mode['\"]\]" . | head -n 200 # Look for any selection/dispatch based on these values by scanning for the value literals rg -n "\"marker\"|\"pymupdf\"|\"docling\"" openrag | head -n 200 rg -n "\"none\"|\"simple\"|\"structured\"" openrag/core/indexing openrag/core/config openrag/core/prompts openrag/core/chunking 2>/dev/null | head -n 200

Repository: linagora/openrag

Length of output: 731

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Search for direct attribute usage patterns rg -n "\.parsing_strategy\b|parsing_strategy\s*\]" openrag | head -n 200 rg -n "\.contextualization_mode\b|contextualization_mode\s*\]" openrag | head -n 200 # 2) Search for dict-style access patterns in code rg -n "(\.get\(|\[\s*['\"])" openrag | rg -n "parsing_strategy|contextualization_mode" | head -n 200 # 3) Search for dispatch based on parser registry / parser names rg -n "parser_registry|register\(" openrag/core/indexing openrag/services openrag/core | head -n 200 rg -n "select.*parser|get_parser|create_.*parser|parser\s*=\s*|parser_name" openrag/core openrag/services | head -n 200 # 4) Search for the config object usage in the indexing pipeline code rg -n "IndexationPipelineConfig|preset\.indexation|enable_contextualization" openrag | head -n 200

Repository: linagora/openrag

Length of output: 42

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate runtime entry points for the indexation pipeline config rg -n "IndexationPipelineConfig|indexation\.|preset\.indexation|enable_contextualization" openrag | head -n 200 # Show preset model definition python3 - <<'PY' import pathlib p = pathlib.Path("openrag/core/models/preset.py") lines = p.read_text(encoding="utf-8").splitlines() for i,l in enumerate(lines, start=1): if i<=120: pass print("lines:", len(lines)) for i in range(1,120): if i<=len(lines): print(f"{i:4d}: {lines[i-1]}") PY # Show config model again for context python3 - <<'PY' import pathlib p = pathlib.Path("openrag/core/config/indexation_pipeline.py") lines = p.read_text(encoding="utf-8").splitlines() print("lines:", len(lines)) for i in range(1,len(lines)+1): print(f"{i:4d}: {lines[i-1]}") PY # Find where presets are turned into an actual indexing job/pipeline rg -n "indexation|indexing" openrag/core | head -n 200

Repository: linagora/openrag

Length of output: 7196

🏁 Script executed:

#!/bin/bash set -euo pipefail # Print key modules that likely consume IndexationPipelineConfig python3 - <<'PY' import pathlib paths = [ "openrag/core/indexing/dispatcher.py", "openrag/core/indexing/validators.py", "openrag/core/config/presets.py", "openrag/core/config/root.py", "openrag/core/config/indexation.py", ] for path in paths: p = pathlib.Path(path) if not p.exists(): print(f"Missing: {path}") continue print("\n====", path, "====") lines = p.read_text(encoding="utf-8").splitlines() for i, l in enumerate(lines[:220], start=1): print(f"{i:4d}: {l}") if len(lines) > 220: print(f"... ({len(lines)} lines total; truncated)") PY # Broader searches for camelCase / dynamic JSON keys that might be used by workers/clients rg -n "parsingStrategy|contextualizationMode|contextualization_mode|parsing_strategy" openrag | head -n 200 rg -n "docling|pymupdf|marker" openrag/core/indexing openrag/services/workers | head -n 200 # Look for config keys being forwarded as dict/JSON rg -n "model_dump\\(|dict\\(|json\\.|config\\s*\\[|get\\(|\\\"parsing_strategy\\\"|\\\"contextualization_mode\\\"" openrag/core openrag/services | head -n 200

Repository: linagora/openrag

Length of output: 30339

🏁 Script executed:

#!/bin/bash set -euo pipefail python3 - <<'PY' from pathlib import Path p = Path("tests/unit/core/config/test_pipeline_configs.py") lines = p.read_text(encoding="utf-8").splitlines() for i in range(1,80): if i<=len(lines): print(f"{i:4d}: {lines[i-1]}") PY echo "---- literal key usage ----" rg -n "\"parsing_strategy\"|'parsing_strategy'|\"contextualization_mode\"|'contextualization_mode'" openrag tests docs | head -n 200 echo "---- search for contextualization dispatch ----" rg -n "contextualize|contextualization" openrag/core/indexing openrag/core/chunking openrag/core/prompts | head -n 200 rg -n "structured|simple|none" openrag/core/indexing/contextualize.py openrag/core | head -n 200 echo "---- search for parser dispatch ----" rg -n "parser_registry|register\(|Parser|parse\(" openrag/core/indexing/parsers | head -n 200 rg -n "pymupdf|marker|docling" openrag/core/indexing | head -n 200

Repository: linagora/openrag

Length of output: 22956

🏁 Script executed:

#!/bin/bash set -euo pipefail python3 - <<'PY' from pathlib import Path p = Path("openrag/core/config/indexation_pipeline.py") for i,l in enumerate(p.read_text(encoding="utf-8").splitlines()[:40], start=1): print(f"{i:4d}: {l}") PY

Repository: linagora/openrag

Length of output: 1651

Constrain enum-like indexation preset fields to supported values

IndexationPipelineConfig.parsing_strategy and IndexationPipelineConfig.contextualization_mode are currently typed as str, so Pydantic will accept any string during config deserialization. Constrain them with Literal[...] to fail fast on invalid config values.

Proposed change

+from typing import Literal + class IndexationPipelineConfig(BaseModel): @@ - parsing_strategy: str = "marker" # "pymupdf" | "marker" | "docling" + parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker" @@ - contextualization_mode: str = "none" # "none" | "simple" | "structured" + contextualization_mode: Literal["none", "simple", "structured"] = "none"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

parsing_strategy: str = "marker" # "pymupdf" | "marker" | "docling"

from typing import Literal

class IndexationPipelineConfig(BaseModel):

parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker"

contextualization_mode: Literal["none", "simple", "structured"] = "none"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/core/config/indexation_pipeline.py` at line 21, IndexationPipelineConfig currently types parsing_strategy and contextualization_mode as plain str which allows any value; change their annotations to typing.Literal with the supported options (e.g., parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker" and contextualization_mode: Literal[...supported values...] = <current_default>) and import Literal from typing (or typing_extensions for older Python) so Pydantic validation will reject invalid config values; keep the existing default values and update any tests or callers expecting str if necessary.

coderabbitai · 2026-06-04T10:22:33Z

+    type: str = "single"  # "single" | "multiQuery" | "hyde"
+    reranker: str | None = None  # endpoint name; None = use default
+    llm: str | None = None  # endpoint name for multiQuery / hyde expansion
+
+    top_k: int = Field(default=50, gt=0, le=1000)  # vector candidates fetched
+    top_n: int = Field(default=10, gt=0, le=1000)  # final results after reranking
+    enable_reranker: bool = True
+    similarity_threshold: float = 0.6
+    include_related: bool = True


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constrain retrieval mode and threshold at schema level.

type is documented as a closed set but currently accepts any string, and similarity_threshold accepts out-of-range values. This lets invalid configs pass and fail later at runtime.

Suggested fix

+from typing import Literal from pydantic import BaseModel, ConfigDict, Field @@ - type: str = "single" # "single" | "multiQuery" | "hyde" + type: Literal["single", "multiQuery", "hyde"] = "single" @@ - similarity_threshold: float = 0.6 + similarity_threshold: float = Field(default=0.6, ge=0.0, le=1.0)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/core/config/retrieval_pipeline.py` around lines 18 - 26, The schema currently allows any string for the retrieval mode and an unconstrained similarity_threshold; update the model so "type" is a Literal restricted to "single", "multiQuery", or "hyde" (import typing.Literal and replace the current type: str) and tighten similarity_threshold to a validated Field with bounds (ge=0, le=1) so out-of-range values are rejected; keep existing Field constraints for top_k/top_n and leave reranker/llm as optional strings.

coderabbitai · 2026-06-04T10:22:33Z

+    preset_type: str  # "indexation" | "retrieval"
+    config: dict[str, Any]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add invariant validation on persisted partition/preset fields.

These models currently accept invalid states (e.g., unknown preset_type, negative chat_history_depth, non-positive dimension) that can leak from DB into runtime resolution.

Suggested fix

from datetime import datetime -from typing import Any +from typing import Any, Literal @@ -from pydantic import BaseModel +from pydantic import BaseModel, Field @@ - preset_type: str # "indexation" | "retrieval" + preset_type: Literal["indexation", "retrieval"] @@ - dimension: int = 1024 + dimension: int = Field(default=1024, gt=0) @@ - chat_history_depth: int = 0 + chat_history_depth: int = Field(default=0, ge=0) @@ - chat_history_depth: int = 0 + chat_history_depth: int = Field(default=0, ge=0)

Also applies to: 32-35, 53-54

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/core/models/preset.py` around lines 17 - 18, Add invariant validation for persisted partition/preset fields so invalid states cannot be loaded from the DB: enforce that preset_type (in the Preset model) is one of the allowed values ("indexation" or "retrieval"), that numeric config fields like chat_history_depth are non-negative, and that dimension (or any vector dimension fields) is a positive integer; implement these checks in the model validation/constructor for Preset and the related Partition/Config classes referenced around the same file (lines near the current preset_type/config definitions and the blocks at the other noted ranges) so any invalid persisted record raises a clear validation error instead of leaking into runtime.

On low-resource machines the Milvus 2.6 gRPC handshake can exceed the default 10-second channel-ready wait, causing pymilvus to throw MilvusException (code=2, "Fail connecting to server … illegal connection params or server unavailable"). The underlying TCP port is open and the HTTP health endpoint responds OK — the channel just needs more time to complete the gRPC negotiation. - Pass `timeout=60` to both `MilvusClient` and `AsyncMilvusClient` at construction time in `MilvusVectorStore.__init__`. - Store the value in `self._timeout` so any future callers have a single source of truth. Root-cause confirmed by reproducing the error with the default timeout and a successful connect with timeout=60 against the same Milvus 2.6.11 container.

… migration - openrag/services/persistence/schema.py: add model_endpoints and pipeline_presets tables, extend partitions with 9 new columns, add indexation_config to files - openrag/services/persistence/migrations/alembic/versions/06dd2101ea3a_add_endpoints_presets_phase14.py: idempotent migration with composite PKs, JSONB defaults, check constraints Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…handler logs - Fix migration 06dd2101ea3a: `index_exists()` called with 1 arg instead of the required 2 (`table`, `index`), crashing Alembic at startup and causing all endpoints to return 500 (container degraded to None) - Add try/except in AuthMiddleware.dispatch() so a None container returns JSONResponse(503) instead of propagating AttributeError through ServerErrorMiddleware to a 500 - Add `message`, `method`, and `path` fields to both error handler log calls so failing requests are diagnosable without a full traceback Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…t/partition repos - PgModelEndpointRepository: 7 methods (create, get, list_all, update, rename, delete, set_default) with atomic transaction for set_default - PgPresetRepository: 5 methods (get, list_all, upsert, delete, count_partitions_using) with ON CONFLICT upsert and safe column dispatch - PgPartitionRepository: 3 new methods (get_partition_row, list_partition_rows, update_partition) + _row_to_full_dict helper - Extend PartitionRepository port with get_partition_row, list_partition_rows, update_partition abstract methods - Extend PresetRepository port with count_partitions_using abstract method - 18 new unit tests covering all new methods with fake pool/conn pattern Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- seed_defaults(): reads Settings + env-var overrides (EMBEDDER_ENDPOINT, LLM_ENDPOINT, VLM_ENDPOINT, RERANKER_ENDPOINT and model counterparts), inserts one is_default=True row per type when the DB is empty for that type - load_all(): fetches all rows, rebuilds config.models.{embedder,llm,vlm, reranker} dicts atomically via clear()+update(); injects virtual "default" alias pointing to the is_default=True row per type - create_model_endpoint(): validates model_type, raises 409 on duplicate - update_model_endpoint(): updates fields and/or renames, evicts stale client-cache entry, reloads config.models - delete_model_endpoint(): guards last-endpoint-of-type, evicts cache - set_default(): swaps is_default flag, evicts "default" cache alias - validate_endpoint(): async HTTP probe of {url}/models, returns reachable + model_found + models_served dict - 17 unit tests covering all methods

- PresetService with seed_defaults(), load_all(), create_preset(), update_preset(), delete_preset(), get_preset(), list_presets() - 6 hardcoded default seeds: 3 indexation (default, legal, finance) and 3 retrieval (default, multiquery, hyde) - Pydantic validation via IndexationPipelineConfig / RetrievalPipelineConfig - Atomic in-place swap of config.presets.indexation / .retrieval dicts - Back-reference to PartitionService.load_partitions() on preset update - Referential integrity guard: raises 422 if partition(s) use the preset - 17 unit tests, all passing

- add resolve_partition_row() to map preset names to validated PartitionConfig - add load_partitions() with atomic clear()+update() swap into config.partitions - add seed_default_partition() for idempotent default-partition bootstrap - extend create_partition() with preset/config columns + pre-write validation - add update_partition() with merged-row validation and cache re-resolution - accept optional config in constructor; wire settings in DI container

feat(14G) injected config into the container's PartitionService, which made create_partition validate presets against config.presets at runtime. But nothing seeds that registry at startup yet (that is 14K), so every partition create returned [CONFIG_ERROR]: preset 'default' not found. Drop the container wiring so PartitionService runs with config=None — the legacy passthrough — until 14K seeds the registries on boot and 14J's per-partition retrieval path is complete. The resolution methods and their unit tests stay; they are exercised directly with a config.

Complete the Phase 14 composition-root wiring so the endpoint and preset registries are available during application startup. The container now seeds and loads model endpoints before presets, then resolves partition configs after both registries are populated. This closes the runtime gap where the Phase 14 admin API surface existed but service-backed routes could not use the real services. The partition service is also built with Settings again, so 14J can consume resolved per-partition configs once startup has loaded them.

coderabbitai Bot added feat Add a new feature refactor labels Jun 4, 2026

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread openrag/api/schemas/admin/partition_schemas.py Outdated

hedhoud changed the title ~~Add Phase 14 admin API layer~~ refactor(phase 14H): add admin APIs for model endpoints, presets, and partition config Jun 4, 2026

hedhoud force-pushed the refactor/phase14-B branch from e596cac to 81fe53c Compare June 4, 2026 09:33

hedhoud added 6 commits June 4, 2026 11:46

refactor(phase 14H.0): add model endpoint schemas and admin router

1c40e47

refactor(phase 14H.1): add preset API schemas

991288b

refactor(phase 14H.2): add preset management admin router

13caed4

refactor(phase 14H.3): extend partition schemas with preset support

2e0bb0c

refactor(phase 14H.4): add partition config admin routes

a3d547d

refactor(phase 14H.5): register phase 14 admin routers

c8f8b61

hedhoud force-pushed the refactor/phase14-B branch from 81fe53c to c8f8b61 Compare June 4, 2026 09:48

Ahmath-Gadji and others added 2 commits June 4, 2026 12:06

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

Ahmath-Gadji and others added 2 commits June 4, 2026 12:43

Ahmath-Gadji mentioned this pull request Jun 4, 2026

fix(milvus): raise gRPC channel-ready timeout to 60 s #443

Closed

2 tasks

hedhoud changed the title ~~refactor(phase 14H): add admin APIs for model endpoints, presets, and partition config~~ refactor(phase 14): Model Endpoints Registry and Per-Partition Presets Jun 4, 2026

Ahmath-Gadji and others added 5 commits June 4, 2026 14:42

refactor(14I): wire named component factories

c4417de

refactor(14J): resolve pipelines from partition config

97545f2

style(14J): format pipeline resolution changes

8792945

Ahmath-Gadji force-pushed the refactor/phase14-B branch 2 times, most recently from de28be1 to 80e28e7 Compare June 5, 2026 10:09

Ahmath-Gadji added 2 commits June 5, 2026 13:52

Ahmath-Gadji force-pushed the refactor/phase14-B branch from f732c1f to ade9c1d Compare June 5, 2026 11:58

hedhoud force-pushed the refactor/phase14-B branch from f711db7 to 44d5c16 Compare June 5, 2026 13:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(phase 14): Model Endpoints Registry and Per-Partition Presets#444

refactor(phase 14): Model Endpoints Registry and Per-Partition Presets#444
hedhoud wants to merge 20 commits into
refactor/hexagonalfrom
refactor/phase14-B

hedhoud commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 4, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-    parsing_strategy: str = "marker"  # "pymupdf" | "marker" | "docling"
+from typing import Literal
+class IndexationPipelineConfig(BaseModel):
+    parsing_strategy: Literal["pymupdf", "marker", "docling"] = "marker"
+    contextualization_mode: Literal["none", "simple", "structured"] = "none"

		preset_type: str # "indexation" \| "retrieval"
		config: dict[str, Any]

Conversation

hedhoud commented Jun 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 4, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

hedhoud commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading