Refactor/hexagonal#441
Conversation
Previously the loop body used '=' so each iteration overwrote 's', returning only the final page. Switch to '+=' so all pages are concatenated as intended. Fixes #376
doc_metadata.get('source') returns None for older partitions, web-search
results, or chunks from loaders that don't set the field. Path(None)
raises TypeError, which surfaces as a 500 and takes down the entire
chat completion response. Default the missing key to an empty string
before constructing the Path.
Fixes #370
is_bypass_path used path.startswith('/chainlit'), which has no boundary,
so /chainlitevil, /chainlit-spoof, and any future route sharing the
prefix bypassed authentication. Require an exact match or a trailing
slash so only the actual Chainlit subtree skips auth.
Fixes #359
_BaseWebSearchConfig.fetch_verify_ssl defaulted to False, which made ContentFetcher pass verify=False to httpx.AsyncClient out of the box. Every server-side fetch of a web search result skipped TLS verification, leaving citation contents open to silent man-in-the-middle tampering. Flip the default to True. Operators who legitimately need to allow self-signed hosts can still opt out via the existing config knob. Fixes #382
The verifier only checked 'client_id in aud' when aud was a JSON array. Per OIDC Core 1.0 §3.1.3.7, a token whose aud lists multiple audiences must also contain an azp claim equal to the RP's client_id. Without that check, a token issued by the same IdP for a sibling client that happens to include this client in its audience list was accepted, letting one tenant's tokens authenticate against another tenant's sessions. Fixes #385
When OIDC_CLAIM_SOURCE=userinfo, _apply_claim_mapping fetched claims from /userinfo and fed them straight into the user-update write path. OIDC Core 1.0 §5.3.2 requires the RP to verify userinfo.sub == id_token.sub before trusting any userinfo claim. Without that check, a malformed userinfo response or a token-substitution attack at the userinfo endpoint could rewrite the matched user's display_name and email on every login. Fixes #386
_ensure_admin_user / UserRepository.ensure_admin_user generated a fresh
'or-' token whenever AUTH_TOKEN was unset and unconditionally wrote it
to the existing admin row. Every container restart therefore overwrote
the value previously stored, including any token issued via
POST /users/{id}/regenerate_token for user id 1, silently invalidating
all admin clients and CI integrations with no audit trail.
Now:
- A brand-new admin row still gets a freshly generated token if no
AUTH_TOKEN is provided.
- If AUTH_TOKEN is set, it's written to the existing row to keep DB
state in sync.
- If AUTH_TOKEN is unset and the admin row already exists, the stored
token is left untouched; a warning is emitted in the legacy path
explaining how to pin a static token.
Fixes #361
routers/utils.py.validate_file_format forwarded file.filename (typed str | None) straight into core_validators.validate_file_format, which then did 'if "." in filename'. A multipart part with no filename header surfaced as TypeError → 500, masking what was a simple client-side mistake. Reject the missing filename with a 422 instead. Fixes #368
…ition_member add_partition_member auto-created a missing Partition row with whatever (user_id, role) the caller supplied. Calling add_partition_member(partition, user, role='editor') on a fresh name yielded a partition with no owner, which made every endpoint guarded by require_partition_owner permanently inaccessible to non-admins. Only allow the auto-create branch when the first member is being added as 'owner'. Other roles now raise ValueError, which the legacy actor shim returns to callers. The Phase 8 PartitionService.add_member path already enforces partition existence via _ensure_partition, so this only tightens the legacy code path that the actor still exposes. Fixes #387
add_file_to_partition built PartitionMembership(partition_name=..., user_id=user_id, role='owner') inside the partition auto-create branch with no guard that user_id is not None. PartitionMembership.user_id is NOT NULL, so calling add_file_to_partition(..., user_id=None, ...) (the default in the function signature) raised IntegrityError at commit time and left the partition row half-created in the transaction's prior state. Validate user_id at the top of the auto-create branch and raise a clean ValueError so the caller can translate it to a proper HTTP 400. Fixes #371
…ions _gen_chunk_order_metadata derived section IDs as [time.time_ns() + i for i in range(n)]. The Indexer Ray actor allows concurrent add_file calls, so two ingestions starting within the same nanosecond window produced overlapping integer ranges. Surrounding chunk navigation through prev_section_id / next_section_id then crossed document boundaries, returning chunks from a different file when callers requested context expansion. Use a cryptographically random 60-bit base so two concurrent ingestions can never produce overlapping ranges. Birthday-collision probability over the 60-bit space at realistic ingestion volumes is effectively zero. The prev/next navigation only requires ordering *within* a single call, so the lost wall-clock-monotonic property across calls is fine. Fixes #356
The .eml loader's attachment processing path resolved each attachment's extension via self.loader_classes.get(...), instantiated the resulting class, and called aload_document with no depth counter and no visited guard. A nested chain of .eml-in-.eml attachments therefore caused stack overflow, file-descriptor exhaustion, and finally a confusing crash long after the indexer had consumed significant resources. Thread a _eml_recursion_depth keyword through aload_document, stop descending before reaching MAX_EML_RECURSION_DEPTH (5), and annotate the partial output so operators see exactly where the chain was cut. Fixes #364
……] tag stream_with_source_filtering defaulted buffer_size to 100 chars. The LLM's [Sources: 1, 2, …, N] tag at the end of a response easily exceeds 100 chars when many sources are cited, in which case its leading characters are flushed to the client before the trailing-anchored regex can match. Users then see raw [Sources: ...] text in the response and citation filtering falls back to all sources instead of the cited subset. Compute the buffer size from the maximum possible tag length (every source cited, with wrapping characters and a margin for whitespace) and apply it dynamically per request, while still allowing callers to override explicitly. Existing tests that pin buffer_size implicitly continue to work because the new default is >= 100. Fixes #390
sanitize_next_url accepted any value starting with '/' and not '//', which let through path-relative inputs like '/\\evil.com/path' (Chrome / Edge parse as host-relative to evil.com) and CRLF-containing values that flowed into the Location header on the OIDC redirect. Both are open-redirect / response-splitting vectors against authenticated sessions. Strip is replaced by reject-on-presence (any CR/LF/NUL triggers fallback), the second-character check rejects '/' and '\', and the remaining path is re-parsed with urlparse to assert empty scheme and netloc before being returned. Fixes #360
app_front.py resolved CHAINLIT_AUTH_SECRET via os.environ.get with 'default_secret_for_openrag_ui' as the fallback (and the warning that previously flagged this was commented out). Any deployment that forgot to set the env var signed Chainlit session JWTs with a globally known, source-published secret, which let anyone forge a valid session for any user without needing the IdP or a real token. Drop the literal default and raise on startup if the variable is unset. The error message tells the operator exactly how to generate one. Fixes #380
…RWARDED_ALLOW_IPS
uvicorn.run set proxy_headers=True but left forwarded_allow_ips at its
default ('127.0.0.1'), so any reverse proxy that does not share the
loopback was discarded along with its X-Forwarded-Proto. As a result,
request.url.scheme stayed 'http', _is_request_secure returned False,
and the OIDC openrag_session and state cookies were written with
Secure=False even on real HTTPS deployments — exposing them to
plaintext exfiltration.
Read the trusted-proxy CIDR list from UVICORN_FORWARDED_ALLOW_IPS,
default it to '127.0.0.1' to preserve the current docker-compose
behavior, and log the value at startup so operators notice when it's
wrong.
Fixes #384
users.external_user_id is unique and nullable. Postgres treats NULL as distinct under UNIQUE, so multiple users with no external id are fine — but an empty string '' is a regular value and the second insert raises UniqueViolation. The fix is to coerce '' (after strip) to NULL on the insert path so accidental empty strings don't break user creation. Fixes #121
get_file_ancestors builds a recursive CTE walking parent_id upward. When max_ancestor_depth=None the recursion had no depth limit and no visited set, so a self-referential or cyclic parent_id chain (A->B->A) looped until the database aborted the query, hanging the request and tying up a Postgres backend. Apply a hard ceiling (1000) regardless of the caller-supplied value: None falls back to the ceiling, and explicit values are clamped to the same constant so a misconfigured caller can't bypass it. The cap is deliberately well above realistic deep document hierarchies. Applied to both the legacy sync helper in vectordb/utils.py and the new async PgDocumentRepository. Fixes #369
… add mismatch test
refactor(12A): move logger + monitoring to core/, repoint imports
Move Dockerfile -> infra/docker/api.Dockerfile and Dockerfile.ray -> infra/docker/ray.Dockerfile, and entrypoint.sh -> infra/scripts/. The build context stays at the repo root, so the COPY for entrypoint becomes infra/scripts/entrypoint.sh and a build-context comment is added to each Dockerfile.
Relocate docker-compose.yaml, .env.example, .env.ollama, the Grafana and Prometheus configs (from openrag_metrics/) and milvus.yaml (from vdb/) under infra/compose/. The standalone metrics stack becomes monitoring.docker-compose.yaml. Update the compose file's internal paths: build context -> ../.. with dockerfile infra/docker/api.Dockerfile, the milvus include -> milvus/milvus.yaml, extern includes and bind-mount sources -> ../../ to resolve from the new location.
Relocate ansible/ -> infra/ansible/, charts/ -> infra/charts/ and quick_start/ -> infra/quick_start/. quick_start is self-contained (its own vdb/ and extern/ subtrees), so its internal relative paths are unaffected.
Relocate the prompt templates from prompts/example1/*.txt to the package at openrag/prompts/templates/*.txt and add openrag/prompts/__init__.py, which reads them into DEFAULT_SEEDS (keyed by filename stem) at import time as the first-boot fallback for DB-stored per-partition overrides. The PathsConfig.prompts_dir default now resolves package-relative via __file__, so it works regardless of CWD. Drop the hardcoded prompts_dir from conf/config.yaml and the PROMPTS_DIR overrides from the Helm values and the api-test harness so the bundled default applies; repoint pytest's PROMPTS_DIR to openrag/prompts/templates. Both Dockerfiles no longer COPY prompts/ — the templates ship inside the package via COPY openrag/.
Provide an ergonomic top-level ui/ path for the admin frontend. The indexer-ui submodule stays registered at its .gitmodules path under extern/; ui/ is a relative symlink so it resolves on any checkout.
Exclude infra/ and tests/load/ from ruff, and add a package-data entry so the prompt templates under openrag/prompts/templates/ (read at import time) ship inside the built wheel/sdist rather than being skipped as non-.py files.
Point the build workflows at infra/docker/api.Dockerfile and infra/docker/ray.Dockerfile, the helm workflow at infra/charts/openrag-stack, and the api-test harness build at infra/docker/api.Dockerfile. The integration/unit/lint/layer-guard workflows need no change: they target self-contained compose stacks or paths that did not move.
The Ray cluster config is deployment infrastructure; relocate it alongside the helm charts. The other root-level deployment files (Dockerfile, Dockerfile.ray, docker-compose.yaml, entrypoint.sh, .env.example, .env.ollama) were already relocated when infra/ was created.
Move the standalone data_indexer.py CLI from utility/ to scripts/ and drop the redundant utility/requirements.txt (httpx and loguru are already project dependencies). Repoint the helm .tgz ignore at infra/charts/ after the 13A move.
Add a Project Layout section to CLAUDE.md and point its Docker, prompt template, migration, and test-config references at the new locations. Update the README getting-started flow for infra/compose/.env.example and infra/quick_start, fix the helm/ansible and ray-cluster example paths, and mark PROMPTS_DIR optional in the doc env samples now that templates ship in the package.
…fresh docs The infra/compose/.env.example still set PROMPTS_DIR=../prompts/example1 after the templates moved into the package; copying it to .env (the documented getting-started step) injected a path that resolves to /app/prompts/example1 in the container, which no longer exists, so load_template raised FileNotFoundError and the API crashed on startup. Comment it out and document that the bundled templates are used unless PROMPTS_DIR overrides them. Move the pytest configuration from pytest.ini into pyproject.toml [tool.pytest.ini_options] and delete pytest.ini, preserving testpaths/pythonpath/python_files/env/markers and adding --strict-markers. Asyncio stays in strict mode since the suite uses explicit markers. Repoint the env_vars and milvus_migration docs at openrag/prompts/templates and infra/compose/milvus/milvus.yaml.
- fix stale package paths in CLAUDE.md (components/routers/.hydra_config
-> core/services/api/di + conf/config.yaml); refresh test command docs
- move automatic-evaluation-pipeline/ into tests/load/
- remove orphaned root vdb/ and stale openrag/{components,config,utils,tests}
leftovers; ignore db/, vdb/, coverage artifacts
- add asyncio_mode=auto and --tb=short to pytest config
- add tests/unit/conftest.py (mock ports) and tests/unit/api/conftest.py
(ASGI client) prescribed by the phase 13 plan
- skip the pydub audio test on python 3.13 (audioop removed)
refactor: reorganize project layout for phase 13
|
Important Review skippedToo many files! This PR contains 299 files, which is 149 over the limit of 150. To get a review, narrow the scope: ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (299)
You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Port the FRAMES benchmark eval scripts (PR #305) into the new layout at tests/load/automatic-evaluation-pipeline/. Standalone HTTP clients against the OpenRAG API: - setup_frames.py: download the Wikipedia articles referenced by FRAMES (md/pdf/html) and index them into a partition. - eval_frames.py: run the 824-question benchmark with LLM-judge scoring and fuzzy exact match, in RAG / no-rag / oracle / gold-workspaces modes.
Audit fixes for the ported FRAMES eval pipeline: - setup_frames: include CANCELLED in TERMINAL_STATES and bound the upload status poll with MAX_POLL_SECONDS, so a cancelled or stuck indexing task can no longer spin the poll loop forever. - setup_frames: parse Retry-After as either a number of seconds or an HTTP-date, falling back to backoff instead of raising ValueError and silently dropping the article. - setup_frames: guard a missing task_status_url instead of crashing on None. - both scripts: derive the empty-title fallback filename from a stable md5 hash rather than the per-process-salted built-in hash(), so eval_frames resolves the same name setup_frames wrote (oracle and gold-file matching). - eval_frames: validate MODEL/BASE_URL/API_KEY (and judge variants) up front with a clear message instead of failing opaquely inside ChatOpenAI. - eval_frames: run the OpenRAG health check in default RAG mode too. - eval_frames: tolerate a rejected workspace file attachment (some gold articles may be un-indexed) instead of aborting the whole run.
Pure-logic unit tests (Wikipedia parsing, exact-match scoring, oracle context, filename stability) plus respx-mocked HTTP tests covering every API-calling function. The mocked tests pin the audit fixes: CANCELLED as a terminal upload state, the bounded poll timeout, the missing task_status_url guard, tolerant workspace file attachment, Retry-After date parsing, and LLM-env validation. Includes an opt-in read-only live contract test gated on OPENRAG_SMOKE_URL / OPENRAG_SMOKE_TOKEN.
setup_frames.extract_all_titles fed glued multi-URL strings straight to extract_title_from_url, producing one garbage title and silently dropping the remaining articles — while eval_frames.parse_wiki_links splits them. Rows with several Wikipedia URLs in one wiki_links entry therefore went un-downloaded yet were expected by oracle / gold-workspace modes (logged as missing). Add the same parse_wiki_links (with the embedded-URL splitter) to setup_frames and route extract_all_titles through it, so both scripts resolve an identical article set. Tests assert the two parsers agree, including the glued-URL case.
Drop references to the change history (audit-fix labels, prior implementations, cross-script wording) in favor of objective descriptions of what the code does.
Replace the ordinal 'first run' / 'first time only' wording with the actual trigger (no local cache present).
Structured logging instead of raw prints, boundary-correct limit checks, positive concurrency validation, and --from-results control flow optimization: eval_frames.py: - Replace print calls with logger.info/error - Fix 'if limit:' to 'if limit is not None:' (line 160) - Add _positive_int validator for --concurrency and --judge-concurrency - Restructure main to load dataset only when needed; --from-results now runs before dataset loading to avoid unnecessary cache/network operations setup_frames.py: - Replace print calls with logger.info/error - Fix 'if limit:' to 'if limit is not None:' (line 90) - Add _positive_int validator for --concurrency and --wiki-concurrency Scripts remain standalone HTTP clients (loguru only, no openrag package imports).
…hexagonal feat(eval): add FRAMES benchmark evaluation pipeline
…lling task state
The test was reading /queue/info immediately after DELETE /indexer/task/{task_id},
before the TaskStateManager had a chance to bump the total_cancelled counter
asynchronously. Added wait_for_task_cancellation helper to poll until the task
state reaches CANCELLED, ensuring the counter update happens before assertion.
…t divergence (#378, #379) Reorder delete operations so relational data is cleaned up before vector store deletion. If vector store delete fails after database cleanup succeeds, the data model remains consistent (data is gone from database, orphaned chunks logged for reconciliation). Changes: - dispatcher.delete_file: remove from workspaces/database first, then Milvus - partition_service.delete_partition: delete from database first, then Milvus - Add error handling and structured logging for reconciliation tasks - Add regression tests verifying correct operation order and error handling Fixes #378 (replace-file keeps old chunks if PG fails) Fixes #379 (partition and file deletes diverge Milvus from PG)
No description provided.