fix: ensure database cleanup precedes vector store deletion (#378, #379) #448

Open
EnjoyBacon7 wants to merge 1 commit into refactor/hexagonal from fix/378-379-delete-order

@EnjoyBacon7
Collaborator

Summary

Fixes data divergence issues where Milvus and Postgres get out of sync during delete operations.

Problem

Solution

Reorder delete operations so that relational database cleanup happens first, followed by vector store deletion.

Changes

dispatcher.delete_file (Issue #378)

  • Remove file from workspaces (transaction)
  • Remove file from document_repo (transaction)
  • Query and delete chunks from Milvus
  • Error handling: if Milvus delete fails, log for reconciliation task
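
The ordering above can be sketched as follows. This is a minimal illustration, not the code in `dispatcher.py`: the `remove_file` and `delete_chunks` method names, the synchronous signature, and the boolean return value are all assumptions, and the per-step transactions are not modeled.

```python
import logging
from typing import Protocol

logger = logging.getLogger("delete_file_sketch")


class FileStore(Protocol):
    """Hypothetical interface for the workspace and document stores."""

    def remove_file(self, file_id: str) -> None: ...


class VectorStore(Protocol):
    """Hypothetical interface for the Milvus-backed chunk store."""

    def delete_chunks(self, file_id: str) -> None: ...


def delete_file(
    file_id: str,
    workspaces: FileStore,
    document_repo: FileStore,
    vector_store: VectorStore,
) -> bool:
    """Return True if both stores were cleaned, False if chunks were orphaned."""
    # Relational cleanup comes first. Any exception here propagates to the
    # caller, so the vector store is never touched when the database fails.
    workspaces.remove_file(file_id)
    document_repo.remove_file(file_id)

    # Vector cleanup comes last. A failure is logged with the file id so a
    # later reconciliation pass can find the orphaned chunks.
    try:
        vector_store.delete_chunks(file_id)
    except Exception:
        logger.exception(
            "vector store delete failed; chunks need reconciliation",
            extra={"file_id": file_id},
        )
        return False
    return True
```

Swallowing the vector store exception is a deliberate choice in this sketch: once the database rows are gone, re-raising would report a failed delete for a file that no longer exists in the authoritative store.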

partition_service.delete_partition (Issue #379)

  • Delete partition from database (transaction)
  • Query and delete chunks from Milvus
  • Error handling: if Milvus delete fails, log for reconciliation task
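
To illustrate the "(transaction)" step, here is a hedged sketch that uses the standard-library `sqlite3` module as a stand-in for Postgres. The table and column names (`partitions`, `files`, `partition_name`) and the `delete_vectors` callable are hypothetical and not taken from `partition_service.py`.

```python
import logging
import sqlite3
from typing import Callable

logger = logging.getLogger("delete_partition_sketch")


def delete_partition(
    conn: sqlite3.Connection,
    partition: str,
    delete_vectors: Callable[[str], None],
) -> bool:
    """Delete database rows in one transaction, then the vectors."""
    # `with conn` commits if the body succeeds and rolls back if it raises,
    # so a database failure leaves both stores unchanged.
    with conn:
        conn.execute("DELETE FROM files WHERE partition_name = ?", (partition,))
        conn.execute("DELETE FROM partitions WHERE name = ?", (partition,))

    # Only reached after the commit. A vector store failure is logged rather
    # than re-raised, because the authoritative rows are already gone.
    try:
        delete_vectors(partition)
    except Exception:
        logger.exception(
            "vector store delete failed; chunks need reconciliation",
            extra={"partition": partition},
        )
        return False
    return True
```

Keeping the vector store call outside the `with conn` block means the transaction is not held open during a slow or failing network call.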

Data Consistency Guarantee

  • Success case: both the database and Milvus are cleaned up.
  • DB succeeds, Milvus fails: the file/partition is gone from the DB (authoritative), and the orphaned chunks are logged for reconciliation.
  • DB fails: the transaction rolls back and both stores remain unchanged.
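
The three cases can be written down as a small decision function. The `Outcome` enum and `classify` helper are hypothetical names used only for illustration. The point of the sketch is that, with the database step first, no reachable outcome leaves database rows pointing at chunks that have already been deleted.

```python
from enum import Enum


class Outcome(Enum):
    CLEAN = "database and vector store both cleaned"
    ORPHANED = "database cleaned, vector chunks left behind and logged"
    UNCHANGED = "database rolled back, vector store never touched"


def classify(db_ok: bool, vector_ok: bool) -> Outcome:
    """Map the result of each step to the resulting state of the two stores."""
    if not db_ok:
        # The vector delete is never attempted, so `vector_ok` is irrelevant.
        return Outcome.UNCHANGED
    return Outcome.CLEAN if vector_ok else Outcome.ORPHANED
```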

Test Coverage

✅ Added 4 new regression tests:

  • test_delete_file_cleans_database_before_vector_store - verifies operation order
  • test_delete_file_logs_error_if_vector_store_delete_fails - error handling
  • test_delete_partition_deletes_rows_before_vectors - verifies operation order
  • test_delete_partition_logs_error_if_vector_store_delete_fails - error handling
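
One way to assert operation order in a unit test is to hang both collaborators off a single parent `Mock`, whose `mock_calls` list records child calls in the order they happen. The sketch below is not the test code from this PR: `delete_file` is a simplified stand-in, and the collaborator method names are assumptions.

```python
from unittest.mock import Mock, call


def delete_file(file_id, document_repo, vector_store):
    """Simplified stand-in for the function under test."""
    document_repo.remove_file(file_id)
    vector_store.delete_chunks(file_id)


def test_database_cleaned_before_vector_store():
    # Child mocks created from one parent share its call log, which makes
    # the relative order of calls on different collaborators observable.
    parent = Mock()
    delete_file("f1", parent.document_repo, parent.vector_store)
    assert parent.mock_calls == [
        call.document_repo.remove_file("f1"),
        call.vector_store.delete_chunks("f1"),
    ]
```

Two independent mocks would each confirm that they were called, but could not show which was called first.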

✅ All 1049 unit tests pass
✅ CI: API Tests ✅, Unit Tests ✅, Integration Tests ✅, Layer Guard ✅

Files Modified

  • openrag/services/workers/dispatcher.py
  • openrag/services/orchestrators/partition_service.py
  • tests/unit/services/workers/test_dispatcher.py
  • tests/unit/services/orchestrators/test_partition_service.py

Fixes #378 #379

…t divergence (#378, #379)

Reorder delete operations so relational data is cleaned up before vector store
deletion. If vector store delete fails after database cleanup succeeds, the data
model remains consistent (data is gone from database, orphaned chunks logged for
reconciliation).

Changes:
- dispatcher.delete_file: remove from workspaces/database first, then Milvus
- partition_service.delete_partition: delete from database first, then Milvus
- Add error handling and structured logging for reconciliation tasks
- Add regression tests verifying correct operation order and error handling

Fixes #378 (replace-file keeps old chunks if PG fails)
Fixes #379 (partition and file deletes diverge Milvus from PG)
@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Warning

Review limit reached

@EnjoyBacon7, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 3 minutes and 49 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 629786ef-9e02-46a6-b6b9-1dd2228ffa17

📥 Commits

Reviewing files that changed from the base of the PR and between e8a5580 and 3141f5c.

📒 Files selected for processing (4)
  • openrag/services/orchestrators/partition_service.py
  • openrag/services/workers/dispatcher.py
  • tests/unit/services/orchestrators/test_partition_service.py
  • tests/unit/services/workers/test_dispatcher.py