Add project and organization foreign keys to async Job model #578

ankit-mehta07 · 2026-02-04T09:30:11Z

Fixes #552

Summary

Added project_id and organization_id foreign key relations to the Job model
Updated job creation logic to persist project and organization context
Added Alembic migration for new foreign key columns

Motivation

Associating async jobs with projects and organizations improves maintainability, traceability, and observability.

Summary by CodeRabbit

New Features
- Document uploads now enforce a configurable 50MB size limit with proper error responses for oversized and empty files.
- Jobs are now associated with projects and organizations for enhanced organizational tracking.
Documentation
- Updated file upload documentation with size restriction specifications and error handling details.
Improvements
- Standardized API endpoint URLs by removing trailing slashes across reset-password and user creation endpoints.

coderabbitai · 2026-02-04T09:31:48Z

📝 Walkthrough

Walkthrough

This PR adds foreign key relationships (project_id, organization_id) to the Job model with Alembic migration, introduces file upload validation with configurable size limits, and standardizes API route paths by removing trailing slashes for consistency.

Changes

Cohort / File(s)	Summary
Database Schema & Model Evolution `backend/app/alembic/versions/043_add_project_org_to_job_table.py`, `backend/app/models/job.py`, `backend/app/crud/jobs.py`	Adds organization_id and project_id as non-nullable foreign keys to Job table with cascade delete and indexes; introduces ORM relationships and updates JobCrud.create() signature to accept these IDs during job creation.
Job Service Integration `backend/app/services/llm/jobs.py`, `backend/app/services/response/jobs.py`	Updates both LLM and response job service start_job() methods to pass project_id and organization_id when creating Job records.
File Upload Validation `backend/app/services/documents/validators.py`, `backend/app/api/routes/documents.py`, `backend/app/core/config.py`, `backend/app/api/docs/documents/upload.md`	Introduces new validate_document_file() utility for file size and emptiness checks (configurable MAX_DOCUMENT_UPLOAD_SIZE_MB, 50MB default), integrated into upload endpoint with 413/422 error handling.
Route Path Standardization `backend/app/api/routes/login.py`, `backend/app/api/routes/private.py`, `backend/app/api/routes/utils.py`	Removes trailing slashes from reset-password, users, and health endpoint paths for consistent routing convention.
Test Coverage `backend/app/tests/api/routes/documents/test_route_document_upload.py`, `backend/app/tests/api/routes/test_login.py`, `backend/app/tests/api/routes/test_private.py`	Adds tests for file upload size limit exceeded (413), empty file (422), and within-limit scenarios; updates existing login and private route tests to match path standardization.

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related PRs

Use sequential Alembic migration versions instead of hashes #479 — Both PRs modify Alembic migration versioning and sequencing, sharing the same migration revision lineage.

Suggested Labels

enhancement, ready-for-review

Suggested Reviewers

kartpop
AkhileshNegi

Poem

🐰 A job now knows its home so well,
With org and project FKs to tell,
Files are validated, sizes constrained,
Routes trimmed clean, no slashes remained,
Organization's offspring, properly named! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Out of Scope Changes check	⚠️ Warning	The PR includes some out-of-scope changes: removing trailing slashes from API routes (/health/, /users/, /reset-password/) unrelated to the Job FK enhancement, and adding file size validation for documents.	Consider separating route normalization and document validation changes into distinct PRs to maintain clear separation of concerns and easier review/reversion if needed.
Docstring Coverage	⚠️ Warning	Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately describes the main objective of the changeset: adding project_id and organization_id foreign keys to the Job model.
Linked Issues check	✅ Passed	The PR fully addresses issue `#552` by adding project_id and organization_id foreign keys to the Job SQLModel, updating job creation logic, creating an Alembic migration, and establishing proper database relationships.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Important

Action Needed: IP Allowlist Update

If your organization protects your Git platform with IP whitelisting, please add the new CodeRabbit IP address to your allowlist:

✨ 136.113.208.247/32 (new)
34.170.211.100/32
35.222.179.152/32

Reviews will stop working after February 8, 2026 if the new IP is not added to your allowlist.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@backend/app/alembic/versions/043_add_project_org_to_job_table.py`:
- Line 19: The Alembic hook functions are missing return type annotations;
update the function signatures for both upgrade and downgrade to include
explicit None return types (e.g., change def upgrade(): to def upgrade() ->
None: and def downgrade(): to def downgrade() -> None:) so they conform to the
project's typing guidelines; ensure you update both the upgrade and downgrade
definitions referenced in this migration file.
- Around line 22-41: The migration adds NOT NULL columns organization_id and
project_id directly which will fail if existing job rows lack values; modify
upgrade() -> None to first add both columns with nullable=True via
op.add_column, perform an explicit backfill (using op.execute to set
organization_id and project_id for existing job rows from their source/defaults
or JOINs to project/organization tables), then call op.alter_column("job",
"organization_id", nullable=False) and op.alter_column("job", "project_id",
nullable=False), and only after that add FK constraints/indexes; also add type
hints def upgrade() -> None: and def downgrade() -> None: and mirror safe
nullable/dropping steps in downgrade().

In `@backend/app/api/routes/private.py`:
- Around line 23-24: The create_user route's return type is incorrect: change
the function signature from "def create_user(user_in: PrivateUserCreate,
session: SessionDep) -> Any" to "-> UserPublic", ensure UserPublic is imported
in this module, and run type checks; additionally, standardize routing by
removing trailing slashes from inconsistent endpoints (e.g., the decorators
currently using "/test-email/" and "/evaluate_models/") so they match the
non-trailing-slash style used by "/users", "/health", and "/reset-password".

In `@backend/app/services/documents/validators.py`:
- Line 53: Update the log call that currently reads logger.info(f"Document file
validated: {file.filename} ({file_size} bytes)") so it uses the function-name
prefix and masks the filename before logging; replace it with a prefixed message
like logger.info(f"[validate_document_file] Document file validated:
{mask_string(file.filename)} ({file_size} bytes)"), referencing the existing
logger.info call, the mask_string helper, and the validate_document_file
function to locate where to change it.

🧹 Nitpick comments (2)

backend/app/api/routes/utils.py (1)

11-16: Inconsistent trailing slash on /test-email/.

While /health was updated to remove the trailing slash, /test-email/ still retains it. Consider updating for consistency with the other route changes in this PR.
Proposed fix
 `@router.post`(
-    "/test-email/",
+    "/test-email",
     dependencies=[Depends(require_permission(Permission.SUPERUSER))],
     status_code=201,
     include_in_schema=False,
 )

backend/app/tests/api/routes/documents/test_route_document_upload.py (1)

329-424: Use a temp-file factory and settings-driven limit in these size tests.

This block hardcodes “50MB” and repeats temp-file creation. Consider a factory fixture to generate sized files and derive limits from settings.MAX_DOCUMENT_UPLOAD_SIZE_MB so tests stay aligned with config changes.

♻️ Suggested refactor

-from typing import Any
+from typing import Any, Callable
...
+@pytest.fixture
+def temp_file_factory() -> Callable[[str, int], Path]:
+    def _make(suffix: str, size_bytes: int) -> Path:
+        with NamedTemporaryFile(mode="wb", suffix=suffix, delete=False) as fp:
+            if size_bytes:
+                fp.seek(size_bytes - 1)
+                fp.write(b"\0")
+            fp.flush()
+            return Path(fp.name)
+    return _make
...
-    def test_upload_file_exceeds_size_limit(
+    def test_upload_file_exceeds_size_limit(
         self,
         db: Session,
         route: Route,
-        uploader: WebUploader,
+        uploader: WebUploader,
+        temp_file_factory: Callable[[str, int], Path],
     ) -> None:
         """Test that files exceeding the size limit are rejected."""
         aws = AmazonCloudStorageClient()
         aws.create()
 
-        # Create a file larger than the 50MB limit
-        # For testing purposes, we'll create a 51MB file
-        with NamedTemporaryFile(mode="wb", suffix=".pdf", delete=False) as fp:
-            # Write 51MB of data (51 * 1024 * 1024 bytes)
-            chunk_size = 1024 * 1024  # 1MB chunks
-            for _ in range(51):
-                fp.write(b"0" * chunk_size)
-            fp.flush()
-            large_file = Path(fp.name)
+        limit_mb = settings.MAX_DOCUMENT_UPLOAD_SIZE_MB
+        large_file = temp_file_factory(".pdf", (limit_mb + 1) * 1024 * 1024)
...
-        assert "Maximum size: 50MB" in error_data["error"]
+        assert f"Maximum size: {limit_mb}MB" in error_data["error"]
...
-    def test_upload_empty_file(
+    def test_upload_empty_file(
         self,
         db: Session,
         route: Route,
-        uploader: WebUploader,
+        uploader: WebUploader,
+        temp_file_factory: Callable[[str, int], Path],
     ) -> None:
...
-        with NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as fp:
-            # Don't write anything, just create an empty file
-            fp.flush()
-            empty_file = Path(fp.name)
+        empty_file = temp_file_factory(".txt", 0)
...
-    def test_upload_file_within_size_limit(
+    def test_upload_file_within_size_limit(
         self,
         db: Session,
         route: Route,
-        uploader: WebUploader,
+        uploader: WebUploader,
+        temp_file_factory: Callable[[str, int], Path],
     ) -> None:
...
-        # Create a 1MB file (well within the 50MB limit)
-        with NamedTemporaryFile(mode="wb", suffix=".pdf", delete=False) as fp:
-            # Write 1MB of data
-            fp.write(b"0" * (1024 * 1024))
-            fp.flush()
-            normal_file = Path(fp.name)
+        normal_file = temp_file_factory(".pdf", 1024 * 1024)

As per coding guidelines: backend/app/tests/**/*.py: Use factory pattern for test fixtures in backend/app/tests/

coderabbitai · 2026-02-04T09:39:03Z

backend/app/alembic/versions/043_add_project_org_to_job_table.py

+depends_on = None
+
+
+def upgrade():


⚠️ Potential issue | 🟡 Minor

Add explicit return types to Alembic hooks.

upgrade and downgrade should explicitly return None.

🛠️ Suggested fix

-def upgrade(): +def upgrade() -> None: ... -def downgrade(): +def downgrade() -> None:

As per coding guidelines: **/*.py: Always add type hints to all function parameters and return values in Python code

Also applies to: 79-79

🤖 Prompt for AI Agents

In `@backend/app/alembic/versions/043_add_project_org_to_job_table.py` at line 19, The Alembic hook functions are missing return type annotations; update the function signatures for both upgrade and downgrade to include explicit None return types (e.g., change def upgrade(): to def upgrade() -> None: and def downgrade(): to def downgrade() -> None:) so they conform to the project's typing guidelines; ensure you update both the upgrade and downgrade definitions referenced in this migration file.

coderabbitai · 2026-02-04T09:39:03Z

backend/app/alembic/versions/043_add_project_org_to_job_table.py

+    op.add_column(
+        "job",
+        sa.Column(
+            "organization_id",
+            sa.Integer(),
+            nullable=False,
+            comment="Reference to the organization",
+        ),
+    )
+
+    # Add project_id column
+    op.add_column(
+        "job",
+        sa.Column(
+            "project_id",
+            sa.Integer(),
+            nullable=False,
+            comment="Reference to the project",
+        ),
+    )


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "043_add_project_org_to_job_table.py" -type f

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 141

🏁 Script executed:

cat -n ./backend/app/alembic/versions/043_add_project_org_to_job_table.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 2822

🏁 Script executed:

# Check if there are other migrations or code that shows if job table has existing data # Search for job table references and existing data loading rg "job" ./backend/app/alembic/versions --type py -B2 -A2 | head -50

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 4884

🏁 Script executed:

# Find the Job model definition find ./backend -name "*.py" -type f | xargs grep -l "class Job" | head -5

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 129

🏁 Script executed:

cat -n ./backend/app/models/job.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 3679

🏁 Script executed:

# Check if there are any existing jobs or test data setups find ./backend -name "*.py" -type f | xargs grep -l "Job(" | head -5

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 276

🏁 Script executed:

# Check previous migrations to see when job table was created ls -la ./backend/app/alembic/versions/ | grep -E "^\-.*\.py$" | tail -20

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1878

🏁 Script executed:

cat -n ./backend/app/alembic/versions/029_create_job_table.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1698

🏁 Script executed:

# Check if there are any jobs created in the codebase or seeded rg "Job(" ./backend/app/crud/jobs.py -A 3 | head -30

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 133

🏁 Script executed:

# Check for actual job creation logic to see if jobs are created during normal operation rg "status.*PENDING\|PROCESSING" ./backend/app/crud/jobs.py | head -20

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 57

🏁 Script executed:

# Check the jobs CRUD file to see how jobs are created/used head -100 ./backend/app/crud/jobs.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1415

Backfill organization_id and project_id before enforcing NOT NULL constraints.

This migration adds NOT NULL columns to an existing job table without defaults or backfill. If any jobs exist without these foreign keys, the migration will fail. The CRUD code shows jobs are created with both IDs, but you must confirm whether existing rows lack them. Add backfill logic before the NOT NULL constraint:

def upgrade(): op.add_column( "job", sa.Column("organization_id", sa.Integer(), nullable=True, comment="Reference to the organization"), ) op.add_column( "job", sa.Column("project_id", sa.Integer(), nullable=True, comment="Reference to the project"), ) # Backfill existing rows here # op.execute(...) op.alter_column("job", "organization_id", nullable=False) op.alter_column("job", "project_id", nullable=False) # Then add FK constraints and indexes

Also add type hints to upgrade() and downgrade() functions per Python guidelines: def upgrade() -> None:.

🤖 Prompt for AI Agents

In `@backend/app/alembic/versions/043_add_project_org_to_job_table.py` around lines 22 - 41, The migration adds NOT NULL columns organization_id and project_id directly which will fail if existing job rows lack values; modify upgrade() -> None to first add both columns with nullable=True via op.add_column, perform an explicit backfill (using op.execute to set organization_id and project_id for existing job rows from their source/defaults or JOINs to project/organization tables), then call op.alter_column("job", "organization_id", nullable=False) and op.alter_column("job", "project_id", nullable=False), and only after that add FK constraints/indexes; also add type hints def upgrade() -> None: and def downgrade() -> None: and mirror safe nullable/dropping steps in downgrade().

coderabbitai · 2026-02-04T09:39:03Z

backend/app/api/routes/private.py

+@router.post("/users", response_model=UserPublic, include_in_schema=False)
 def create_user(user_in: PrivateUserCreate, session: SessionDep) -> Any:


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check for remaining trailing slashes in route definitions across the codebase rg -n '@router\.(get|post|put|patch|delete)\("[^"]+/"' backend/app/api/routes/

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 57

🏁 Script executed:

# Also check the specific file mentioned to see the current state cat -n backend/app/api/routes/private.py | head -40

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1141

🏁 Script executed:

# Check utils.py for the /test-email/ route mentioned in the comment rg -n 'test-email' backend/app/api/routes/

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 129

🏁 Script executed:

# Get context around the /test-email/ line in utils.py rg -B 2 -A 2 'test-email' backend/app/api/routes/utils.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 200

🏁 Script executed:

# Check all routes in utils.py to see their structure cat -n backend/app/api/routes/utils.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1149

🏁 Script executed:

# Check for /reset-password route mentioned in the review comment rg -n 'reset-password' backend/app/api/routes/

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 166

🏁 Script executed:

# Get a comprehensive list of all routes to verify the standardization claim rg -n '@router\.(get|post|put|patch|delete)\(' backend/app/api/routes/ -A 1

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 11662

Fix return type annotation: use UserPublic instead of Any.

The route path standardization (/users without trailing slash) aligns with other routes like /health and /reset-password. However, the function's return type annotation violates the type hints guideline—it should be -> UserPublic to match the response_model, not -> Any. Additionally, there are routing inconsistencies in the codebase: /test-email/ and /evaluate_models/ still have trailing slashes while other routes have removed them.

🤖 Prompt for AI Agents

In `@backend/app/api/routes/private.py` around lines 23 - 24, The create_user route's return type is incorrect: change the function signature from "def create_user(user_in: PrivateUserCreate, session: SessionDep) -> Any" to "-> UserPublic", ensure UserPublic is imported in this module, and run type checks; additionally, standardize routing by removing trailing slashes from inconsistent endpoints (e.g., the decorators currently using "/test-email/" and "/evaluate_models/") so they match the non-trailing-slash style used by "/users", "/health", and "/reset-password".

coderabbitai · 2026-02-04T09:39:03Z

backend/app/services/documents/validators.py

+            detail="Empty file uploaded"
+        )
+
+    logger.info(f"Document file validated: {file.filename} ({file_size} bytes)")


⚠️ Potential issue | 🟡 Minor

Prefix and mask the validation log message.

The info log should include the function prefix and mask the filename before logging.

🛠️ Suggested fix

- logger.info(f"Document file validated: {file.filename} ({file_size} bytes)") + logger.info( + f"[validate_document_file] Document file validated: " + f"{mask_string(file.filename)} ({file_size} bytes)" + )

As per coding guidelines: Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

logger.info(f"Document file validated: {file.filename} ({file_size} bytes)")

logger.info(

f"[validate_document_file] Document file validated: "

f"{mask_string(file.filename)} ({file_size} bytes)"

)

🤖 Prompt for AI Agents

In `@backend/app/services/documents/validators.py` at line 53, Update the log call that currently reads logger.info(f"Document file validated: {file.filename} ({file_size} bytes)") so it uses the function-name prefix and masks the filename before logging; replace it with a prefixed message like logger.info(f"[validate_document_file] Document file validated: {mask_string(file.filename)} ({file_size} bytes)"), referencing the existing logger.info call, the mask_string helper, and the validate_document_file function to locate where to change it.

ankit-mehta07 added 3 commits February 3, 2026 18:37

Add file size validation to document upload endpoint

828fa51

Remove trailing slashes from API endpoints

652adb9

Add project and organization foreign keys to async Job model

d2bfc46

coderabbitai bot reviewed Feb 4, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add project and organization foreign keys to async Job model #578

Add project and organization foreign keys to async Job model #578

Uh oh!

ankit-mehta07 commented Feb 4, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 4, 2026 •

edited

Loading

Walkthrough

Changes

Estimated Code Review Effort

Possibly Related PRs

Suggested Labels

Suggested Reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 4, 2026

Uh oh!

coderabbitai bot Feb 4, 2026

Uh oh!

coderabbitai bot Feb 4, 2026

Uh oh!

coderabbitai bot Feb 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		@router.post("/users", response_model=UserPublic, include_in_schema=False)
		def create_user(user_in: PrivateUserCreate, session: SessionDep) -> Any:

Add project and organization foreign keys to async Job model #578

Are you sure you want to change the base?

Add project and organization foreign keys to async Job model #578

Uh oh!

Conversation

ankit-mehta07 commented Feb 4, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivation

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated Code Review Effort

Possibly Related PRs

Suggested Labels

Suggested Reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ankit-mehta07 commented Feb 4, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 4, 2026 •

edited

Loading