refactor: replace status table with DeriveActiveEnum#2362
Conversation
…n overhead Replace the static `status` database table (5 rows) with a PostgreSQL enum type backed by SeaORM's `DeriveActiveEnum`. This eliminates JOIN operations against the status table in every query that resolves purl_status or product_status rows, and removes the DbContext status cache in the ingestor. Key changes: - Transform entity/src/status.rs from DeriveEntityModel to DeriveActiveEnum - Change purl_status.status_id (Uuid FK) to purl_status.status (Status enum) - Change product_status.status_id (Uuid FK) to product_status.status (Status enum) - Remove all JOINs against the status table in SeaORM and raw SQL queries - Replace DbContext cache with direct Status::from_str() parsing - Add migration that converts existing data via slug-based join before dropping the status table and creating the enum type - Update UUID computation to use status.to_string().as_bytes() The HTTP API remains unchanged — status values in JSON responses are preserved as slug strings through strum's snake_case serialization. A full re-ingestion of advisories is recommended after migration to normalize primary keys (UUID computation changed from Uuid bytes to string bytes). Implements TC-4488 Assisted-by: Claude Code
Reviewer's GuideRefactors status handling from a separate status table with UUID FKs to an inline PostgreSQL enum exposed via SeaORM’s DeriveActiveEnum, removes the ingestor’s status cache and DB lookups, simplifies queries by eliminating joins to status, and adds a one-way migration that converts existing FK-based data to the new enum representation while adjusting UUID derivation semantics. Sequence diagram for CSAF status resolution in the ingestorsequenceDiagram
actor CSAFIngestor
participant StatusCreator
participant DbStatusParser as DbContext_parse_status
participant Status
CSAFIngestor->>StatusCreator: process product_statuses
loop for each product_status
StatusCreator->>DbStatusParser: parse_status(product.status)
DbStatusParser->>Status: from_str(status)
Status-->>DbStatusParser: Result<Status, Error>
DbStatusParser-->>StatusCreator: Result<Status, Error>
StatusCreator->>StatusCreator: create_purl_status(product, purl, scheme, spec, Status)
end
Flow diagram for m0002200_replace_status_with_enum migrationflowchart TD
A[Start migration m0002200_replace_status_with_enum] --> B[Add status_text text columns to purl_status and product_status]
B --> C[Migrate slugs from status table into status_text via UPDATE ... FROM status]
C --> D[Drop status_id columns from purl_status and product_status]
D --> E[Drop status table]
E --> F[Create PostgreSQL enum type status if not exists]
F --> G[Add status status columns to purl_status and product_status]
G --> H[Cast status_text into enum: UPDATE ... SET status = status_text::status]
H --> I[Set status columns NOT NULL]
I --> J[Drop status_text columns]
J --> K[End migration]
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The
m0002200_replace_status_with_enummigration assumes that everystatus_idinpurl_status/product_statushas a corresponding row instatus; if any orphaned IDs exist,status_text(and thenstatus) will remain NULL and the laterSET NOT NULLwill fail — consider either asserting row counts or using a defensive default/fallback for unmapped statuses before enforcing NOT NULL. - Now that
Statusis a first-class enum, the codebase mixes raw string literals ('fixed','not_affected', etc.) and enum usages across SQL and Rust; consolidating these into a small helper or constants (e.g., for building SQL fragments) would reduce the risk of drift between enum variants and hard-coded status strings.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `m0002200_replace_status_with_enum` migration assumes that every `status_id` in `purl_status`/`product_status` has a corresponding row in `status`; if any orphaned IDs exist, `status_text` (and then `status`) will remain NULL and the later `SET NOT NULL` will fail — consider either asserting row counts or using a defensive default/fallback for unmapped statuses before enforcing NOT NULL.
- Now that `Status` is a first-class enum, the codebase mixes raw string literals (`'fixed'`, `'not_affected'`, etc.) and enum usages across SQL and Rust; consolidating these into a small helper or constants (e.g., for building SQL fragments) would reduce the risk of drift between enum variants and hard-coded status strings.
## Individual Comments
### Comment 1
<location path="migration/src/m0002200_replace_status_with_enum.rs" line_range="100-101" />
<code_context>
+ ALTER TABLE product_status
+ ADD COLUMN IF NOT EXISTS status status;
+
+ UPDATE purl_status SET status = status_text::status;
+ UPDATE product_status SET status = status_text::status;
+
+ ALTER TABLE purl_status
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against invalid or unexpected `status_text` values when casting to the new enum.
This assumes every non-NULL `status_text` is a valid `status` enum value. Any historical typos or unexpected values will cause `status_text::status` to raise an error and abort the migration. Consider restricting the UPDATE to known-good values and handling the rest separately, or running a pre-check SELECT to detect unknown slugs and fail early with a clear error before applying schema changes.
</issue_to_address>
### Comment 2
<location path="modules/fundamental/src/purl/endpoints/test.rs" line_range="488" />
<code_context>
.unwrap();
- assert_eq!(vuln["status"], "custom_status");
+ assert_eq!(vuln["status"], "Recommended");
Ok(())
</code_context>
<issue_to_address>
**issue (bug_risk):** The expected status string in the assertion is likely incorrect and should match the enum’s serialized/Display form (snake_case).
Given `Status` uses `#[strum(serialize_all = "snake_case")]` and `#[serde(rename_all = "snake_case")]`, the serialized value will be `"recommended"`, not `"Recommended"`. This makes the current assertion inconsistent with the API response. Please assert against the enum’s serialized form instead, for example:
```rust
assert_eq!(vuln["status"], Status::Recommended.to_string());
```
(or the literal `"recommended"`), so the test stays aligned with the enum’s representation.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Verification Report for TC-4488 (commit 7756a1b)
Overall: WARNSub-tasks created:
Root-cause analysis:
Dismissed feedback:
This comment was AI-generated by sdlc-workflow/verify-pr v0.9.0. |
ctron
left a comment
There was a problem hiding this comment.
In general, the new enum type should be used, rather then falling back to strings.
Important is that the ID (UUID v5) issue is sorted out.
| )] | ||
| #[sea_orm(rs_type = "String", db_type = "Enum", enum_name = "status")] | ||
| #[strum(serialize_all = "snake_case")] | ||
| #[serde(rename_all = "snake_case")] |
There was a problem hiding this comment.
Add some test to ensure that the values of SeoORM align with the values or strum and serde.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4508 created to address this feedback.
|
|
||
| #[async_trait::async_trait] | ||
| impl MigrationTrait for Migration { | ||
| async fn up(&self, manager: &SchemaManager) -> Result<(), DbErr> { |
There was a problem hiding this comment.
Wouldn't it be possible to create the new type with a different name, and then rename the type? Making the migration simpler in the process.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as suggestion — proposes alternative migration approach (create enum with different name, then rename). The current approach works correctly. No sub-task created.
| manager, | ||
| StatusEnum::Table, | ||
| [ | ||
| StatusEnum::Affected, |
There was a problem hiding this comment.
Instead of having an explicit list, use strum enumerator.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4508 created to address this feedback.
|
|
||
| async fn down(&self, _manager: &SchemaManager) -> Result<(), DbErr> { | ||
| Err(DbErr::Migration( | ||
| "Cannot reverse status table to enum migration".to_string(), |
There was a problem hiding this comment.
Yes, this can be reversed. And has to be.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4506 created to implement migration down() and fix test_migrations CI failure.
| } | ||
|
|
||
| /// Verifies that a custom vulnerability status is reflected in the recommendation response. | ||
| /// Verifies that a non-default vulnerability status is reflected in the recommendation response. |
There was a problem hiding this comment.
I'm not sure this test still makes sense. As having an enum removes the whole "custom status" case.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4508 created to address this feedback.
|
|
||
| pub fn uuid(&self, advisory_id: Uuid, vulnerability_id: String) -> Uuid { | ||
| let mut result = Uuid::new_v5(&NAMESPACE, self.status.as_bytes()); | ||
| let mut result = Uuid::new_v5(&NAMESPACE, self.status.to_string().as_bytes()); |
There was a problem hiding this comment.
Same here:
That could be a problem. As the UUID is generated with different input, those IDs (the ProductStatus id) would need to be migrated too. .as_bytes() is using the bytes of the UUID, not of the slug (string) representation.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4508 created to address this feedback.
| } | ||
|
|
||
| /// Creator for batch insertion of PURL statuses | ||
| /// |
There was a problem hiding this comment.
Don't remove existing documentation, unless it's wrong.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4507 created to restore removed documentation.
| Self::default() | ||
| } | ||
|
|
||
| /// Add a PURL status entry to be created |
There was a problem hiding this comment.
Don't remove existing documentation, unless it's wrong.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4507 created to restore removed documentation.
| self.entries.push(entry); | ||
| } | ||
|
|
||
| /// Create all collected PURL statuses in batches |
There was a problem hiding this comment.
Don't remove existing documentation, unless it's wrong.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4507 created to restore removed documentation.
| .await | ||
| .get_status_id(product.status, connection) | ||
| .await?; | ||
| let status = crate::graph::db_context::parse_status(product.status)?; |
There was a problem hiding this comment.
Don't prefix types, import them.
There was a problem hiding this comment.
[sdlc-workflow/verify-pr] Classified as code change request — sub-task TC-4507 created to fix import style.
Verification Report for TC-4488 (commit 7756a1b) — Final
Overall: WARNReview feedback sub-tasks:
Root-cause tasks:
Dismissed feedback:
This comment was AI-generated by sdlc-workflow/verify-pr v0.9.0. |
Add a pre-check assertion that verifies zero NULL status_text values remain after the data migration join, before casting to the enum type. This catches orphaned status_id references with a clear error message instead of an opaque SET NOT NULL failure. Also fixes cargo fmt formatting in csaf/creator.rs. Implements TC-4491 Assisted-by: Claude Code
Verification Report for TC-4488 (commit 9f2d2ec)
Overall: FAILReview feedback sub-tasks (4):
Root-cause tasks (4):
Dismissed:
This comment was AI-generated by sdlc-workflow/verify-pr v0.9.0. |
Replace String with the Status enum type in internal struct fields and function parameters. String conversion now happens only at the API response serialization boundary, preserving type safety throughout the internal layers. Implements TC-4505 Assisted-by: Claude Code
Implement the reverse migration: recreate the status table with seed data, add status_id FK columns populated from enum values, then drop the enum columns and type. This fixes the test_migrations CI failure which requires refresh (down+up) to succeed. Implements TC-4506 Assisted-by: Claude Code
Restore doc comments on PurlStatusCreator struct and its add()/create() methods that were inadvertently removed during the status enum refactor. Import parse_status at the top of csaf/creator.rs instead of using the fully-qualified crate::graph::db_context::parse_status path. Implements TC-4507 Assisted-by: Claude Code
- Add serialization alignment test verifying SeaORM, strum, and serde produce identical values for all Status variants - Document why migration uses explicit enum list (circular dependency) - Update test doc comment to reflect enum-only status values - Add SQL comment explaining raw string literal usage in enum comparison - Document UUID computation change in purl_status/product_status uuid() Implements TC-4508 Assisted-by: Claude Code
Summary
statusdatabase table (5 rows) with a PostgreSQL enum type backed by SeaORM'sDeriveActiveEnum, eliminating all JOIN operations against the status tableDbContextstatus cache in the ingestor — status values are now parsed directly from strings viaStatus::from_str()status_idFK columns to inline enum columns via slug-based data migrationDetails
Entity layer:
status.rstransformed fromDeriveEntityModel(Model/Entity/Column/Relation) to a standaloneDeriveActiveEnumenum.purl_statusandproduct_statusentities changed fromstatus_id: UuidFK tostatus: Statusenum column.Ingestor layer:
DbContextstruct simplified to a singleparse_status()function. Allget_status_id()calls replaced withStatus::from_str(). CSAF creator no longer needs async DB lookups for status resolution.Query layer: All
JOIN status ON ...removed from SeaORM queries and raw SQL. Status slug values now come directly from the enum column (status::textcast in raw SQL,.to_string()in Rust).Migration: Creates the PostgreSQL enum type after dropping the old table (avoids namespace conflict), using a temporary text column to preserve data during the transition. Includes a pre-check assertion that validates no NULL status_text values remain before the enum cast.
Note: UUID computation for purl_status/product_status rows changed from
Uuid::as_bytes()toStatus::to_string().as_bytes(). A full re-ingestion of advisories is recommended after migration to normalize primary keys.Implements TC-4488