[FEATURE] Introduce Schema Validation and Confidence Scoring Layer for LLM Extraction Reliability #450

@Lochit-Vinay

📝 Description

FireForm currently relies on LLM-generated structured output extracted from unstructured incident reports. In practice, this output is not always consistent: it can be incomplete, slightly malformed, or contain incorrect values.

This can break downstream steps such as PDF auto-fill and reduces the overall reliability of the pipeline.

This issue focuses on improving the reliability of the LLM → structured JSON → PDF flow.

💡 Rationale

LLM outputs are not guaranteed to strictly follow a schema. Some common issues observed:

  • Missing required fields
  • Incorrect data types
  • Partially structured or noisy responses

Right now, there is no dedicated validation layer to catch or handle these issues before the data is used further.

Adding a validation and scoring layer would catch these problems before they reach downstream steps such as PDF auto-fill.

🛠️ Proposed Solution

Introduce a lightweight validation and scoring step after LLM extraction:

  • Schema-Based Validation

    • Use Pydantic models to enforce required fields and types
    • Flag missing or invalid values
  • Confidence Scoring

    • Assign a basic confidence score per field (e.g., based on parsing reliability or fallback usage)
    • Lower confidence for fallback/cleaned/uncertain values
  • Structured Error Handling

    • Standardize validation errors
    • Improve debugging and visibility into failures
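
As a rough illustration of the schema validation and structured error handling items above, here is a minimal sketch using Pydantic. The model name `IncidentReport`, its fields, and the helper `validate_extraction` are hypothetical and not taken from the FireForm codebase; the real schema would come from the target form. It only uses the model constructor and `ValidationError.errors()`, so it should behave the same way on Pydantic v1 and v2.

```python
from typing import Any, Dict, List, Optional, Tuple

from pydantic import BaseModel, ValidationError


class IncidentReport(BaseModel):
    # Hypothetical fields for illustration; the real FireForm schema may differ.
    incident_date: str
    location: str
    injured_count: int
    notes: Optional[str] = None


def validate_extraction(
    data: Dict[str, Any],
) -> Tuple[Optional[IncidentReport], List[Dict[str, str]]]:
    """Return (model, errors). The error list is empty when validation succeeds."""
    try:
        return IncidentReport(**data), []
    except ValidationError as exc:
        # Flatten Pydantic's error entries into one standardized shape.
        errors = [
            {
                "field": ".".join(str(part) for part in err["loc"]),
                "message": err["msg"],
                "kind": err["type"],
            }
            for err in exc.errors()
        ]
        return None, errors
```

Returning the errors as plain dictionaries instead of raising keeps the caller in control of whether a failed validation stops the run or is only logged.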

This can be implemented as a modular component in the existing extraction pipeline.
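
One possible shape for the per-field confidence score is sketched below, using only the standard library. The source labels (`direct`, `cleaned`, `fallback`, `missing`) and their weights are assumptions made for this example, not values defined by FireForm; they would need tuning against real extraction results.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Assumed heuristic weights: how much to trust a value based on how it was obtained.
SOURCE_CONFIDENCE = {
    "direct": 1.0,    # parsed straight from the LLM output
    "cleaned": 0.7,   # needed normalization or repair
    "fallback": 0.4,  # filled in by a default or fallback rule
    "missing": 0.0,   # no value available
}


@dataclass
class ScoredField:
    value: Any
    source: str
    confidence: float


def score_fields(
    values: Dict[str, Any], sources: Dict[str, str]
) -> Dict[str, ScoredField]:
    """Attach a confidence score to every extracted field.

    `sources` maps field names to one of the labels above; fields that are
    not listed are treated as "direct", and None values as "missing".
    """
    scored = {}
    for name, value in values.items():
        source = "missing" if value is None else sources.get(name, "direct")
        # Unknown labels get 0.0 so they are never trusted by accident.
        scored[name] = ScoredField(value, source, SOURCE_CONFIDENCE.get(source, 0.0))
    return scored
```

For example, `score_fields({"location": "Main St", "injured_count": 0}, {"injured_count": "fallback"})` would mark `location` as fully trusted and `injured_count` as low confidence, which a later step could use to decide what to highlight for review.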

✅ Acceptance Criteria

  • Extracted JSON validates against a defined schema
  • Missing/invalid fields are clearly flagged
  • Field-level confidence scores are included
  • Pipeline continues gracefully even on validation issues
  • No breaking changes to current workflow
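
To make the "continues gracefully" criterion more concrete, the sketch below shows a validation step that never raises and always returns the same envelope. The function name `safe_validation_step`, the envelope keys, and the `REQUIRED_FIELDS` mapping are hypothetical; it uses plain `json` and `isinstance` checks here so it runs without extra dependencies, and a Pydantic model could take the place of the manual checks.

```python
import json
from typing import Any, Dict

# Hypothetical required fields and their expected types.
REQUIRED_FIELDS = {"location": str, "injured_count": int}


def safe_validation_step(raw_output: str) -> Dict[str, Any]:
    """Parse and check raw LLM output without ever raising.

    Always returns {"data": ..., "errors": [...], "ok": bool} so that
    downstream steps can decide how to proceed.
    """
    result: Dict[str, Any] = {"data": {}, "errors": [], "ok": False}
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        result["errors"].append({"field": None, "message": f"invalid JSON: {exc.msg}"})
        return result
    if not isinstance(parsed, dict):
        result["errors"].append({"field": None, "message": "expected a JSON object"})
        return result

    # Keep whatever was extracted, even if some fields fail the checks.
    result["data"] = parsed
    for name, expected in REQUIRED_FIELDS.items():
        if name not in parsed:
            result["errors"].append({"field": name, "message": "missing required field"})
        elif not isinstance(parsed[name], expected):
            result["errors"].append(
                {"field": name, "message": f"expected {expected.__name__}"}
            )
    result["ok"] = not result["errors"]
    return result
```

Because the extracted data is still passed through under `data`, existing callers could keep reading it as before and opt in to the `errors` and `ok` keys later, which is one way to avoid breaking the current workflow.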

📌 Additional Context

This would be an initial implementation focused on improving robustness.
It can later be extended with more advanced validation rules or human-in-the-loop correction if needed.

This can serve as a foundation for improving extraction quality during the GSoC development phase.
