feat: add structured training telemetry schema and validation pipeline hooks #6

Open
VarshiniGunti wants to merge 1 commit into OpenAgriNet:main from VarshiniGunti:feat-training-telemetry-schema

Summary

This PR introduces a structured telemetry foundation for logs-to-training workflows in oan-ai-api, aligned with agentic training requirements. It standardizes how chat and tool-use events are captured so downstream pipelines can reliably build training datasets for supervised fine-tuning and preference optimization.

What This PR Adds

  • A versioned training event schema (v1) for:
    • user
    • assistant
    • tool_call
    • tool_result
    • error
  • Runtime event emission from the chat flow in a machine-readable format:
    • training_event=<json>
  • Event validation utilities to enforce:
    • valid tool names (against registered tools)
    • required tool_call_id on tool events
    • correct tool_call → tool_result linkage
  • Documentation for schema and usage
  • A CLI validator to verify extracted JSONL/log lines before downstream processing
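
The validation rules listed above could be sketched roughly as follows. This is an illustration only, not the code in app/telemetry/validator.py: the field names (`event_type`, `tool_name`, `tool_call_id`), the function name `validate_events`, and the contents of `REGISTERED_TOOLS` are assumptions made for the example.

```python
from __future__ import annotations

from typing import Any

# Hypothetical registry; in the real service this would come from the registered tools.
REGISTERED_TOOLS = {"search_docs", "get_weather"}
TOOL_EVENT_TYPES = {"tool_call", "tool_result"}


def validate_events(events: list[dict[str, Any]]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems: list[str] = []
    open_calls: set[str] = set()  # tool_call ids still waiting for a result

    for i, event in enumerate(events):
        etype = event.get("event_type")
        if etype not in TOOL_EVENT_TYPES:
            continue  # user / assistant / error events carry no tool linkage

        call_id = event.get("tool_call_id")
        if not call_id:
            problems.append(f"event {i}: {etype} is missing tool_call_id")
            continue

        if etype == "tool_call":
            name = event.get("tool_name")
            if name not in REGISTERED_TOOLS:
                problems.append(f"event {i}: unknown tool {name!r}")
            open_calls.add(call_id)
        elif call_id in open_calls:
            open_calls.remove(call_id)  # result matches an earlier call
        else:
            problems.append(
                f"event {i}: tool_result {call_id!r} has no matching tool_call"
            )

    # Any call left open never received a result.
    for call_id in sorted(open_calls):
        problems.append(f"tool_call {call_id!r} has no tool_result")
    return problems
```

Returning a list of problems rather than raising on the first one lets a CLI validator report every issue in a trajectory in a single pass.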

Implementation Details

  • Added telemetry models/builders and emitter utilities in app/telemetry/
  • Instrumented app/services/chat.py to emit:
    • user event at request start
    • tool call/result events from model message parts
    • assistant event after stream completion
    • error event on exceptions
  • Added helpers/validate_training_events.py for offline validation
  • Documented schema and usage in:
    • docs/training_pipeline/log_event_schema.md
    • README.md
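
As a rough sketch of the `training_event=<json>` log format described above, the snippet below emits one event as a single log line and recovers it again for offline processing. Only the `training_event=` marker comes from this PR description; the logger name, the helper names `emit_training_event` and `extract_training_event`, and the event fields are hypothetical and may differ from what app/telemetry/ actually does.

```python
from __future__ import annotations

import json
import logging
from typing import Any, Optional

PREFIX = "training_event="  # marker named in the PR description

logger = logging.getLogger("training_telemetry")  # hypothetical logger name


def emit_training_event(event: dict[str, Any]) -> str:
    """Serialize an event as compact JSON on one line and log it."""
    line = PREFIX + json.dumps(event, separators=(",", ":"), sort_keys=True)
    logger.info(line)
    return line


def extract_training_event(log_line: str) -> Optional[dict[str, Any]]:
    """Recover the JSON payload from a log line, or None if absent or invalid."""
    _, found, payload = log_line.partition(PREFIX)
    if not found:
        return None
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None  # malformed payload, or trailing text after the JSON


# Example: a user event at request start, wrapped in a typical log prefix.
line = emit_training_event(
    {"schema_version": "v1", "event_type": "user", "content": "hi"}
)
event = extract_training_event("2024-01-01 INFO " + line)
```

This sketch assumes the JSON payload is the last thing on the log line; a formatter that appends text after the message would need a stricter parser.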

Why This Change Matters

This creates a consistent, auditable event contract for both Q&A and multi-step agentic interactions. It improves trace quality for training data generation and introduces guardrails to catch malformed or inconsistent tool trajectories early.

Validation

Executed:

python -m py_compile app/telemetry/events.py app/telemetry/validator.py app/services/chat.py helpers/validate_training_events.py

All four files compiled without syntax errors.
