feat(e2e-eval): support pre-exported ONNX models via onnx_file #902

Open
xieofxie wants to merge 3 commits into main from hualxie/run_onnx

@xieofxie

Summary

Adds support for evaluating models that ship a pre-exported ONNX (e.g. the PaddleOCR *_onnx repos) in the e2e_eval harness, bypassing the HF→ONNX export path entirely. Motivated by PaddlePaddle/PP-OCRv5_server_{det,rec}, whose native transformers architecture is unsupported by the pinned transformers/optimum-onnx stack (a hard dependency conflict: the model needs transformers>=5, while optimum-onnx caps it at <4.58).

When a registry entry declares onnx_file, the harness downloads that file from the HF repo and feeds the local path to winml config/build/perf via -m (winml's is_onnx_file_path routes it to the skip-export pipeline).
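A minimal sketch of that routing, assuming a registry entry with an optional onnx_file field. The `ModelEntry` shape, `resolve_model_input`, and the injected `download` callable are illustrative names, not the PR's actual code. In a real run the downloader could be `huggingface_hub.hf_hub_download(repo_id=..., filename=...)`, which returns a local cache path; a fake is used here so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical stand-in for the registry entry; only the fields used here."""
    hf_id: str
    task: Optional[str] = None
    onnx_file: Optional[str] = None


def resolve_model_input(entry: ModelEntry, download: Callable[[str, str], str]) -> str:
    """Return the value to pass via -m: a local ONNX path, or the HF id."""
    if not entry.onnx_file:
        # No pre-exported ONNX declared: keep the normal HF export path.
        return entry.hf_id
    # Pre-exported ONNX declared: fetch it and hand over the local file path.
    return download(entry.hf_id, entry.onnx_file)


def fake_download(repo_id: str, filename: str) -> str:
    """Offline stand-in for a real downloader (assumption for this sketch)."""
    return f"/tmp/hf-cache/{repo_id}/{filename}"


det = ModelEntry("PaddlePaddle/PP-OCRv5_server_det_onnx", onnx_file="inference.onnx")
plain = ModelEntry("some-org/some-model", task="image-classification")  # made-up entry

print(resolve_model_input(det, fake_download))
print(resolve_model_input(plain, fake_download))
```

Injecting the downloader keeps the decision (path vs. hf_id) testable without network access, which is one plausible way to cover the "download-then-ensure-opset" wiring in unit tests.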

Changes

  • Registry data (models_curated.json, models_all.json): renamed the PP-OCRv5_server_{det,rec} entries to their *_onnx repos and added "onnx_file": "inference.onnx".
  • utils/registry.py: new ModelEntry.onnx_file field + load mapping.
  • build_registry.py: flows onnx_file through Phase 2 (update-in-place + new-entry) so it survives rebuilds.
  • run_eval.py:
    • _resolve_model_input() downloads the pre-exported ONNX and returns the local path (else hf_id); used for both config -m and build -m. Prepare failures degrade to a graceful per-model build failure instead of crashing the run.
    • _ensure_min_opset() upgrades sub-minimum ONNX to opset 17 (winml optimize requires >=12; PaddleOCR ships opset 11). onnxruntime can run opset 11, but the build/optimize stage rejects it.
    • ONNX-input builds use --output-dir instead of --use-cache (direct-ONNX configs carry no loader.task to key the cache), and resolve the artifact deterministically at <output-dir>/model.onnx (stdout marker parsing was unreliable — Rich wraps long paths, which silently dropped perf to a build-only false PASS).
    • --hf-model + --task now uses dataclasses.replace so onnx_file/precision/perf_args survive the task override.
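The last bullet's point about dataclasses.replace can be illustrated with a small sketch. The field set below mirrors the names mentioned in the description (onnx_file, precision, perf_args), but the dataclass itself, the task strings, and the perf_args value are placeholders, not the real registry schema.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical registry entry; the real class may have more fields."""
    hf_id: str
    task: Optional[str] = None
    onnx_file: Optional[str] = None
    precision: Optional[str] = None
    perf_args: Tuple[str, ...] = ()


def override_task(entry: ModelEntry, task: str) -> ModelEntry:
    """Copy the entry with a new task, leaving every other field untouched."""
    return replace(entry, task=task)


entry = ModelEntry(
    hf_id="PaddlePaddle/PP-OCRv5_server_det_onnx",
    task="placeholder-task",             # placeholder value
    onnx_file="inference.onnx",
    precision="fp32",                    # placeholder value
    perf_args=("--placeholder-flag",),   # placeholder, not a real CLI flag
)

overridden = override_task(entry, "another-task")
print(overridden)
```

Compared with constructing a fresh entry from hf_id and task alone, replace copies all existing fields by default, so fields added to the dataclass later are carried over without touching the override code.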

Verification

  • End-to-end: run_eval.py --hf-model PaddlePaddle/PP-OCRv5_server_det_onnx --eval-type perf --device cpu → PASS, real perf on the built model.onnx (5.40 samples/sec, full latency table).
  • tests/unit/eval/test_run_eval_script.py: 23 passed (added coverage for opset-upgrade, download-then-ensure-opset, and the --output-dir/-m wiring).
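The opset-upgrade path covered by the added tests could look roughly like the following. This is a sketch under assumptions, not the PR's `_ensure_min_opset`: the function names, the separate `out_path` argument, and the split into pure helpers are choices made here for illustration. The thresholds (minimum 12, target 17) come from the description, and the conversion uses `onnx.version_converter.convert_version`, imported lazily so the pure helpers can be exercised without onnx installed.

```python
from typing import Iterable, Optional, Tuple

MIN_OPSET = 12     # minimum accepted by the optimize stage, per the description
TARGET_OPSET = 17  # opset the model is upgraded to, per the description


def default_domain_opset(opset_imports: Iterable[Tuple[str, int]]) -> Optional[int]:
    """Return the version of the default ONNX domain ("" or "ai.onnx"), if any."""
    for domain, version in opset_imports:
        if domain in ("", "ai.onnx"):
            return version
    return None


def needs_upgrade(opset: Optional[int], minimum: int = MIN_OPSET) -> bool:
    """True only when a default-domain opset exists and is below the minimum."""
    return opset is not None and opset < minimum


def ensure_min_opset(path: str, out_path: str,
                     minimum: int = MIN_OPSET, target: int = TARGET_OPSET) -> str:
    """Return a path to a model whose opset is >= minimum, converting if needed."""
    import onnx.version_converter  # lazy import; also binds the top-level `onnx` name

    model = onnx.load(path)
    current = default_domain_opset(
        (imp.domain, imp.version) for imp in model.opset_import
    )
    if not needs_upgrade(current, minimum):
        return path  # already new enough; use the file as downloaded
    upgraded = onnx.version_converter.convert_version(model, target)
    onnx.save(upgraded, out_path)  # write a copy rather than mutating the download
    return out_path


print(needs_upgrade(default_domain_opset([("", 11)])))  # PaddleOCR ships opset 11
```

Writing the upgraded model to a separate `out_path` is an assumption made to avoid modifying a cached download in place; the version converter can also fail on operators it has no adapter for, so a caller would likely want to treat that as a per-model build failure.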

@xieofxie requested a review from a team as a code owner June 16, 2026 08:20
Comment thread scripts/e2e_eval/run_eval.py Fixed
Comment thread tests/unit/eval/test_run_eval_script.py Fixed
Comment thread tests/unit/eval/test_run_eval_script.py Fixed
hualxie added 2 commits June 16, 2026 16:28

CodeQL flagged 'onnx' being imported with both 'import onnx' and
'from onnx import ...' in the same module. Use attribute access
(onnx.version_converter / onnx.helper / onnx.TensorProto) throughout
instead of mixing import styles.

The Phase 2 onnx_file passthrough was dead: load_curated_entries
dropped every key except hf_id/task/group/priority, so a rebuild
that re-added or newly-created a curated onnx_file entry would lose
it. Carry onnx_file through when present.
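The fix described in the second commit can be sketched as follows. The function name, the list-of-dicts input shape, and the group/priority values are assumptions for illustration; only the key names (hf_id, task, group, priority, onnx_file) come from the commit message.

```python
from typing import Any, Dict, List

CORE_KEYS = ("hf_id", "task", "group", "priority")
OPTIONAL_KEYS = ("onnx_file",)  # carried through only when present


def load_curated_entries_sketch(raw: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the core keys, plus any optional keys the curated entry declares."""
    entries = []
    for item in raw:
        entry = {key: item.get(key) for key in CORE_KEYS}
        for key in OPTIONAL_KEYS:
            if key in item:
                entry[key] = item[key]
        entries.append(entry)
    return entries


curated = [
    {   # made-up group/priority/task values
        "hf_id": "PaddlePaddle/PP-OCRv5_server_det_onnx",
        "task": "placeholder-task",
        "group": "ocr",
        "priority": 1,
        "onnx_file": "inference.onnx",
        "unrelated": "dropped",
    },
    {"hf_id": "some-org/some-model", "task": "placeholder-task",
     "group": "misc", "priority": 2},
]

loaded = load_curated_entries_sketch(curated)
print(loaded)
```

Adding the key only when it is present keeps entries without a pre-exported ONNX unchanged, so rebuilt registry files do not gain spurious null onnx_file fields.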
@xieofxie

need to update this in favor of #582
