## Description

`onnxruntime-openvino` 1.24.1 segfaults immediately when creating an `InferenceSession` with the `OpenVINOExecutionProvider` on an Intel Arrow Lake system with an NPU. The crash occurs in the bundled `libopenvino.so.2541` (OpenVINO 2025.4.1) at a fixed offset during NPU plugin initialization.

`CPUExecutionProvider` works fine with the same model files and environment.
## Environment

- `onnxruntime-openvino`: 1.24.1 (latest on PyPI)
- Bundled OpenVINO: 2025.4.1 (`libopenvino.so.2541`)
- Python: 3.12
- OS: Ubuntu 24.04.4 LTS (LXC container on Proxmox VE 9.1.5)
- Kernel: 6.17.9-1-pve (Proxmox host kernel)
- CPU: Intel Core Ultra 7 265 (Arrow Lake)
- NPU device: `/dev/accel0` passed through to the LXC container
- Kernel module: `intel_vpu` 1.0.0 (built into PVE kernel 6.17.9)
- NPU userspace: `intel-level-zero-npu` 1.32.0, `intel-fw-npu` 1.32.0, `intel-driver-compiler-npu` 1.32.0
- Also tested with `intel-level-zero-npu` 1.28.0 — same segfault
- Standalone `openvino` 2026.1.0 installed alongside ORT — no effect (ORT loads its bundled libs)
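For anyone trying to reproduce on similar hardware, the environment facts above can be re-collected with standard commands; this is a convenience sketch, not part of the repro, and each command degrades gracefully where a package or device is absent:

```shell
# Re-collect the environment details cited above.
uname -r                                                      # kernel version
python3 -m pip show onnxruntime-openvino 2>/dev/null | head -n 2 || true
ls -l /dev/accel0 2>/dev/null || echo "no /dev/accel0"
{ lsmod 2>/dev/null | grep intel_vpu; } || echo "intel_vpu not listed"
```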
## Reproduction

```python
import onnxruntime as ort

# This segfaults:
sess = ort.InferenceSession(
    "encoder_model.onnx",
    providers=["OpenVINOExecutionProvider"],
)

# This works:
sess = ort.InferenceSession(
    "encoder_model.onnx",
    providers=["CPUExecutionProvider"],
)
```
The model is Moonshine Base encoder (~78MB ONNX). Any ONNX model triggers the crash — it happens during EP initialization, not model loading.
## Crash Details

```
signal=SEGV (status=11)
segfault at 0 ip <offset 0xcdd56c> in libopenvino.so.2541
error 6 (write to NULL pointer in userspace)
```

- The crash is 100% reproducible, always at the same offset in `libopenvino.so.2541`
- `dmesg` on the host confirms: `python3[PID]: segfault at 0 ip ...cdd56c in libopenvino.so.2541`
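For triage, the in-library offset is just the absolute instruction pointer from `dmesg` minus the library's mapping base from `/proc/<pid>/maps`; that offset can then be fed to `addr2line -e libopenvino.so.2541` if the build carries symbols. A minimal sketch of the arithmetic, with made-up addresses chosen to match the offset reported above:

```python
# Hypothetical values: an absolute faulting ip (as printed by dmesg) and the
# base address of the libopenvino.so.2541 mapping (from /proc/<pid>/maps).
ip = 0x7F3C0CCDD56C    # made-up absolute instruction pointer
base = 0x7F3C0C000000  # made-up mapping base of the library

# The offset inside the .so is the difference; here it reproduces 0xcdd56c.
offset = ip - base
print(hex(offset))  # 0xcdd56c
```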
## What We've Tried

- ✅ Upgraded NPU userspace from 1.28.0 → 1.32.0 (latest from intel/linux-npu-driver) — still segfaults
- ✅ Installed standalone `openvino==2026.1.0` in the venv — ORT still loads its bundled `libopenvino.so.2541`, segfault persists
- ✅ Confirmed `/dev/accel0` exists and the `intel_vpu` module is loaded
- ✅ `ldd` shows no missing libraries
- ✅ `CPUExecutionProvider` works perfectly as a workaround
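The claim that ORT loads its bundled copy (rather than the standalone `openvino` install) can be verified from inside the process by scanning `/proc/<pid>/maps`. A small Linux-only helper, shown as a sketch (in an affected venv it would only report `libopenvino` after `import onnxruntime`):

```python
import os


def mapped_libs(substr: str) -> list[str]:
    """Return file paths of shared objects mapped into this process
    whose path contains `substr` (Linux-only; reads /proc/<pid>/maps)."""
    paths = set()
    with open(f"/proc/{os.getpid()}/maps") as maps:
        for line in maps:
            parts = line.split()
            # File-backed mappings carry the path as the last field.
            if len(parts) >= 6 and substr in parts[-1]:
                paths.add(parts[-1])
    return sorted(paths)


# In an affected venv, call this after `import onnxruntime`; the output is
# expected to include the bundled libopenvino.so.2541 path inside the
# onnxruntime package rather than a system-wide OpenVINO install.
print(mapped_libs("libopenvino"))
```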
## Expected Behavior

`OpenVINOExecutionProvider` should either:

- Successfully initialize and use the NPU, or
- Fail gracefully with a clear error message instead of segfaulting
## Request

- Is there a known incompatibility between ORT-OpenVINO 1.24.1's bundled OpenVINO 2025.4.1 and the `intel_vpu` driver in kernel 6.17.x?
- Will the next `onnxruntime-openvino` release bundle a newer OpenVINO (2026.x) that may resolve this?
- Is there a way to force ORT to use a system-installed OpenVINO instead of its bundled `libopenvino.so.2541`?
## Workaround

Using `CPUExecutionProvider` — encoder inference is 88 ms on CPU vs ~50 ms on NPU. Acceptable, but we'd prefer the NPU when available.
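We wire the fallback through a small provider-selection helper. Note that filtering on `ort.get_available_providers()` alone does not avoid this bug: the OpenVINO EP is reported as available and only crashes at session creation, so the helper takes an explicit opt-out flag. The function name and flag are our own, not an ORT API:

```python
def pick_providers(available: list[str], allow_openvino: bool = True) -> list[str]:
    """Choose an execution-provider list with a guaranteed CPU fallback.

    `available` is expected to come from ort.get_available_providers().
    `allow_openvino=False` is the current workaround for the NPU segfault:
    merely listing CPU after OpenVINO does not help, because session
    creation crashes before ORT can fall back.
    """
    order = ["OpenVINOExecutionProvider", "CPUExecutionProvider"]
    if not allow_openvino:
        order = ["CPUExecutionProvider"]
    chosen = [p for p in order if p in available]
    return chosen or ["CPUExecutionProvider"]
```

Usage sketch: `ort.InferenceSession(path, providers=pick_providers(ort.get_available_providers(), allow_openvino=False))`, flipping the flag back to `True` once the segfault is fixed.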