Allows accelerated inference on Intel hardware. To support this, a few small changes are made:
* addition of requirements/openvino.txt, which just lets pip install the OpenVINO runtime
* a new util.provider module, to avoid code duplication, containing
* get_available_providers, which filters out the supported ONNX Runtime providers
* tasks.analysis.get_provider_options, which avoids code duplication
* modifications to the Dockerfile, to allow for the optional package installation described above
deployment/.env.example

    # --- OpenVINO Acceleration ---
    RENDER_GID= # render group ID (use `stat -c "%g" /dev/dri/renderD128` on host to verify)
    OPENVINO_CONFIG_JSON_PATH= # path to have openvino load config https://onnxruntime.ai/docs/execution-providers/OpenVINO-ExecutionProvider.html#load_config
Maybe I could add something to the docs about what a normal config.json would look like.
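For reference, based on the linked load_config documentation, such a config.json maps device names to OpenVINO properties. A minimal sketch might look like the following (the property names are illustrative OpenVINO options, not values tested in this PR):

```json
{
  "GPU": {
    "PERFORMANCE_HINT": "THROUGHPUT",
    "CACHE_DIR": "/tmp/ov_cache"
  }
}
```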
    CLAP_ENABLED=true

    # --- OpenVINO Acceleration ---
    RENDER_GID= # render group ID (use `stat -c "%g" /dev/dri/renderD128` on host to verify)
Still need to add this to config.py, as well as the docker-compose reference files.
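A minimal sketch of what the config.py side could look like (assuming plain os.environ reads; the project's actual config mechanism is unknown, and the variable names are only assumed to mirror deployment/.env.example):

```python
import os

# Assumed names mirror deployment/.env.example; empty string means "not configured"
RENDER_GID: str = os.environ.get('RENDER_GID', '')
OPENVINO_CONFIG_JSON_PATH: str = os.environ.get('OPENVINO_CONFIG_JSON_PATH', '')
```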
    def get_provider_options(cuda_do_copy_in_default_stream: bool = False,
                             cuda_conv_algo_search_mode: str = 'EXHAUSTIVE') -> list[tuple[str, dict[str, Any]]]:
Unsure if this is the right place to put the function, or if it belongs in the new util.provider.
tasks/analysis.py
    available_providers = provider.get_available_providers()
    if 'OpenVINOExecutionProvider' in available_providers:
        vino_options = {
            'device_type': 'AUTO',
Might need to try other device_type values... need to play with AUTO/HETERO/MULTI. I'm only testing on an N100, trying to get the iGPU to help out with tagging my library.
ref: https://onnxruntime.ai/docs/execution-providers/OpenVINO-ExecutionProvider.html#device_type
Setting it to 'MULTI:GPU,CPU' does make the process appear in intel_gpu_top. Kinda curious how perf looks if I just force GPU only and turn off the CPU execution provider.
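The GPU-only experiment mentioned above could be sketched like this (the function name and shape are hypothetical; the provider names and the device_type option follow the ONNX Runtime OpenVINO EP docs linked earlier):

```python
from typing import Any

def build_providers(device_type: str = 'AUTO') -> list[tuple[str, dict[str, Any]]]:
    """Build an ONNX Runtime provider list that prefers OpenVINO.

    device_type may be e.g. 'AUTO', 'GPU', 'CPU', 'MULTI:GPU,CPU', 'HETERO:GPU,CPU'.
    """
    providers: list[tuple[str, dict[str, Any]]] = [
        ('OpenVINOExecutionProvider', {'device_type': device_type}),
    ]
    if device_type != 'GPU':
        # Keep the CPU EP as a fallback unless we're forcing GPU-only execution
        providers.append(('CPUExecutionProvider', {}))
    return providers
```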
Also still need to update the docker-compose reference files and the GitHub Actions build; just wanted to test the waters first.