Single GPU benchmark scripts #15514
davidrohr
left a comment
I didn't check anything in detail, but these are the things that immediately came to mind:
| # ROCm library injection is only useful for HIP runs. Keep it off by default for CUDA/NVIDIA containers, | ||
| # because mixed AMD/NVIDIA hosts can otherwise leak ROCm libraries into LD_LIBRARY_PATH. | ||
| if [[ "${GPUTYPE:-}" == "HIP" && "0$BENCH_AUTO_ROCM_LIBS" == "01" ]]; then |
With a recent bash you can simply use $BENCH_AUTO_ROCM_LIBS == 1.
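A minimal sketch of why the leading-zero trick is not needed: inside `[[ ]]`, bash does not word-split an unquoted expansion, so an unset or empty variable compares safely. The helper name `rocm_libs_enabled` is hypothetical, and the `${BENCH_AUTO_ROCM_LIBS:-}` form is used only so that the sketch also works under `set -u`.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: true only for HIP runs where the flag is exactly "1".
rocm_libs_enabled() {
  [[ "${GPUTYPE:-}" == "HIP" && "${BENCH_AUTO_ROCM_LIBS:-}" == 1 ]]
}

GPUTYPE=HIP
unset BENCH_AUTO_ROCM_LIBS
if rocm_libs_enabled; then echo "flag unset: enabled"; else echo "flag unset: disabled"; fi

BENCH_AUTO_ROCM_LIBS=1
if rocm_libs_enabled; then echo "flag set: enabled"; else echo "flag set: disabled"; fi
```

Without `set -u`, the plain `$BENCH_AUTO_ROCM_LIBS == 1` form from the comment behaves the same way.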
| export DPL_REPORT_PROCESSING="${DPL_REPORT_PROCESSING:-1}" | ||
| export FST_TMUX_NO_EPN="${FST_TMUX_NO_EPN:-1}" |
Not needed, since start_tmux.sh is not used.
| # ---------------------------------------------------------------------------------------------------------------------- | ||
| # Locate original workflow script. Keep the original untouched. | ||
| : "${GEN_TOPO_MYDIR:=$(dirname "$(realpath "$0")")}" |
Why don't you simply use $O2_ROOT/dpl-workflow.sh?
| export WORKFLOW_PARAMETERS="${WORKFLOW_PARAMETERS:-GPU,CTF}" | ||
| export GPUTYPE="${GPUTYPE:-CUDA}" | ||
| export NGPUS=1 | ||
| export NUMAGPUIDS=1 |
NUMAGPUIDS and NUMAID should not be set when NUMA pinning is not used.
| export EPNSYNCMODE="${EPNSYNCMODE:-0}" | ||
| export SYNCMODE="${SYNCMODE:-1}" | ||
| export SYNCRAWMODE="${SYNCRAWMODE:-0}" | ||
| export TIMEFRAME_RATE_LIMIT="${TIMEFRAME_RATE_LIMIT:-5}" | ||
| export GEN_TOPO_NO_TF_RATE_UPSCALING="${GEN_TOPO_NO_TF_RATE_UPSCALING:-1}" | ||
| export DISABLE_ROOT_OUTPUT="${DISABLE_ROOT_OUTPUT:-1}" | ||
| # Double pipeline requires zsraw input. Therefore default to raw TF input, not CTF. | ||
| export CTFINPUT="${CTFINPUT:-0}" | ||
| export RAWTFINPUT="${RAWTFINPUT:-1}" | ||
| export DIGITINPUT="${DIGITINPUT:-0}" | ||
| export EXTINPUT="${EXTINPUT:-0}" |
Why do you redefine all the defaults that come from setenv.sh?
I would set only the settings that you actually need.
That should be:
SYNCMODE=1
TIMEFRAME_RATE_LIMIT=5
RAWTFINPUT=1
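A minimal sketch of what the reduced defaults block could look like, assuming all other defaults are left to setenv.sh. The `:-` form keeps any value the user exported before calling the script, which matches the override behaviour of the lines under review.

```shell
#!/usr/bin/env bash

# Set only what the benchmark needs; everything else keeps its setenv.sh default.
export SYNCMODE="${SYNCMODE:-1}"
export TIMEFRAME_RATE_LIMIT="${TIMEFRAME_RATE_LIMIT:-5}"
export RAWTFINPUT="${RAWTFINPUT:-1}"

echo "SYNCMODE=$SYNCMODE TIMEFRAME_RATE_LIMIT=$TIMEFRAME_RATE_LIMIT RAWTFINPUT=$RAWTFINPUT"
```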
| source "$PWD/local_env.sh" | ||
| fi | ||
| export ALICE_O2_FST="${ALICE_O2_FST:-1}" |
This is a hack for running on MI100; I would not put it in this script.
| export ALICE_O2_FST="${ALICE_O2_FST:-1}" | ||
| if [[ -f "$GEN_TOPO_MYDIR/setenv.sh" ]]; then |
dpl-workflow.sh will source setenv.sh itself, so why do you source it here?
| # Let O2/core dumps land in the benchmark run directory, not in the original working directory. | ||
| export CORE_DUMP_DIR="${CORE_DUMP_DIR:-$RUNDIR}" | ||
| export O2_CORE_DUMP_DIR="${O2_CORE_DUMP_DIR:-$RUNDIR}" | ||
| export FAIRMQ_SHM_MONITOR_CONFIG="${FAIRMQ_SHM_MONITOR_CONFIG:-}" |
We do not run the SHM monitor, so why do you need this?
| (has_detector_reco ITS && ! has_detector_gpu ITS) && ! has_detector_from_global_reader ITS && add_W o2-its-reco-workflow "$ITS_CONFIG $ITS_STAGGERED $DISABLE_MC ${DISABLE_DIGIT_CLUSTER_INPUT:-} $DISABLE_ROOT_OUTPUT --pipeline $(get_N its-tracker ITS REST 1 ITSTRK),$(get_N its-clusterer ITS REST 1 ITSCL)" "$ITS_CONFIG_KEY;$ITSMFT_STROBES;$ITSEXTRAERR" | ||
| [[ ${DISABLE_DIGIT_CLUSTER_INPUT:-} =~ "--digits-from-upstream" ]] && has_detector_gpu ITS && ! has_detector_from_global_reader ITS && add_W o2-its-reco-workflow "--disable-tracking ${DISABLE_DIGIT_CLUSTER_INPUT:-} $ITS_STAGGERED $DISABLE_MC $DISABLE_ROOT_OUTPUT --pipeline $(get_N its-clusterer ITS REST 1 ITSCL)" "$ITS_CONFIG_KEY;$ITSMFT_STROBES;$ITSEXTRAERR" | ||
| (has_detector_reco TPC || has_detector_ctf TPC) && ! has_detector_from_global_reader TPC && add_W o2-gpu-reco-workflow "--gpu-reconstruction \"$GPU_CONFIG_SELF\" --input-type=$GPU_INPUT $DISABLE_MC --output-type $GPU_OUTPUT $([[ $TPC_CORR_OPT == *--disable-ctp-lumi-request* ]] && echo --disable-ctp-lumi-request) $ITS_STAGGERED --pipeline gpu-reconstruction:${N_TPCTRK:-1},gpu-reconstruction-prepare:${N_TPCTRK:-1} $GPU_CONFIG" "GPU_global.deviceType=$GPUTYPE;GPU_proc.debugLevel=0;$GPU_CONFIG_KEY;$TRACKTUNETPCINNER;" | ||
| (has_detector_reco TPC || has_detector_ctf TPC) && ! has_detector_from_global_reader TPC && add_W o2-gpu-reco-workflow "--gpu-reconstruction \"$GPU_CONFIG_SELF\" $MSLOG --input-type=$GPU_INPUT $DISABLE_MC --output-type $GPU_OUTPUT $([[ $TPC_CORR_OPT == *--disable-ctp-lumi-request* ]] && echo --disable-ctp-lumi-request) $ITS_STAGGERED --pipeline gpu-reconstruction:${N_TPCTRK:-1},gpu-reconstruction-prepare:${N_TPCTRK:-1} $GPU_CONFIG" "GPU_global.deviceType=$GPUTYPE;GPU_proc.debugLevel=0;$GPU_CONFIG_KEY;$TRACKTUNETPCINNER;" |
Instead of modifying dpl-workflow.sh, you can just set
ARGS_EXTRA_PROCESS_o2_gpu_reco_workflow="--log-timestamp-us"
in your benchmark script.
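For illustration, a sketch of how a per-workflow variable like this can be picked up without touching the workflow script. The function `extra_args_for` is hypothetical, and the rule that the variable name is the workflow name with dashes replaced by underscores is an assumption inferred from the name in the comment above, not checked against dpl-workflow.sh.

```shell
#!/usr/bin/env bash

# Hypothetical lookup: map a workflow name to the variable
# ARGS_EXTRA_PROCESS_<workflow name with '-' replaced by '_'>.
extra_args_for() {
  local var="ARGS_EXTRA_PROCESS_${1//-/_}"
  # Indirect expansion; yields an empty string when the variable is unset.
  printf '%s' "${!var:-}"
}

# What the benchmark script would export:
export ARGS_EXTRA_PROCESS_o2_gpu_reco_workflow="--log-timestamp-us"

echo "o2-gpu-reco-workflow $(extra_args_for o2-gpu-reco-workflow)"
```

The point of this design is that the benchmark script only exports a variable, so the original dpl-workflow.sh stays unmodified.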
| export DPL_REPORT_PROCESSING="${DPL_REPORT_PROCESSING:-1}" | ||
| export WORKFLOW_PARAMETERS="${WORKFLOW_PARAMETERS:-GPU,CTF}" | ||
| export GPUTYPE="${GPUTYPE:-CUDA}" |
I would perhaps not default to CUDA here, but require the user to set it, since the script is supposed to work equally for CUDA and for HIP. That avoids user error when the user doesn't provide it.
| export O2_GPU_DOUBLE_PIPELINE="${O2_GPU_DOUBLE_PIPELINE:-1}" | ||
| export O2_GPU_RTC="${O2_GPU_RTC:-1}" | ||
| export SYNCMODE="${SYNCMODE:-1}" | ||
| export DISABLE_ROOT_OUTPUT="${DISABLE_ROOT_OUTPUT:-1}" |
DISABLE_ROOT_OUTPUT is already enabled by default, so you can remove it here.
(And btw, the correct value for this setting would be DISABLE_ROOT_OUTPUT="--disable-root-output".)
| export RUN_BENCHMARK="${RUN_BENCHMARK:-0}" | ||
| echo "# Alien/JAliEn environment check:" |
I still don't understand why we need this alien token magic. If alien-token-info finds the token before running this script, that should be all that is needed?
davidrohr
left a comment
Looks mostly good now. I have only 2 additional comments.
I also want to run it as a test, compute the throughput manually, and compare that with what the script measures, as validation. Or have you already done that?
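A manual cross-check of this kind can be as simple as dividing the number of processed timeframes by the wall-clock duration. The sketch below is only an illustration: the helper `throughput_tf_per_s` and its inputs (a timeframe count and two timestamps in seconds) are hypothetical and do not reflect how the benchmark script obtains its numbers.

```shell
#!/usr/bin/env bash

# Hypothetical helper: timeframes per second from a count and two timestamps (seconds).
# Fails with a non-zero status when the interval is empty or negative.
throughput_tf_per_s() {
  local n_tf="$1" t_start="$2" t_end="$3"
  awk -v n="$n_tf" -v a="$t_start" -v b="$t_end" \
    'BEGIN { if (b <= a) exit 1; printf "%.3f\n", n / (b - a) }'
}

# Illustrative numbers only: 100 timeframes in 50 seconds.
throughput_tf_per_s 100 1000.0 1050.0
```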
| # Benchmark defaults. All can be overridden by exporting variables before calling this script. | ||
| case "${GPUTYPE:-}" in | ||
| CUDA|HIP) |
Why don't you allow OpenCL or CPU, in case someone wants to measure that for comparison? You could just check whether GPUTYPE is set.
I will include OpenCL and CPU in the options, but want an early failure in case the user specifies an incorrect type (e.g. a typo)
OK, but then you have to check for the string "OCL" :)
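A sketch of an early-failure check along the lines discussed in this thread. The function name `check_gputype` and the error messages are hypothetical. The accepted strings are CUDA and HIP from the diff, plus CPU and the "OCL" spelling pointed out above.

```shell
#!/usr/bin/env bash

# Hypothetical validator: accept only the known backends, fail early otherwise.
check_gputype() {
  case "${1:-}" in
    CUDA|HIP|OCL|CPU) return 0 ;;
    "") echo "GPUTYPE is not set; expected CUDA, HIP, OCL or CPU" >&2; return 1 ;;
    *)  echo "Unknown GPUTYPE '$1'; expected CUDA, HIP, OCL or CPU" >&2; return 1 ;;
  esac
}

# Demonstration: a typo such as "OpenCL" and an empty value are both rejected.
for t in CUDA OCL OpenCL ""; do
  if check_gputype "$t" 2>/dev/null; then echo "'$t' accepted"; else echo "'$t' rejected"; fi
done
```

In the benchmark script this would be called once near the top, for example as `check_gputype "${GPUTYPE:-}" || exit 1`.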
| trap cleanup_rundir EXIT | ||
| # Let O2/core dumps land in the benchmark run directory, not in the original working directory. | ||
| export CORE_DUMP_DIR="${CORE_DUMP_DIR:-$RUNDIR}" |
Are you sure we need both variables?
Aren't they sensible in case we core dump to debug stuff?
I mean mostly: why do we need 2 variables? But then, searching O2 and O2DPG for CORE_DUMP_DIR, I do not find anything. So who interprets this variable?
Error while checking build/O2/fullCI_slc9 for 4cb5d7c at 2026-06-16 16:00: Full log here.
Btw, why is the script named "gen_single_gpu_rtc_benchmark.sh"? What does "gen" stand for?
And when running it, it is not immediately clear how to provide input data. It asks to set the GPUTYPE, but with only the GPUTYPE set, it just exits immediately again.
Generate.
…mple metrics. Additionally adding RTC cache dir
This PR brings two scripts that benchmark the single GPU performance