SiorMeir (Collaborator) commented Feb 1, 2026

Description

Small fix that ensures DRA-enabled GPU resources are considered when designating nodes as CPU-only nodes.
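
To illustrate the intent, here is a minimal sketch, not the code in this PR. The `nodeResources` type, its fields, and `isCPUOnly` are hypothetical names chosen for the example; the scheduler's actual `node_info` types may look quite different. The idea is only that a node counts as CPU-only when it advertises no GPUs through either path.

```go
package main

import "fmt"

// nodeResources is a hypothetical, simplified view of what a node advertises.
// It is an assumption made for this sketch, not a KAI-scheduler type.
type nodeResources struct {
	devicePluginGPUs int // GPUs exposed as extended resources by a device plugin
	draGPUs          int // GPUs exposed through Dynamic Resource Allocation (DRA)
}

// isCPUOnly reports whether the node has no GPUs from either source.
// Checking only devicePluginGPUs would misclassify DRA-only GPU nodes.
func isCPUOnly(n nodeResources) bool {
	return n.devicePluginGPUs == 0 && n.draGPUs == 0
}

func main() {
	nodes := map[string]nodeResources{
		"cpu-node":        {},
		"plugin-gpu-node": {devicePluginGPUs: 8},
		"dra-gpu-node":    {draGPUs: 8},
	}
	for _, name := range []string{"cpu-node", "plugin-gpu-node", "dra-gpu-node"} {
		fmt.Printf("%s cpu-only=%v\n", name, isCPUOnly(nodes[name]))
	}
}
```

In this sketch, a node whose GPUs are visible only through DRA is no longer treated as CPU-only.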

Related Issues

Fixes #

Checklist

Note: Ensure your PR title follows the Conventional Commits format (e.g., feat(scheduler): add new feature)

  • Self-reviewed
  • Added/updated tests (if needed)
  • Updated documentation (if needed)

Breaking Changes

Additional Notes

github-actions bot commented Feb 1, 2026

📊 Performance Benchmark Results

Comparing PR (siormeir/DRA-with-cpu-only-nodes) vs main branch:

goos: linux
goarch: amd64
pkg: github.com/NVIDIA/KAI-scheduler/pkg/scheduler/actions
cpu: AMD EPYC 7763 64-Core Processor                
                                    │ main-bench.txt │            pr-bench.txt            │
                                    │     sec/op     │    sec/op     vs base              │
AllocateAction_SmallCluster-4           108.3m ±  0%   108.6m ±  6%       ~ (p=0.485 n=6)
AllocateAction_MediumCluster-4          136.7m ±  2%   136.5m ±  2%       ~ (p=0.699 n=6)
AllocateAction_LargeCluster-4           227.4m ± 12%   223.3m ± 14%       ~ (p=0.818 n=6)
ReclaimAction_SmallCluster-4            102.8m ±  0%   102.7m ±  0%  -0.08% (p=0.041 n=6)
ReclaimAction_MediumCluster-4           105.5m ±  0%   105.4m ±  0%       ~ (p=0.132 n=6)
PreemptAction_SmallCluster-4            103.6m ±  0%   103.6m ±  0%       ~ (p=0.394 n=6)
PreemptAction_MediumCluster-4           112.9m ±  0%   112.9m ±  0%       ~ (p=0.240 n=6)
ConsolidationAction_SmallCluster-4      113.8m ±  0%   113.8m ±  0%       ~ (p=0.589 n=6)
ConsolidationAction_MediumCluster-4     199.7m ±  1%   199.7m ±  1%       ~ (p=0.937 n=6)
FullSchedulingCycle_SmallCluster-4      105.2m ±  0%   105.2m ±  0%       ~ (p=0.310 n=6)
FullSchedulingCycle_MediumCluster-4     119.8m ±  0%   119.3m ±  0%  -0.46% (p=0.004 n=6)
FullSchedulingCycle_LargeCluster-4      157.5m ±  1%   158.4m ±  1%       ~ (p=0.240 n=6)
ManyQueues_MediumCluster-4              139.6m ±  0%   140.7m ±  1%  +0.78% (p=0.002 n=6)
GangScheduling_MediumCluster-4          156.3m ±  2%   158.5m ±  3%       ~ (p=0.310 n=6)
geomean                                 130.6m         130.6m        +0.03%

                                    │ main-bench.txt │            pr-bench.txt            │
                                    │      B/op      │     B/op      vs base              │
AllocateAction_SmallCluster-4           2.152Mi ± 0%   2.152Mi ± 1%       ~ (p=0.818 n=6)
AllocateAction_MediumCluster-4          11.84Mi ± 0%   11.84Mi ± 0%       ~ (p=0.818 n=6)
AllocateAction_LargeCluster-4           41.54Mi ± 0%   41.54Mi ± 0%       ~ (p=0.310 n=6)
ReclaimAction_SmallCluster-4            888.9Ki ± 1%   887.2Ki ± 1%       ~ (p=0.589 n=6)
ReclaimAction_MediumCluster-4           2.832Mi ± 0%   2.830Mi ± 0%       ~ (p=0.394 n=6)
PreemptAction_SmallCluster-4            1.007Mi ± 1%   1.005Mi ± 1%       ~ (p=0.699 n=6)
PreemptAction_MediumCluster-4           4.020Mi ± 0%   4.018Mi ± 0%       ~ (p=0.818 n=6)
ConsolidationAction_SmallCluster-4      5.608Mi ± 0%   5.605Mi ± 0%       ~ (p=0.589 n=6)
ConsolidationAction_MediumCluster-4     46.88Mi ± 0%   46.88Mi ± 0%       ~ (p=0.937 n=6)
FullSchedulingCycle_SmallCluster-4      1.372Mi ± 0%   1.373Mi ± 1%       ~ (p=0.310 n=6)
FullSchedulingCycle_MediumCluster-4     6.836Mi ± 0%   6.836Mi ± 0%       ~ (p=1.000 n=6)
FullSchedulingCycle_LargeCluster-4      22.83Mi ± 0%   22.83Mi ± 0%       ~ (p=0.699 n=6)
ManyQueues_MediumCluster-4              16.31Mi ± 0%   16.31Mi ± 0%       ~ (p=1.000 n=6)
GangScheduling_MediumCluster-4          17.17Mi ± 0%   17.17Mi ± 0%       ~ (p=0.937 n=6)
geomean                                 6.330Mi        6.328Mi       -0.03%

                                    │ main-bench.txt │           pr-bench.txt            │
                                    │   allocs/op    │  allocs/op   vs base              │
AllocateAction_SmallCluster-4            36.21k ± 0%   36.20k ± 0%       ~ (p=0.857 n=6)
AllocateAction_MediumCluster-4           325.2k ± 0%   325.2k ± 0%       ~ (p=0.775 n=6)
AllocateAction_LargeCluster-4            1.394M ± 0%   1.394M ± 0%       ~ (p=0.584 n=6)
ReclaimAction_SmallCluster-4             8.396k ± 0%   8.395k ± 0%       ~ (p=0.526 n=6)
ReclaimAction_MediumCluster-4            26.54k ± 0%   26.54k ± 0%       ~ (p=0.626 n=6)
PreemptAction_SmallCluster-4             11.19k ± 0%   11.19k ± 0%       ~ (p=0.861 n=6)
PreemptAction_MediumCluster-4            38.77k ± 0%   38.77k ± 0%       ~ (p=0.753 n=6)
ConsolidationAction_SmallCluster-4       73.56k ± 0%   73.56k ± 0%       ~ (p=0.485 n=6)
ConsolidationAction_MediumCluster-4      685.8k ± 0%   685.8k ± 0%       ~ (p=0.937 n=6)
FullSchedulingCycle_SmallCluster-4       21.36k ± 0%   21.37k ± 0%       ~ (p=0.361 n=6)
FullSchedulingCycle_MediumCluster-4      174.7k ± 0%   174.7k ± 0%       ~ (p=0.900 n=6)
FullSchedulingCycle_LargeCluster-4       727.2k ± 0%   727.3k ± 0%       ~ (p=0.665 n=6)
ManyQueues_MediumCluster-4               363.3k ± 0%   363.3k ± 0%       ~ (p=0.937 n=6)
GangScheduling_MediumCluster-4           597.0k ± 0%   597.0k ± 0%       ~ (p=0.846 n=6)
geomean                                  111.7k        111.7k       -0.00%

Legend

  • 📉 Negative delta = Performance improvement (faster)
  • 📈 Positive delta = Performance regression (slower)
  • p-value < 0.05 indicates statistically significant change
Raw benchmark data

PR branch:

goos: linux
goarch: amd64
pkg: github.com/NVIDIA/KAI-scheduler/pkg/scheduler/actions
cpu: AMD EPYC 7763 64-Core Processor                
BenchmarkAllocateAction_SmallCluster-4         	       9	 114666932 ns/op	 2280381 B/op	   36216 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 108479983 ns/op	 2257838 B/op	   36210 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 108655073 ns/op	 2256080 B/op	   36202 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 108230618 ns/op	 2254856 B/op	   36201 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 107913013 ns/op	 2258313 B/op	   36207 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 109243843 ns/op	 2254618 B/op	   36200 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 136381363 ns/op	12422840 B/op	  325202 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 136838408 ns/op	12419149 B/op	  325189 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 136499785 ns/op	12417733 B/op	  325196 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 137136249 ns/op	12417353 B/op	  325192 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 134229724 ns/op	12416294 B/op	  325188 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 136427380 ns/op	12416278 B/op	  325183 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 221575577 ns/op	43559011 B/op	 1394305 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 222988347 ns/op	43557432 B/op	 1394293 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 253554842 ns/op	43557281 B/op	 1394289 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 226444865 ns/op	43556452 B/op	 1394290 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 223678771 ns/op	43555881 B/op	 1394291 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 219738531 ns/op	43556115 B/op	 1394276 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102597786 ns/op	  905168 B/op	    8365 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102693501 ns/op	  907308 B/op	    8387 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102744068 ns/op	  909701 B/op	    8395 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102743757 ns/op	  910186 B/op	    8396 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102804786 ns/op	  906300 B/op	    8394 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102793069 ns/op	  918952 B/op	    8397 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105445938 ns/op	 2969447 B/op	   26540 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105357675 ns/op	 2961567 B/op	   26536 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105440468 ns/op	 2965434 B/op	   26537 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105364281 ns/op	 2969431 B/op	   26539 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105414214 ns/op	 2969392 B/op	   26539 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105366907 ns/op	 2961484 B/op	   26536 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103589949 ns/op	 1052008 B/op	   11187 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103687244 ns/op	 1055484 B/op	   11186 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103485199 ns/op	 1051831 B/op	   11187 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103627704 ns/op	 1055527 B/op	   11186 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103615947 ns/op	 1062674 B/op	   11190 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103382757 ns/op	 1051903 B/op	   11187 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112810188 ns/op	 4215311 B/op	   38772 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112985170 ns/op	 4210962 B/op	   38770 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 113042248 ns/op	 4210507 B/op	   38768 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 113040086 ns/op	 4215211 B/op	   38771 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112883792 ns/op	 4206331 B/op	   38767 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112913462 ns/op	 4215416 B/op	   38773 allocs/op
BenchmarkConsolidationAction_SmallCluster-4    	       9	 113385956 ns/op	 5877108 B/op	   73571 allocs/op
BenchmarkConsolidationAction_SmallCluster-4    	       9	 113781724 ns/op	 5872319 B/op	   73537 allocs/op
BenchmarkConsolidationAction_SmallCluster-4    	       9	 113850633 ns/op	 5877497 B/op	   73580 allocs/op
BenchmarkConsolidationAction_SmallCluster-4    	       9	 113769519 ns/op	 5878217 B/op	   73499 allocs/op

Main branch:

goos: linux
goarch: amd64
pkg: github.com/NVIDIA/KAI-scheduler/pkg/scheduler/actions
cpu: AMD EPYC 7763 64-Core Processor                
BenchmarkAllocateAction_SmallCluster-4         	      10	 108320352 ns/op	 2255974 B/op	   36207 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 108556789 ns/op	 2256965 B/op	   36205 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 108024378 ns/op	 2256341 B/op	   36207 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 108333773 ns/op	 2257010 B/op	   36204 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 108576521 ns/op	 2255583 B/op	   36204 allocs/op
BenchmarkAllocateAction_SmallCluster-4         	      10	 108341310 ns/op	 2257035 B/op	   36207 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 136944984 ns/op	12420661 B/op	  325201 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 136068427 ns/op	12417011 B/op	  325190 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 139561596 ns/op	12416939 B/op	  325190 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 137625391 ns/op	12416601 B/op	  325190 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 136412505 ns/op	12415869 B/op	  325184 allocs/op
BenchmarkAllocateAction_MediumCluster-4        	       8	 135126195 ns/op	12421628 B/op	  325201 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 235913500 ns/op	43557646 B/op	 1394299 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 218818501 ns/op	43553616 B/op	 1394265 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 213891765 ns/op	43556940 B/op	 1394291 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       5	 209167451 ns/op	43556902 B/op	 1394291 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       4	 250511444 ns/op	43554746 B/op	 1394284 allocs/op
BenchmarkAllocateAction_LargeCluster-4         	       4	 254384789 ns/op	43554378 B/op	 1394282 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102858547 ns/op	  905156 B/op	    8365 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102800264 ns/op	  906962 B/op	    8387 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102918076 ns/op	  910280 B/op	    8396 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102776408 ns/op	  910175 B/op	    8396 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102784092 ns/op	  914165 B/op	    8397 allocs/op
BenchmarkReclaimAction_SmallCluster-4          	      10	 102889771 ns/op	  919114 B/op	    8398 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105628679 ns/op	 2969426 B/op	   26539 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105221270 ns/op	 2969388 B/op	   26539 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105520553 ns/op	 2969482 B/op	   26539 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105648517 ns/op	 2965440 B/op	   26538 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105434643 ns/op	 2969524 B/op	   26540 allocs/op
BenchmarkReclaimAction_MediumCluster-4         	      10	 105478886 ns/op	 2961578 B/op	   26536 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103741068 ns/op	 1055927 B/op	   11188 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103593918 ns/op	 1047949 B/op	   11185 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103621448 ns/op	 1055876 B/op	   11188 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103673668 ns/op	 1052020 B/op	   11187 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103617689 ns/op	 1062327 B/op	   11188 allocs/op
BenchmarkPreemptAction_SmallCluster-4          	      10	 103560291 ns/op	 1055367 B/op	   11186 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112943635 ns/op	 4215033 B/op	   38770 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112340562 ns/op	 4215142 B/op	   38770 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112942392 ns/op	 4215309 B/op	   38772 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 113006890 ns/op	 4214797 B/op	   38769 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112645141 ns/op	 4206459 B/op	   38768 allocs/op
BenchmarkPreemptAction_MediumCluster-4         	       9	 112778132 ns/op	 4210939 B/op	   38770 allocs/op
BenchmarkConsolidationAction_SmallCluster-4    	       9	 113561530 ns/op	 5884336 B/op	   73630 allocs/op
BenchmarkConsolidationAction_SmallCluster-4    	       9	 113955280 ns/op	 5876344 B/op	   73573 allocs/op
BenchmarkConsolidationAction_SmallCluster-4    	       9	 113891354 ns/op	 5873097 B/op	   73552 allocs/op
BenchmarkConsolidationAction_SmallCluster-4    	       9	 113796075 ns/op	 5884048 B/op	   73555 allocs/op

github-actions bot commented Feb 1, 2026

Merging this branch will increase overall coverage

| Impacted Packages | Coverage Δ | 🤖 |
|---|---|---|
| github.com/NVIDIA/KAI-scheduler/pkg/scheduler/api/node_info | 69.70% (+0.46%) | 👍 |
| github.com/NVIDIA/KAI-scheduler/pkg/scheduler/api/pod_info | 60.00% (+1.03%) | 👍 |
| github.com/NVIDIA/KAI-scheduler/pkg/scheduler/plugins/resourcetype | 90.00% (ø) | |

Coverage by file

Changed files (no unit tests)

| Changed File | Coverage Δ | Total | Covered | Missed | 🤖 |
|---|---|---|---|---|---|
| github.com/NVIDIA/KAI-scheduler/pkg/scheduler/api/node_info/node_info.go | 67.33% (+0.67%) | 300 | 202 (+2) | 98 (-2) | 👍 |
| github.com/NVIDIA/KAI-scheduler/pkg/scheduler/api/pod_info/pod_info.go | 51.18% (+1.57%) | 127 | 65 (+2) | 62 (-2) | 👍 |

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/NVIDIA/KAI-scheduler/pkg/scheduler/api/node_info/node_info_test.go
  • github.com/NVIDIA/KAI-scheduler/pkg/scheduler/api/pod_info/pod_info_test.go
  • github.com/NVIDIA/KAI-scheduler/pkg/scheduler/plugins/resourcetype/resourcetype_test.go

SiorMeir marked this pull request as ready for review February 1, 2026 12:45