│ Test │ ──────────────── CPU ──────────────── │
Test (Worker) │ time (s) │ GC (s) │ GC % │ Alloc (MB) │ RSS (MB) │
codegen/assume (16) │ 4.76 │ 0.30 │ 6.3 │ 602.11 │ 1220.62 │
device/gather_scatter (17) │ 6.01 │ failed at 2026-05-30T14:46:10.253
codegen/reflection (19) │ 6.31 │ 0.55 │ 8.8 │ 787.81 │ 1220.62 │
device/control_flow (11) │ 7.34 │ failed at 2026-05-30T14:46:12.197
codegen/rng_intrinsics (16) │ 2.89 │ 0.04 │ 1.3 │ 312.60 │ 1220.62 │
examples/vadd (11) │ 1.25 │ failed at 2026-05-30T14:46:14.242
codegen/cse (17) │ 3.54 │ 0.06 │ 1.6 │ 525.42 │ 1500.14 │
device/types (20) │ 13.28 │ failed at 2026-05-30T14:46:17.623
examples/softmax (19) │ 7.25 │ failed at 2026-05-30T14:46:19.055
examples/batchmatmul (16) │ 5.94 │ failed at 2026-05-30T14:46:19.563
device/math (18) │ 14.77 │ failed at 2026-05-30T14:46:19.774
host/broadcast (12) │ 15.28 │ failed at 2026-05-30T14:46:20.281
examples/matmul (17) │ 5.23 │ failed at 2026-05-30T14:46:20.585
examples/fmha (15) │ 16.92 │ failed at 2026-05-30T14:46:21.095
codegen/kernel_state (17) │ 0.28 │ 0.00 │ 0.0 │ 47.77 │ 1566.61 │
ext/DLFP8TypesExt (11) │ 6.58 │ 0.03 │ 0.4 │ 366.58 │ 1539.62 │
device/slice (19) │ 2.46 │ failed at 2026-05-30T14:46:22.129
device/hints (13) │ 16.91 │ failed at 2026-05-30T14:46:22.547
codegen/slice (20) │ 4.13 │ 0.06 │ 1.4 │ 600.78 │ 1550.84 │
examples/transpose (12) │ 1.56 │ failed at 2026-05-30T14:46:22.752
device/views (16) │ 2.62 │ failed at 2026-05-30T14:46:22.958
codegen/bounds (11) │ 0.79 │ 0.00 │ 0.0 │ 101.62 │ 1539.62 │
device/print (17) │ 0.98 │ failed at 2026-05-30T14:46:23.161
device/integration (19) │ 0.46 │ failed at 2026-05-30T14:46:23.263
types (16) │ 0.51 │ 0.00 │ 0.0 │ 42.23 │ 1546.50 │
device/kernel_state (11) │ 0.42 │ failed at 2026-05-30T14:46:24.378
codegen/views (15) │ 2.29 │ 0.03 │ 1.2 │ 323.52 │ 1510.98 │
analysis/dataflow (12) │ 0.85 │ 0.00 │ 0.0 │ 46.16 │ 1553.40 │
device/broadcast (8) │ 19.81 │ failed at 2026-05-30T14:46:24.985
host/cache (20) │ 1.55 │ 0.00 │ 0.0 │ 176.50 │ 1577.55 │
codegen/no_wrap (13) │ 3.43 │ 0.04 │ 1.3 │ 421.16 │ 1543.97 │
codegen/fpmode (18) │ 6.09 │ 0.10 │ 1.6 │ 618.01 │ 1547.52 │
examples/moe (14) │ 22.68 │ failed at 2026-05-30T14:46:27.818
examples/layernorm (2) │ 23.23 │ failed at 2026-05-30T14:46:28.326
codegen/integration (9) │ 22.98 │ 1.00 │ 4.3 │ 3010.83 │ 1270.38 │
device/atomics (10) │ 23.65 │ failed at 2026-05-30T14:46:29.137
device/reductions (1) │ 25.70 │ failed at 2026-05-30T14:46:29.238
host/mapreduce (4) │ 24.44 │ failed at 2026-05-30T14:46:29.340
device/core (7) │ 27.89 │ failed at 2026-05-30T14:46:32.174
device/tile (5) │ 27.52 │ failed at 2026-05-30T14:46:32.579
examples/fft (17) │ 8.95 │ failed at 2026-05-30T14:46:32.883
codegen/operations (6) │ 39.48 │ 1.58 │ 4.0 │ 6231.41 │ 1346.41 │
CUDA_Compiler_jll v0.4.4 (fails)
CUDA_Compiler_jll v0.4.3 (passes)
Tested on cuTile.jl v0.3.0.
The outcome does not seem to depend on the CUDA_Tile_jll version.
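For anyone who wants to compare the two CUDA_Compiler_jll versions locally, here is a rough sketch of how the pinning could be done with Pkg. It is not the exact procedure used for the runs above. It assumes `cuTile` can be installed by name from a registry, that a CUDA-capable GPU is available, and that the test sandbox honors the version pinned in the active environment.

```julia
using Pkg

# Work in a throwaway environment so the default one is left untouched.
Pkg.activate(temp=true)

# Assumption: cuTile resolves by name from the registry.
# Swap "0.4.3" for "0.4.4" to try the version reported as failing.
Pkg.add([
    Pkg.PackageSpec(name="cuTile", version="0.3.0"),
    Pkg.PackageSpec(name="CUDA_Compiler_jll", version="0.4.3"),
])

# Pin the JLL so a later resolve does not move it to another version.
Pkg.pin("CUDA_Compiler_jll")

# Show which versions were actually resolved before running anything.
Pkg.status(["cuTile", "CUDA_Compiler_jll"])

# Run the package test suite against the pinned compiler JLL.
Pkg.test("cuTile")
```

Running this once per CUDA_Compiler_jll version, with everything else unchanged, should help confirm whether the JLL version alone decides between a passing and a failing run.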