Releases: JuliaGPU/GPUCompiler.jl

v1.15.1

02 Jun 11:18
3af18ae

GPUCompiler v1.15.1

Diff since v1.15.0

Merged pull requests:

  • Metal: extend address-space inference to integer pointer arguments (#819) (@maleadt)

v1.15.0

02 Jun 08:07
1535f33

GPUCompiler v1.15.0

Diff since v1.14.0

Merged pull requests:

  • Metal: include the debug level in the runtime slug (#817) (@maleadt)
  • Add debug level to the compiler options (#818) (@maleadt)

v1.14.0

01 Jun 14:28
d76e0c4

GPUCompiler v1.14.0

Diff since v1.13.3

Merged pull requests:

  • Gate VectorCombinePass on can_vectorize(job) (#814) (@michel2323)
  • Metal: extend address-space inference across call arguments (#816) (@maleadt)

Closed issues:

  • VectorCombinePass in buildScalarOptimizerPipeline produces invalid SPIR-V on OpenCLBackend (#813)
  • emit_trap removed (#815)

v1.13.3

28 May 08:42
4c650f2

GPUCompiler v1.13.3

Diff since v1.13.2

Merged pull requests:

  • Metal: turn device-exception traps into returns to avoid GPU hangs (#810) (@maleadt)
  • PTX: Fix regression in kernel metadata on LLVM 20. (#811) (@maleadt)
  • Unify unreachable control flow lowering across back-ends (#812) (@maleadt)

v1.13.2

26 May 16:20
19c5a3f

GPUCompiler v1.13.2

Diff since v1.13.1

Merged pull requests:

  • Improve the pipeline with callbacks (#785) (@maleadt)
  • Clean-up some FileCheck invocations. (#806) (@maleadt)
  • PTX: add PTXRSqrtFastPass to fold afn 1/sqrt(x) to nvvm.rsqrt.approx (#807) (@maleadt)
  • Make gpu_* runtime stubs CPU-AOT-safe via weak linkage. (#808) (@maleadt)
  • Strip back-end runtime stub bodies before InternalizePass. (#809) (@maleadt)

Closed issues:

  • PTX: investigate if custom NVVM Reflect pass is still needed (#547)

v1.13.1

20 May 14:31
f7d7418

GPUCompiler v1.13.1

Diff since v1.13.0

v1.13.0

20 May 11:31
eded413

GPUCompiler v1.13.0

Diff since v1.12.0

Merged pull requests:

  • Add PTXFDivFastPass to lower fdiv fast to NVPTX approximate division (#800) (@vchuravy)
  • Register deferred_codegen in GlobalJD on Julia 1.14+. (#802) (@maleadt)
  • Fixes for nightly (#803) (@maleadt)
  • Add a pass to apply fastmath attributes. (#804) (@maleadt)

v1.12.0

19 May 10:25
ef59201

GPUCompiler v1.12.0

Diff since v1.11.1

Merged pull requests:

  • Bump julia-actions/setup-julia from 2 to 3 (#786) (@dependabot[bot])
  • PTX: Add support for selecting the GPU feature set. (#798) (@maleadt)

v1.11.1

14 May 07:33
8ebf5bd

GPUCompiler v1.11.1

Diff since v1.11.0

Merged pull requests:

  • Revert "Fix always_inline via inlining policy override" (#797) (@maleadt)

Closed issues:

  • Pass registration broken on recent release? (#789)

v1.11.0

13 May 09:18
15d4209

GPUCompiler v1.11.0

Diff since v1.10.0

Merged pull requests:

  • Turn instcombine into a flag (and use instsimplify otherwise) (#791) (@wsmoses)
  • Adjust to upcoming sparam_vals representation change (#792) (@Keno)
  • Consider assertions for runtime module (#793) (@vchuravy)
  • Fix always_inline via inlining policy override (#795) (@maleadt)
  • Adapt to LLVM.jl change. (#796) (@maleadt)

Closed issues:

  • always_inline is broken on 1.11 (#527)