docs: troubleshoot CUDA device-LTO elfLink failure on apt nvidia-cuda-toolkit by IvanaGyro · Pull Request #876 · Cytnx-dev/Cytnx

IvanaGyro · 2026-06-03T09:42:36Z

Summary

Adds a Build troubleshooting section to docs/source/adv_install.rst documenting the CUDA device-LTO failure:

nvlink fatal : elfLink linker library load error

This surfaced while preparing a CUDA build on Ubuntu and was initially misattributed to glibc 2.34+ empty stub archives. The doc records the real cause and the fix so others don't chase the same dead end.

What the section explains

- Symptom: a -DUSE_CUDA=ON build configures fine but the device-link step aborts with the elfLink error. It only appears because Cytnx enables CMAKE_INTERPROCEDURAL_OPTIMIZATION on non-Apple builds, which (with CUDA separable compilation) turns on CUDA device LTO (nvcc -dlto).
- Cause: the Debian/Ubuntu nvidia-cuda-toolkit apt package ships libnvvm.so in /usr/lib/x86_64-linux-gnu/ but not under the toolkit's lib64/, where nvcc points nvlink via -nvvmpath=/usr/lib/nvidia-cuda-toolkit. nvlink then can't open /usr/lib/nvidia-cuda-toolkit/lib64/libnvvm.so. The empty glibc stub archives are explicitly called out as not the cause (nvlink tolerates them).
- Fix (recommended): install a complete CUDA toolkit via conda (conda install -c nvidia cuda) or NVIDIA's installer, both of which keep libnvvm.so under nvvm/lib64.
- Workaround: symlink the packaged libnvvm.so into the path nvlink searches.
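
A quick way to tell whether a machine is affected is to test for the file nvlink tries to open. The following is a minimal sketch that assumes the apt layout described above; `check_nvvm` is a hypothetical helper name, not something shipped by Cytnx or CUDA.

```shell
# Hypothetical helper: report whether the libnvvm.so that nvlink expects exists.
# $1 is the directory nvcc passes via -nvvmpath; the default below is the
# apt-package location named in this PR.
check_nvvm() {
    nvvm_root="${1:-/usr/lib/nvidia-cuda-toolkit}"
    target="$nvvm_root/lib64/libnvvm.so"
    # -e follows symlinks, so a dangling symlink is reported as missing too.
    if [ -e "$target" ]; then
        echo "ok: $target"
    else
        echo "missing: $target"
        return 1
    fi
}
```

Calling `check_nvvm` with no argument checks the apt layout; passing another toolkit root checks that one instead.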

Evidence

All verified on glibc 2.39: device LTO links cleanly on conda CUDA 12.0–12.9 (even with the empty stub archives on the link line) and on a standard install; it fails only on the apt-packaged 12.0, with or without stubs. strace confirmed that the single failing openat is for /usr/lib/nvidia-cuda-toolkit/lib64/libnvvm.so, and adding that one symlink makes the link succeed.
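
The strace check can be approximated as follows. This is a sketch rather than the exact commands used in the investigation: the `failed_nvvm_opens` helper, the `nvlink.trace` file name, and the `build` directory are all assumptions made for illustration.

```shell
# Hypothetical filter: read strace output on stdin and print the paths of
# openat() calls that failed with ENOENT and mention libnvvm.
failed_nvvm_opens() {
    grep 'openat(' | grep 'ENOENT' | grep 'libnvvm' |
        sed -n 's/.*openat([^"]*"\([^"]*\)".*/\1/p'
}

# Example capture (assumes strace is installed and the build tree is ./build):
#   strace -f -e trace=openat -o nvlink.trace cmake --build build
#   failed_nvvm_opens < nvlink.trace
```

If the filter prints a path under the toolkit's lib64/, that matches the failure mode described here.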

Notes

Docs-only change; no code touched.
Born out of the same investigation as #875 ("cmake: portable CUDA arch default + require CMake 3.25 for device LTO"), the CMake portability and device-LTO PR, but kept separate since this one is documentation only.
Branched from latest master.

gemini-code-assist

Code Review

This pull request adds a "Build troubleshooting" section to the advanced installation documentation, explaining how to resolve a CUDA device link error (elfLink linker library load error) caused by a layout issue in the Debian/Ubuntu nvidia-cuda-toolkit package. The feedback suggests formatting the shell code blocks with a space after the $ prompt for consistency, and using the unversioned libnvvm.so as the symlink source to make the workaround more robust and future-proof.


codecov · 2026-06-03T10:00:58Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 29.49%. Comparing base (ee9d856) to head (f0dc9c1).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #876   +/-   ##
=======================================
  Coverage   29.49%   29.49%           
=======================================
  Files         241      241           
  Lines       35512    35512           
  Branches    14777    14777           
=======================================
  Hits        10475    10475           
  Misses      17784    17784           
  Partials     7253     7253

| Flag | Coverage Δ |
|---|---|
| cpp | `29.09% <ø> (ø)` |
| python | `52.71% <ø> (ø)` |


| Components | Coverage Δ |
|---|---|
| C++ backend | `30.74% <ø> (ø)` |
| Python bindings | `17.09% <ø> (ø)` |
| Python package | `52.71% <ø> (ø)` |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ee9d856...f0dc9c1.


…lkit

Add a "Build troubleshooting" section to the advanced install guide covering "nvlink fatal : elfLink linker library load error" during a CUDA-enabled build.

On non-Apple builds Cytnx enables CMAKE_INTERPROCEDURAL_OPTIMIZATION, which together with CUDA separable compilation turns on CUDA device LTO (nvcc -dlto). The device link step loads NVVM. The Debian/Ubuntu nvidia-cuda-toolkit apt package installs libnvvm.so under /usr/lib/x86_64-linux-gnu/ but not under the toolkit's lib64 directory, where nvcc points nvlink via -nvvmpath=/usr/lib/nvidia-cuda-toolkit; nvlink then fails to open /usr/lib/nvidia-cuda-toolkit/lib64/libnvvm.so.

The empty glibc 2.34+ stub archives (libpthread.a / librt.a / libdl.a) are unrelated and are tolerated by nvlink, so they are explicitly called out as not being the cause.

Document the recommended fix (install a complete CUDA toolkit from conda or NVIDIA so libnvvm.so sits in nvvm/lib64) and a symlink workaround for the distribution package.

Co-authored-by: Claude <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f0dc9c1986


chatgpt-codex-connector · 2026-06-03T16:19:08Z

+.. code-block:: shell
+
+    $sudo mkdir -p /usr/lib/nvidia-cuda-toolkit/lib64
+    $sudo ln -s /usr/lib/x86_64-linux-gnu/libnvvm.so /usr/lib/nvidia-cuda-toolkit/lib64/libnvvm.so


Link to the versioned libnvvm soname

On stock Debian/Ubuntu apt installs, libnvvm4 provides /usr/lib/x86_64-linux-gnu/libnvvm.so.4 and .4.0.0, not an unversioned /usr/lib/x86_64-linux-gnu/libnvvm.so (Ubuntu jammy, Debian sid). In that environment this ln -s command creates a dangling /usr/lib/nvidia-cuda-toolkit/lib64/libnvvm.so, so the documented workaround still leaves nvlink unable to load NVVM. Please point the symlink at the actual versioned file found by find or make the command use that discovered path.
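
One way to act on this suggestion is sketched below. `link_nvvm` is a hypothetical helper, not part of the PR; it assumes that linking the first regular file matching `libnvvm.so*` is acceptable, and it skips symlinks so the resulting link cannot dangle.

```shell
# Hypothetical helper: link the first real libnvvm.so* file found under a
# search root into <nvvm_root>/lib64/libnvvm.so.
#   $1 = directory to search (for example /usr/lib)
#   $2 = directory nvcc passes via -nvvmpath
link_nvvm() {
    search_root="$1"
    nvvm_root="$2"
    # -type f skips symlinks such as libnvvm.so.4, so src is a real file.
    src=$(find "$search_root" -name 'libnvvm.so*' -type f 2>/dev/null | sort | head -n 1)
    if [ -z "$src" ]; then
        echo "no libnvvm.so* found under $search_root" >&2
        return 1
    fi
    mkdir -p "$nvvm_root/lib64"
    ln -sf "$src" "$nvvm_root/lib64/libnvvm.so"
    echo "$src"
}
```

For the system paths named in this thread the function would need to run as root, with `/usr/lib` and `/usr/lib/nvidia-cuda-toolkit` as its two arguments.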


pcchen · 2026-06-07T02:54:04Z

Review: PR #876 — "docs: troubleshoot CUDA device-LTO elfLink failure on apt nvidia-cuda-toolkit"

Overview

Docs-only addition of a "Build troubleshooting" section covering the "nvlink fatal : elfLink linker library load error" failure that hits users building with the Debian/Ubuntu nvidia-cuda-toolkit apt package and CUDA device LTO enabled.

Accuracy

The technical content is correct and well-researched:

Root cause (apt package puts libnvvm.so in /usr/lib/x86_64-linux-gnu/ instead of lib64/) is accurate and precisely described.
"Not the empty glibc stub archives" callout is valuable — this is a common red herring and explicitly ruling it out saves debugging time.
Conda fix and symlink workaround are both sound. The find /usr -name 'libnvvm.so*' tip for non-x86_64 hosts is a good touch.
The comment that regular (non-LTO) device linking doesn't load NVVM is correct, and it explains why the error only appears once CMAKE_INTERPROCEDURAL_OPTIMIZATION is on.

Two minor issues

1. Missing space after $ prompt in code blocks

Both shell examples are formatted as $conda ... and $sudo ... (no space), which looks like a variable expansion rather than a shell prompt. The rest of adv_install.rst uses $ command (with a space). Should be:

    $ conda install -c nvidia cuda

    $ sudo mkdir -p /usr/lib/nvidia-cuda-toolkit/lib64
    $ sudo ln -s /usr/lib/x86_64-linux-gnu/libnvvm.so /usr/lib/nvidia-cuda-toolkit/lib64/libnvvm.so

2. Missing third option: disable device LTO

Users who need a quick escape hatch without changing their toolkit installation could simply disable LTO at configure time. Worth adding a brief note after the workaround:

**Alternative: disable device LTO.** If neither option above is practical,
configure with ``-DCMAKE_INTERPROCEDURAL_OPTIMIZATION=OFF`` to skip device LTO
entirely. The build will complete but without link-time optimizations.

RST structure

Section hierarchy (* for level 1, - for subsection) is consistent with the rest of the file. Underline lengths are correct. Blank lines after titles are present. No structural issues.

Summary

| Item | Status |
|---|---|
| Technical accuracy | ✅ Correct and well-verified |
| glibc stub red-herring callout | ✅ Valuable addition |
| RST structure | ✅ Consistent |
| `$conda` / `$sudo` prompt spacing | ⚠️ Missing space; should be `$ conda`, `$ sudo` |
| No mention of `-DCMAKE_INTERPROCEDURAL_OPTIMIZATION=OFF` escape hatch | ⚠️ Worth adding |

Posted by Claude Code on behalf of @pcchen

gemini-code-assist Bot reviewed Jun 3, 2026


IvanaGyro force-pushed the claude/docs-cuda-lto-troubleshooting branch from 58c6c3b to f0dc9c1 on June 3, 2026 11:52

IvanaGyro marked this pull request as ready for review June 3, 2026 16:16

IvanaGyro requested review from manuschneider and pcchen June 3, 2026 16:17

chatgpt-codex-connector Bot reviewed Jun 3, 2026

IvanaGyro wants to merge 1 commit into master from claude/docs-cuda-lto-troubleshooting
