P2P shows CNS on kernel 7.0.0-3-pve (Debian 13) with 2× RTX 3090 #24

@svilendotorg

Problem

P2P shows CNS (Chipset Not Supported) on Linux kernel 7.0.0-3-pve (Proxmox VE / Debian 13) even with the patched P2P modules. The same patch works perfectly on 6.17.13-6-pve on identical hardware.

Hardware

| Component | Detail |
| --- | --- |
| CPU | AMD EPYC 7443P (24C/48T, PCIe 4.0) |
| Mobo | Supermicro MBD-H12SSL-CT (SP3, ATX) |
| GPU #1 | RTX 3090 @ PCIe x16 (01:00.0, Gen4 x16) |
| GPU #2 | RTX 3090 @ PCIe x16 (c1:00.0, Gen4 x16) |
| OS | Proxmox VE (Debian 13) |
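
For anyone comparing the two kernels, the link figures in the table can be read back from sysfs. This is a minimal sketch, not part of the original report: the helper name `link_status` is hypothetical, and PCI domain `0000` is an assumption. Gen4 corresponds to 16.0 GT/s; note that a GPU may report a lower speed while idle.

```shell
#!/bin/sh
# link_status BDF [SYSFS_ROOT]  (hypothetical helper)
# Prints "<current link speed> x<current link width>" for one PCI device.
link_status() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    printf '%s x%s\n' "$(cat "$dev/current_link_speed")" "$(cat "$dev/current_link_width")"
}

# Addresses from the hardware table; domain 0000 is assumed.
for bdf in 0000:01:00.0 0000:c1:00.0; do
    if [ -r "/sys/bus/pci/devices/$bdf/current_link_speed" ]; then
        printf '%s: %s\n' "$bdf" "$(link_status "$bdf")"
    fi
done
```

Running this on both kernels and comparing the output would show whether the links themselves train differently.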

Driver & Kernel

| Item | Value |
| --- | --- |
| Driver | 595.58.03 |
| CUDA | 13.2.1 |
| Branch | 595.58.03-p2p |
| Working kernel | 6.17.13-6-pve ✅ (P2P OK) |
| Broken kernel | 7.0.0-3-pve ❌ (P2P CNS) |

Reproduction

# Install from the p2p branch
git clone -b 595.58.03-p2p --single-branch https://github.com/aikitoria/open-gpu-kernel-modules
cd open-gpu-kernel-modules && ./install.sh

# Check P2P
nvidia-smi topo -p2p r

On 6.17.13-6-pve → P2P reads/writes show OK.
On 7.0.0-3-pve → P2P reads/writes show CNS.
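
To make the comparison between kernels repeatable, the matrix printed by `nvidia-smi topo -p2p r` can be summarized per boot. This is a rough sketch, not part of the original report: the helper name `count_p2p_status` is hypothetical, and it assumes matrix rows start with `GPU<n>` followed by whitespace-separated status cells.

```shell
#!/bin/sh
# count_p2p_status STATUS  (hypothetical helper)
# Reads `nvidia-smi topo -p2p r` output on stdin and prints how many
# matrix cells carry the given status (for example CNS or OK).
count_p2p_status() {
    awk -v want="$1" '
        # Only matrix rows: first field is GPU<n>; legend lines are skipped.
        $1 ~ /^GPU[0-9]+$/ && NF > 1 {
            for (i = 2; i <= NF; i++) if ($i == want) n++
        }
        END { print n + 0 }
    '
}

# Only runs where nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    printf '%s: %s CNS cell(s)\n' "$(uname -r)" \
        "$(nvidia-smi topo -p2p r | count_p2p_status CNS)"
fi
```

With two GPUs, a fully broken boot would be expected to report two CNS cells (one per direction) and a working boot zero.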

Additional Notes

  • Kernel 7.0.0 also ships nova_core (the in-kernel open-source Rust driver for NVIDIA GPUs), which auto-binds to one of the two GPUs and causes 0xbadf5720 PCIe register errors. Blacklisting it via modprobe (blacklist nova and blacklist nova_core) works around that, but P2P still shows CNS afterwards.
  • Each GPU is on its own PCIe root port (GPP0 and GPP3), with no shared switch.
  • Full system details available on request.
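
One way to confirm that the blacklist took effect is to check which kernel driver each GPU is bound to in sysfs. This is a minimal sketch, not part of the original report: the helper name `bound_driver` is hypothetical, and PCI domain `0000` is an assumption.

```shell
#!/bin/sh
# Blacklist entries as described in the note above (the modprobe.d file
# name is up to the admin):
#   blacklist nova
#   blacklist nova_core

# bound_driver BDF [SYSFS_ROOT]  (hypothetical helper)
# Prints the name of the kernel driver bound to a PCI device, or "none".
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Addresses from the hardware table; domain 0000 is assumed.
for bdf in 0000:01:00.0 0000:c1:00.0; do
    if [ -e "/sys/bus/pci/devices/$bdf" ]; then
        printf '%s -> %s\n' "$bdf" "$(bound_driver "$bdf")"
    fi
done
```

If the workaround is in place, both devices should presumably report `nvidia` rather than a Nova driver.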

Expected Behavior

P2P should show OK on 7.0.0-3-pve the same as on 6.17.13-6-pve.
