One-command vLLM installation for NVIDIA DGX Spark with Blackwell GB10 GPUs (sm_121 architecture)
Updated Oct 28, 2025 · Shell
Serve the home! An inference stack for your NVIDIA DGX Spark, aka the Grace Blackwell AI supercomputer on your desk. Mostly vLLM-based for now and single-Spark only. For the not-so-rich buddies.
Headless remote desktop to your DGX Spark in crystal-clear 4K.
Turn any NVIDIA GPU into a local AI platform. Inference + fine-tuning in your browser. One command to start, automatic clustering.
A lightweight web UI for managing AI models on the NVIDIA DGX Spark. Pull Ollama models, download from HuggingFace, manage LiteLLM routing, and control SGLang or vLLM — all from one browser tab.
(Experimental) A high-throughput and memory-efficient inference and serving engine for LLMs optimized for GB10 homelabs
GPU/CUDA-accelerated voice control stack for Home Assistant. Runs on x86/x64 and ARM64 (including the NVIDIA DGX Spark). 100% Local - No Cloud, No Subscriptions.
SGLang optimizations for NVIDIA Spark (GB10) — SM121 Grace Blackwell
Enhanced GPU throttle diagnostic for DGX Spark (GB10): NVML direct telemetry, throttle cause decoder, PCIe link monitoring, baseline drift detection, timeline capture.
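A throttle-cause decoder like the one described above can be sketched in a few lines: NVML reports current throttle causes as a bitmask, and the bit values below are the documented `nvmlClocksThrottleReason*` constants from `nvml.h`. This is an illustrative pure-Python sketch, not the repo's actual code; `decode_throttle_reasons` is a hypothetical helper name.

```python
# NVML clocksThrottleReasons bitmask constants (values from nvml.h).
NVML_THROTTLE_REASONS = {
    0x0000000000000001: "GPU idle",
    0x0000000000000002: "applications clocks setting",
    0x0000000000000004: "SW power cap",
    0x0000000000000008: "HW slowdown",
    0x0000000000000010: "sync boost",
    0x0000000000000020: "SW thermal slowdown",
    0x0000000000000040: "HW thermal slowdown",
    0x0000000000000080: "HW power brake slowdown",
    0x0000000000000100: "display clocks setting",
}


def decode_throttle_reasons(mask: int) -> list[str]:
    """Decode an NVML clocks-throttle-reasons bitmask into readable causes.

    The mask comes from nvmlDeviceGetCurrentClocksThrottleReasons(); a zero
    mask means the GPU is running at full clocks with no active throttle.
    """
    if mask == 0:
        return ["none"]
    return [name for bit, name in NVML_THROTTLE_REASONS.items() if mask & bit]
```

On real hardware you would obtain the mask via `pynvml` and feed it to the decoder; the table makes an otherwise opaque hex value (e.g. `0x24` = SW power cap + SW thermal slowdown) immediately readable.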
DGX Spark (GB10/SM121) platform support for Meta's KernelAgent — auto-detect, hardware constraints, safe Triton configs
Pre-built PyTorch wheels and build scripts for NVIDIA DGX Spark (GB10, sm_121, Blackwell, CUDA 13.0, ARM64)
This project is the ARM-architecture port of Unsloth.
vLLM installation for NVIDIA DGX Spark with Blackwell GB10 GPUs
Cycle-accurate UMA fault latency and bandwidth measurement for NVIDIA GPUs. C and PTX. No Python. Pascal (SM 6.0) through Blackwell GB10 (SM 12.1).
A practical guide to multi-node NCCL over a switched RoCE fabric on NVIDIA GB10 (DGX Spark class), documenting the gaps in NVIDIA's official playbooks.
Deliver a scalable LLM inference API in TypeScript and Python with GPU scheduling, dynamic batching, and multi-modal support for production use.
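The dynamic-batching policy mentioned above follows a common pattern: a batch is flushed either when it reaches a maximum size or when a maximum wait has elapsed since its first request arrived. A minimal deterministic sketch of that policy (my own illustration, not this repo's API; `plan_batches` is a hypothetical name) operating on request arrival timestamps:

```python
def plan_batches(arrivals, max_batch=4, max_wait_s=0.05):
    """Simulate a dynamic batcher over sorted request arrival times.

    A batch flushes when it holds max_batch requests, or when the next
    request arrives after max_wait_s has elapsed since the batch's first
    request. Returns batches as lists of request indices.
    """
    batches, current, deadline = [], [], 0.0
    for i, t in enumerate(arrivals):
        # Flush a pending batch whose wait deadline has already passed.
        if current and t > deadline:
            batches.append(current)
            current = []
        if not current:
            deadline = t + max_wait_s  # deadline set by the first request
        current.append(i)
        if len(current) == max_batch:  # size-triggered flush
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches
```

For example, four requests arriving within 30 ms fill one batch of four, while a straggler 200 ms later lands in its own batch. A production server implements the same rule with a queue and a timer rather than a post-hoc plan.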
3-bit Lloyd-Max KV Cache Compression for LLM Inference on NVIDIA DGX Spark GB10 — 5.12x compression, 0.983 cosine similarity, pure numpy on ARM unified memory
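The Lloyd-Max quantizer behind that compression scheme is classical: alternate between assigning samples to their nearest level and moving each level to the centroid of its cell. A minimal numpy sketch for 3-bit (8-level) quantization follows; this is my own illustration of the technique, not the repo's implementation, and `lloyd_max_quantizer` is a hypothetical name. On Gaussian data the resulting cosine similarity lands near 0.98, consistent with the 0.983 figure quoted above.

```python
import numpy as np


def lloyd_max_quantizer(x, n_levels=8, n_iter=50):
    """Design a Lloyd-Max scalar quantizer for the samples in x.

    Returns (levels, idx): the n_levels codebook values and, for each
    sample, the index of its assigned level.
    """
    # Initialize levels at evenly spaced quantiles of the data.
    levels = np.quantile(x, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(n_iter):
        # Nearest-level assignment via midpoint thresholds.
        thresholds = (levels[:-1] + levels[1:]) / 2
        idx = np.searchsorted(thresholds, x)
        # Centroid update: move each level to the mean of its cell.
        for k in range(n_levels):
            mask = idx == k
            if mask.any():
                levels[k] = x[mask].mean()
    thresholds = (levels[:-1] + levels[1:]) / 2
    return levels, np.searchsorted(thresholds, x)


# Demo on synthetic Gaussian data (stand-in for KV-cache activations).
rng = np.random.default_rng(0)
x = rng.normal(size=8192).astype(np.float32)
levels, idx = lloyd_max_quantizer(x, n_levels=8)
xq = levels[idx]  # dequantized reconstruction
cos = float(x @ xq / (np.linalg.norm(x) * np.linalg.norm(xq)))
```

Storing 3-bit indices plus an 8-entry codebook in place of fp16 values is where the ~5x compression comes from; the quantizer design itself is offline and cheap.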