SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Updated Apr 14, 2026 - Python
One of the SOTA quantization algorithms for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers.
AdaLLM is an NVFP4-first inference runtime for Ada Lovelace (RTX 4090) with FP8 KV cache and custom decode kernels. This repo targets NVFP4 weights and keeps the entire decode path in FP8.
A production-ready Docker setup for ComfyUI that unlocks the full potential of NVIDIA Blackwell GPUs (RTX 50 series) through 4-bit quantization with NVFP4.
[ACL 2026 Main] Code for the paper "ARCQuant: Boosting NVFP4 Quantization with Augmented Residual Channels for LLMs"
LLM fine-tuning with LoRA + NVFP4/MXFP8 on NVIDIA DGX Spark (Blackwell GB10)
Blackwell-optimized llama.cpp Docker image – works on all NVIDIA GPUs, but tuned for RTX 50 series. Built from scratch with CUDA 12.8, sm_120, NVFP4-ready. 250+ tok/s on 4B F16. Includes llama-chat script.
(Experimental) A high-throughput and memory-efficient inference and serving engine for LLMs optimized for GB10 homelabs
🔧 Fine-tune large language models efficiently on NVIDIA DGX Spark with LoRA adapters and optimized quantization for high performance.
Deploy Nemotron 3 Nano 30B on NVIDIA DGX Spark using TensorRT-LLM (Blackwell GB10, NVFP4 quantization, OpenAI-compatible API)
🚀 Accelerate image generation with ComfyUI's Docker for NVIDIA Blackwell GPUs, optimizing speed and memory usage through NVFP4 support.
Production LLM deployment specs for NVIDIA Blackwell GPUs (RTX Pro 6000, DGX Spark). Includes vLLM configurations, benchmarks, load balancer, and throughput calculators for NVFP4/FP8/MoE models.