Rengu Flow

Preliminary release (v0.4.x) — Rengu Flow is under active development. APIs, config keys, CLI commands, and documentation may change in breaking or non-breaking ways between releases. Pin versions and re-read the docs when upgrading.

A TOML-driven training framework for diffusion models. You describe a run in a config file — model, adapter, optimizer, LR scheduler, dataset, and training options — and Rengu Flow launches it with DeepSpeed. Everything is modular and registry-based: models, adapters, optimizers, and schedulers are selected and configured entirely from TOML, with an optional local web UI on top.

Why Rengu Flow

Config-first — One main TOML points to a dataset TOML and sets model, adapter, optimizer, scheduler, and training options. No code changes to start a run.
Modular & registry-based — Models, adapters, optimizers, and schedulers are pluggable; extend the framework by registering new ones (see architecture).
VRAM-aware — Block swap (CPU offload of transformer blocks), activation checkpointing and offload, optional quantization, gradient release, and OOM-skip let large models train on a single consumer GPU.
Self-managing dependencies — The rengu CLI drives uv: it creates the venv, installs Python, and pulls only the optional extras a given config needs (e.g. Cosmos deps when [model] type = "cosmos_predict2").
External control — Drop signal files in the run directory (save, save_quit, export_model, preview_now, …) to checkpoint, export, preview, or exit cleanly — no API or restart required.
Local web UI — Optional control panel to build configs and datasets, launch and queue runs, send signals, and watch live progress and previews.

Features

Adapters & full finetune — LoRA, LoKr (vendored, ComfyUI/Forge-compatible saves), the full LyCORIS algorithm family, and adapter-free full-model finetuning.
Datasets — Directory datasets with aspect-ratio buckets, multi-resolution schedules, a disk cache (v2) for latents and text embeddings, tag dropout / caption variants, and opt-in augmentation presets.
Dataset Studio (rengu prep) — Auto-tagging, captioning, watermark cleanup, and a bulk tag editor for preparing training data (guide).
Training loop — Periodic evaluation on held-out datasets, image previews during training, min-SNR / debiased loss weighting, EMA, and torch.compile.
Checkpointing & export — Resume checkpoints vs. inference exports, retention limits, scheduled saves, and optional async export (guide).
Experiment tracking — A single sink fans out to a local manifest, TensorBoard, and (opt-in) Weights & Biases.
Optimizers & schedulers — Built-in names, fully-qualified import paths, and vendored/extended optimizers (e.g. Prodigy, Automagic, K-Optimizers) selected from TOML.
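
As a rough illustration of the aspect-ratio buckets mentioned under Datasets, the sketch below assigns each image to the bucket with the closest aspect ratio. It is a generic version of the technique and is not taken from Rengu Flow's code; the `BUCKETS` list and both function names are hypothetical, and a real run would get its bucket resolutions from the dataset TOML.

```python
import math

# Hypothetical bucket resolutions as (width, height). These values are only
# for illustration and are not read from any Rengu Flow config.
BUCKETS = [(1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216)]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest to the image's.

    Ratios are compared in log space so that 2:1 and 1:2 are equally far
    from a square bucket.
    """
    target = math.log(width / height)
    return min(buckets, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


def group_by_bucket(sizes, buckets=BUCKETS):
    """Group image indices by bucket so every batch can share one shape."""
    groups = {}
    for index, (width, height) in enumerate(sizes):
        bucket = nearest_bucket(width, height, buckets)
        groups.setdefault(bucket, []).append(index)
    return groups
```

Grouping by bucket before batching is what lets mixed landscape and portrait images train together without cropping everything to a square.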

Supported models and adapters

| Model | Type | Adapters | Notes |
| --- | --- | --- | --- |
| Stable Diffusion XL | `sdxl` | LoRA, LoKr, LyCORIS, full finetune | Optional UNet-only via `freeze_text_encoders`. |
| Cosmos Predict2 / Anima | `cosmos_predict2` (alias `anima`) | LoRA, LoKr, LyCORIS, full finetune | DiT + Wan VAE + Qwen3/T5. Anima checkpoints are this architecture; `type = "anima"` is accepted as a legacy alias. Needs the `cosmos` extra. |

Adapter selection. Set [adapter] type = "lora" / "lokr" / a lycoris_* type; omit [adapter] entirely for full-model finetune. The LyCORIS family (requires the lycoris extra) covers: lycoris_locon, lycoris_loha, lycoris_lokr, lycoris_dora, lycoris_dylora, lycoris_glora, lycoris_diag_oft, lycoris_boft. See SDXL training, Cosmos Predict2 / Anima, and full-model finetuning.

Requirements

You install these (system level):

OS — Linux, or Windows via WSL2. Native Windows is not supported for training; see the WSL workflow.
NVIDIA GPU + driver — a CUDA-capable GPU with a driver recent enough for CUDA 13.x (check the "CUDA Version" reported by nvidia-smi).
CUDA Toolkit 13.x — provides nvcc, which DeepSpeed uses to JIT-compile its C++/CUDA ops. Its major version must match the PyTorch build (CUDA 13); without it, DeepSpeed's compiled ops fail to build. Point CUDA_HOME at the toolkit if it is not auto-detected.
uv — required on PATH for ./rengu and ./start-ui.sh. uv creates .venv and installs a compatible Python (3.10–3.13) automatically; no separate system python3 needed.
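
Before running `./rengu init`, it can help to confirm that the system-level tools listed above are discoverable. The sketch below is a hypothetical standalone check and is not part of the `rengu` CLI. It only looks at `PATH` and the `CUDA_HOME` environment variable, so a toolkit installed outside `PATH` shows up as missing `nvcc` even when `CUDA_HOME` points at it.

```python
import os
import shutil


def preflight(which=shutil.which, environ=os.environ):
    """Report whether the system-level prerequisites can be found.

    `which` and `environ` are injectable so the check can be exercised
    without touching the real machine state.
    """
    return {
        "uv": which("uv") is not None,
        "nvcc": which("nvcc") is not None,
        "nvidia-smi": which("nvidia-smi") is not None,
        "CUDA_HOME": environ.get("CUDA_HOME"),
    }


if __name__ == "__main__":
    for name, value in preflight().items():
        print(f"{name}: {value}")
```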

Installed automatically (by rengu init / uv sync):

PyTorch 2.12 + CUDA 13.0 (torch==2.12.0+cu130), torchvision 0.27, and DeepSpeed 0.19, from the PyTorch cu130 index.
The CUDA 13 runtime stack — cuDNN, cuBLAS, NCCL, cuFFT, cuRAND, … — ships inside those PyTorch wheels (as nvidia-*-cu13 packages). You do not install cuDNN or the runtime libraries separately.
All remaining Python deps (diffusers, PEFT, safetensors, …). Exact versions are pinned in pyproject.toml and uv.lock.

Tested stack (May 2026, WSL2 + NVIDIA): Python 3.13, torch 2.12.0+cu130, torchvision 0.27.0+cu130, deepspeed 0.19.0 — verified end-to-end on an 8 GB RTX 3000 Ada (SDXL + Cosmos LoRA and SDXL full-finetune smokes).
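
The requirement that the CUDA Toolkit's major version match the PyTorch build can be checked from the version string alone. The helper below is a small sketch with hypothetical function names. It assumes the `+cuNNN` suffix seen in `torch==2.12.0+cu130`, and the caller supplies the toolkit release number (the "release" value printed by `nvcc --version`).

```python
def cuda_tag(torch_version):
    """Return the CUDA version encoded in a PyTorch version string.

    "2.12.0+cu130" gives "13.0". Builds without a cuNNN suffix give None.
    """
    _, _, local = torch_version.partition("+")
    digits = local[2:]
    if not local.startswith("cu") or not digits.isdigit() or len(digits) < 2:
        return None
    # The last digit is the minor version, the rest is the major version.
    return f"{digits[:-1]}.{digits[-1]}"


def toolkit_matches(torch_version, nvcc_release):
    """True when the toolkit's major version equals the PyTorch build's."""
    tag = cuda_tag(torch_version)
    if tag is None:
        return False
    return tag.split(".")[0] == nvcc_release.split(".")[0]
```

For example, `toolkit_matches("2.12.0+cu130", "12.8")` is false, which is the situation in which DeepSpeed's compiled ops would be expected to fail to build.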

Installation

From the repository root (Linux):

```
./rengu init          # create rengu.local.toml + uv sync (base training stack)
./rengu init ui       # also install the web UI extra
```

The ./rengu wrapper runs uv sync on first use, so the venv is built automatically. Install optional extras by listing profiles:

```
./rengu init cosmos lycoris   # Cosmos Predict2 + LyCORIS adapters
./rengu init all              # every documented extra
```

| Profile | Installs |
| --- | --- |
| `base` | Core training stack (default) |
| `ui` | Local web control panel |
| `cosmos` / `cosmos_predict2` | Cosmos Predict2 / Anima |
| `lycoris` | LyCORIS adapter family (incl. LoKr backend) |
| `optim` | Extended optimizers |
| `kaon` | K-Optimizers (git-pinned: Adakaon, AdaMuon, KProdigy, …) |
| `prep` | Dataset Studio (taggers, captioners, watermark cleanup) |
| `dev` | Test/dev tools |
| `all` | All of the above |

./rengu init --only-config writes rengu.local.toml and directories without syncing. Advanced users can run uv sync themselves and call .venv/bin/rengu directly. Before any train/validate/cache, Rengu Flow inspects the config and auto-installs any missing extras it needs.

Updating

```
./rengu update          # fast-forward pull, re-sync from uv.lock, rebuild UI if present
```

rengu update pulls the latest project code, re-syncs dependencies from the lockfile, and recompiles the web UI if it was built locally. It also refreshes the optional profiles you already installed (so git-pinned extras like kaon move to their new commit pin); profiles you never installed are left alone. Useful flags: --all-extras (every documented extra), --no-pull (skip the git pull), and --force (discard local tracked code changes and hard-reset when a fast-forward is blocked — never touches untracked/ignored files, so your UI data dir and jobs.db are safe). Check your version with ./rengu version.

Quick start

Set up the environment:
```
./rengu init
```
Local settings (optional). rengu init creates rengu.local.toml (gitignored) for machine settings — UI host/port, default GPU count, master port, and subprocess env vars. Model checkpoint paths go in the training TOML, not here. See rengu.local.toml.example.
Create a training config from an example:
```
cp examples/minimal_config_lora_sdxl.toml my_train.toml
```
Edit my_train.toml: set the dataset path and [model] paths (e.g. checkpoint_path for SDXL).
Train:
```
./rengu train --config my_train.toml
```
- Validate without training: ./rengu validate --config my_train.toml
- Build the dataset cache only: ./rengu cache --config my_train.toml
- Resume from the latest checkpoint: ./rengu train --config my_train.toml --resume-from-checkpoint
- Run DeepSpeed directly: deepspeed --num_gpus=1 -m rengu_flow.main --config my_train.toml

See the CLI guide for every command, flag, and rengu.local.toml key.

Command-line interface

| Command | Description |
| --- | --- |
| `rengu init [profiles…]` | Create `rengu.local.toml` + UI data dir, `uv sync` the chosen profiles |
| `rengu update [profiles…]` | Pull, re-sync from `uv.lock`, refresh installed extras, rebuild UI |
| `rengu version` | Print Rengu Flow version, git commit, and installed kaon version |
| `rengu train --config PATH` | Launch a DeepSpeed training run (`--num-gpus`, `--master-port`, `--resume-from-checkpoint`) |
| `rengu validate --config PATH` | Validate a training config and exit |
| `rengu cache --config PATH` | Build the dataset cache and exit |
| `rengu dump-dataset PATH` | Inspect a dataset TOML |
| `rengu prep <tag\|caption\|clean\|models>` | Dataset Studio: tagging, captioning, watermark cleanup, model list/download |
| `rengu ui [start\|serve\|dev\|build\|reset-db]` | Run or build the local web UI |

Trailing args after -- are forwarded to the trainer (e.g. ./rengu train --config my.toml -- --regenerate_cache). Full flag reference: CLI guide.
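
When launching runs from a script, the `train` flags and the `--` forwarding rule can be assembled programmatically. The helper below is a hypothetical sketch and not part of Rengu Flow. It assumes `--num-gpus` and `--master-port` each take one value as a separate argument, and it treats `--resume-from-checkpoint` as a bare flag, as in the Quick start example.

```python
def build_train_command(config, num_gpus=None, master_port=None,
                        resume=False, trainer_args=()):
    """Build an argv list for `./rengu train`.

    Anything in `trainer_args` is placed after `--` so that it is forwarded
    to the trainer rather than parsed by the wrapper.
    """
    command = ["./rengu", "train", "--config", str(config)]
    if num_gpus is not None:
        command += ["--num-gpus", str(num_gpus)]
    if master_port is not None:
        command += ["--master-port", str(master_port)]
    if resume:
        command.append("--resume-from-checkpoint")
    if trainer_args:
        command += ["--", *trainer_args]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root.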

Web UI

```
./rengu ui start
```

Builds the frontend if needed, serves the API, and opens a browser. The UI lets you edit training configs and datasets, launch and queue runs, send signal files, and watch live progress and previews. Live progress is parsed from the job's stdout — no extra config required. See the Web UI user guide.

Controlling a run

Training reacts to signal files dropped in the run directory (also exposed as buttons in the UI):

| Signal | Effect |
| --- | --- |
| `save` / `save_quit` | Checkpoint now (and exit). |
| `export_model` / `export_model_quit` | Export inference weights now (and exit). |
| `preview_now` | Render the configured preview prompts on the next step. |
| `continue` / `quit` | Resume after an export-wait pause / exit without saving. |

Details and the full list: signal files.
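
Dropping a signal from a script could look like the sketch below. It assumes a signal is an empty file in the run directory, named exactly after the signal with no extension; this README does not spell out the file format, so check the signal files guide before relying on it. `send_signal` and the `SIGNALS` set are hypothetical names, and the set covers only the signals in the table above.

```python
from pathlib import Path

# Signal names taken from the table above. The full list lives in the
# signal files guide, so this set may be incomplete.
SIGNALS = {
    "save", "save_quit", "export_model", "export_model_quit",
    "preview_now", "continue", "quit",
}


def send_signal(run_dir, signal):
    """Create an empty signal file in the run directory and return its path.

    Assumption: an empty file with the signal's name is enough.
    """
    if signal not in SIGNALS:
        raise ValueError(f"unknown signal: {signal}")
    run_path = Path(run_dir)
    if not run_path.is_dir():
        raise FileNotFoundError(f"run directory not found: {run_path}")
    signal_path = run_path / signal
    signal_path.touch()
    return signal_path
```

Validating the name first avoids leaving stray files in the run directory when a signal is mistyped.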

Documentation

User guide — Training each model, dataset config and prep, optimizers, previews, checkpoints, signal files, the CLI, and the web UI. Start at the docs index.
Developer guide — Architecture, the model pipeline contract, adding optimizers/schedulers, networks/adapters, VRAM optimization, and testing.
Implementation backlog — Planned / deferred work.

Status and known limitations

Rengu Flow is a preliminary release under active development; treat config keys and CLI flags as subject to change between versions. A few specific notes:

Linux / WSL2 only. Native Windows is not supported. On WSL2, do not set PYTORCH_CUDA_ALLOC_CONF = "expandable_segments:True" — Rengu Flow detects WSL and applies safe defaults automatically (see the CLI guide).
Built-in models are SDXL and Cosmos Predict2 / Anima. Other architectures (e.g. Flux) are not yet registered — see backlog.
Cosmos load_and_fuse_adapter is intentionally unsupported; load adapter weights instead.
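
For scripts that set environment variables before launching a run, the WSL2 note above can be turned into a guard. The sketch below uses a common detection heuristic (the kernel version string in `/proc/version` contains "microsoft", or `WSL_DISTRO_NAME` is set). It is an assumption that this resembles Rengu Flow's own detection, and both function names are hypothetical.

```python
import os


def is_wsl(proc_version=None, environ=os.environ):
    """Heuristically detect WSL from the kernel string or environment."""
    if proc_version is None:
        try:
            with open("/proc/version", encoding="utf-8") as handle:
                proc_version = handle.read()
        except OSError:
            proc_version = ""  # not Linux, or /proc is unavailable
    return "microsoft" in proc_version.lower() or "WSL_DISTRO_NAME" in environ


def has_unsafe_alloc_conf(environ=os.environ, wsl=None):
    """True when running under WSL with expandable_segments enabled."""
    if wsl is None:
        wsl = is_wsl(environ=environ)
    value = environ.get("PYTORCH_CUDA_ALLOC_CONF", "")
    return wsl and "expandable_segments:true" in value.lower()
```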

Third-party components

Rengu Flow incorporates and adapts work from several projects. See THIRD_PARTY_NOTICES.md for full notices and licenses:

diffusion-pipe (GPL-3.0) — design and portions of the training flow; vendored optimizers under rengu_flow/vendor/diffusion_pipe_optimizers/.
NVIDIA Cosmos Predict2 (Apache-2.0) — DiT and LLM-adapter modeling code.
Alibaba Wan VAE — used by the Cosmos pipeline.
AI Toolkit / Ostris (MIT) — the Automagic optimizer.

License

Rengu Flow is distributed under the GNU General Public License v3.0 or later (LICENSE). See THIRD_PARTY_NOTICES.md for incorporated components and their licenses.
