ZIT-Ideogram

ComfyUI custom node that reuses the KJNodes Ideogram 4 visual box editor pattern for Z-Image-Turbo regional prompting.

This node supports Z-Image-Turbo (ZIT) regional prompting and image-to-image regional edits, but it cannot make Z-Image-Turbo behave exactly like Ideogram's native regional API. ZIT was not built for hard box-constrained regional generation, so final adherence still depends on the model, prompt, mask size, denoise, and workflow.

Node

Z-Image-Turbo Region Builder KJ

Outputs:

positive: the positive Z-Image conditioning. Connect this to the sampler positive input.
negative: the negative Z-Image conditioning. Connect this to the sampler negative input.
latent_with_noise_mask: latent output for the sampler. For text-to-image it is an empty latent. For image-to-image, if image and vae are connected, it is the encoded source image latent with combined_mask attached as noise_mask.
combined_mask: one mask made from all drawn boxes. White/bright areas are editable; black areas are meant to stay unchanged. Usually only needed if your workflow has a separate mask/inpaint input.
region_masks: one mask per drawn box, stacked with shape [N, H, W], where N is the number of boxes. Mostly for debugging, previewing, or advanced workflows that process each region separately.
source_image: the input image passed through. If no image is connected, this is a blank image. Optional; use it only if another node needs the original image.
preview: visual preview image with boxes and mask tint. Optional; useful with Preview Image.
regions_json: debug text showing parsed regions and the final composed positive prompt. Optional; useful for checking what the node actually sent to CLIP.
bboxes: bounding boxes in pixel coordinates. Optional; only useful for nodes that accept ComfyUI BBOX data.
width / height: passthrough image dimensions. Optional; useful if another node needs the same size.
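
As a rough illustration of how bboxes, region_masks, and combined_mask relate, the sketch below rasterizes pixel boxes into one mask per box and merges them with a per-pixel maximum. This is not the node's actual code: the (x, y, w, h) box layout, the helper names, and the maximum-based union are assumptions, and the real node works on tensors rather than nested lists.

```python
def box_to_mask(box, width, height):
    """Rasterize one (x, y, w, h) pixel box into an H x W mask of 0.0/1.0 values."""
    x, y, w, h = box
    return [
        [1.0 if x <= col < x + w and y <= row < y + h else 0.0
         for col in range(width)]
        for row in range(height)
    ]


def combine_masks(region_masks):
    """Merge N masks into one by taking the per-pixel maximum (assumed rule)."""
    height = len(region_masks[0])
    width = len(region_masks[0][0])
    return [
        [max(mask[row][col] for mask in region_masks) for col in range(width)]
        for row in range(height)
    ]


width, height = 8, 4
bboxes = [(0, 0, 3, 2), (5, 1, 3, 3)]  # hypothetical boxes in pixel coordinates

# One mask per box: N entries, each H rows by W columns.
region_masks = [box_to_mask(box, width, height) for box in bboxes]
combined_mask = combine_masks(region_masks)
```

Pixels covered by any box end up white (1.0) in combined_mask, and everything else stays black (0.0).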

Installation

Put this folder in ComfyUI/custom_nodes/ZIT-Ideogram.
Restart ComfyUI.
Add node: ZIT-Ideogram/Z-Image > Z-Image-Turbo Region Builder KJ.

No extra Python packages are required.

Usage

Use ComfyUI's built-in Z-Image-Turbo workflow as the base graph, then replace the normal text encoders with this node:

Connect Z-Image/Qwen CLIP to clip.
Connect positive and negative to the sampler.
For text-to-image, connect latent_with_noise_mask to the sampler latent input, or connect width, height, and batch_size to your own latent/image size nodes.
For image-to-image regional editing, connect the source image to image, connect the same VAE used by the workflow to vae, then connect latent_with_noise_mask to the sampler latent input.
Draw boxes in the editor and set each region prompt, optional region negative prompt, strength, and feather.

The node does not call any Ideogram API and does not emit Ideogram caption JSON. It produces native ComfyUI conditioning and masks.

Basic Workflows

Text-to-image

Minimum wiring:

clip input <- your Z-Image/Qwen CLIP.
positive output -> sampler positive.
negative output -> sampler negative.
latent_with_noise_mask output -> sampler latent input.

You can ignore region_masks, combined_mask, source_image, regions_json, bboxes, width, and height for a simple text-to-image workflow.

Image-to-image regional edit

Recommended wiring:

Load or provide an image.
Connect that image to this node's image input.
Connect the workflow VAE to this node's vae input.
Connect positive -> sampler positive.
Connect negative -> sampler negative.
Connect latent_with_noise_mask -> sampler latent input.
Set sampler denoise around 0.6 to 0.8 as a starting point.

With this wiring, the node encodes the source image and attaches the drawn boxes as the latent noise mask. The sampler should mainly change the masked regions. You normally do not need to connect combined_mask separately in this setup.

Use combined_mask separately only if your graph has a dedicated mask/inpaint input, for example an inpaint conditioning node, mask preview node, or a workflow that expects an external mask in addition to the latent.
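
For reference, the image-to-image wiring above might look roughly like this in ComfyUI's API-format workflow JSON, written here as a Python dict. Several things are assumptions: the class_type string for this node is hypothetical (check the node's real registered name), the output indices assume the order given under Outputs (positive 0, negative 1, latent_with_noise_mask 2), the file name is a placeholder, and nodes "1", "2", and "3" stand for the model, CLIP, and VAE loaders from the base workflow, which are not shown. The sampler values other than denoise are placeholders too.

```python
import json

# Hypothetical class_type; the real registered name may differ.
REGION_NODE_CLASS = "ZImageTurboRegionBuilderKJ"

workflow = {
    # Source image for the regional edit (placeholder file name).
    "10": {"class_type": "LoadImage", "inputs": {"image": "source.png"}},
    # Region builder: links are [source_node_id, output_index].
    "20": {
        "class_type": REGION_NODE_CLASS,
        "inputs": {
            "clip": ["2", 0],    # CLIP loader from the base workflow (not shown)
            "image": ["10", 0],  # source image
            "vae": ["3", 0],     # same VAE the workflow decodes with (not shown)
        },
    },
    "30": {
        "class_type": "KSampler",
        "inputs": {
            "model": ["1", 0],           # model loader (not shown)
            "positive": ["20", 0],       # assumed output index 0
            "negative": ["20", 1],       # assumed output index 1
            "latent_image": ["20", 2],   # latent_with_noise_mask, assumed index 2
            "seed": 0,                   # placeholder
            "steps": 8,                  # placeholder
            "cfg": 1.0,                  # placeholder
            "sampler_name": "euler",
            "scheduler": "simple",
            "denoise": 0.7,              # inside the suggested 0.6 to 0.8 range
        },
    },
}

workflow_json = json.dumps(workflow, indent=2)
```

The dict is incomplete on purpose: it only shows the three links this section describes plus the denoise setting, so it would need the missing loader and decode nodes before ComfyUI could run it.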

Z-Image-Turbo Notes

Z-Image-Turbo does not natively consume Ideogram-style bounding-box caption JSON. The editor boxes are converted into ComfyUI masks and prompt text.

conditioning_mode:

single_prompt_fast: recommended default. Region prompts are folded into one Z-Image prompt and masks are output separately. This keeps sampling speed close to a normal Z-Image workflow.
regional_conditioning_slow: experimental for Z-Image-Turbo. It emits one masked conditioning per region, can multiply sampler work by the number of regions, and often does not improve adherence because ZIT does not work like Ideogram's API-level regional editor.

For text-to-image, region text is treated as additional positive prompt text with a rough area hint. Z-Image-Turbo may still ignore the rectangle or move the concept because it does not natively support hard regional prompt boxes.
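
To make the idea of "prompt text with a rough area hint" concrete, here is a minimal sketch of folding region prompts into a single prompt. It is only an illustration: the function names, the thirds-based position words, and the exact phrasing are assumptions, and the wording the node really sends to CLIP is what regions_json reports.

```python
def area_hint(box, width, height):
    """Describe where the center of an (x, y, w, h) box sits, using thirds."""
    x, y, w, h = box
    cx = (x + w / 2) / width
    cy = (y + h / 2) / height
    horizontal = "left" if cx < 1 / 3 else "right" if cx > 2 / 3 else "center"
    vertical = "top" if cy < 1 / 3 else "bottom" if cy > 2 / 3 else "middle"
    return f"{vertical} {horizontal}"


def compose_prompt(base_prompt, regions, width, height):
    """Append each region prompt, with its area hint, to the base prompt."""
    parts = [base_prompt.strip()]
    for region in regions:
        hint = area_hint(region["box"], width, height)
        parts.append(f"{region['prompt'].strip()} in the {hint} of the image")
    return ", ".join(part for part in parts if part)


# Hypothetical regions on a 1024 x 1024 canvas.
regions = [
    {"box": (0, 0, 256, 256), "prompt": "a red lantern"},
    {"box": (640, 640, 384, 384), "prompt": "a sleeping cat"},
]
prompt = compose_prompt("a quiet street at dusk", regions, 1024, 1024)
```

Because the hint is just text, the model is free to interpret or ignore it, which is why box adherence in text-to-image is approximate.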

For image-to-image edits, connect image and vae, then use latent_with_noise_mask as the sampler latent. The node's batch_size repeats a single source latent/mask when you want multiple variations. In testing, KSampler denoise around 0.6 to 0.8 is usually the practical range for visible regional changes while preserving the rest of the image. For changes like clothing replacement, draw the region larger than the exact clothing item so the model has enough context to rebuild edges, folds, and nearby transitions. Preservation still depends on Z-Image-Turbo, denoise, mask feather, region size, and the workflow; it is not equivalent to Ideogram 4's API-level regional editor.

default_feather controls mask edge softness in pixels. 0 is a hard rectangle edge, 8-24 is a normal soft edge, and 32+ creates a very broad transition.
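
The effect of the feather value can be pictured with a small sketch that builds a box mask whose edge ramps up over feather pixels. This assumes a linear ramp measured inward from the box edge; the node may use a different falloff (for example a blur) or feather outward, so treat the helper and its behavior as illustrative only.

```python
def feathered_box_mask(box, width, height, feather):
    """Build an H x W mask for an (x, y, w, h) box with a linear inner feather."""
    x, y, w, h = box
    mask = []
    for row in range(height):
        line = []
        for col in range(width):
            inside = x <= col < x + w and y <= row < y + h
            if not inside:
                line.append(0.0)
            elif feather <= 0:
                line.append(1.0)  # feather 0: hard rectangle edge
            else:
                # Pixels to the nearest box edge, counting the border pixel as 1.
                dist = min(col - x, x + w - 1 - col, row - y, y + h - 1 - row) + 1
                line.append(min(1.0, dist / feather))
        mask.append(line)
    return mask


hard = feathered_box_mask((2, 2, 8, 8), 12, 12, feather=0)
soft = feathered_box_mask((2, 2, 8, 8), 12, 12, feather=4)
```

With feather=0 every pixel inside the box is fully editable; with feather=4 the border pixels start at 0.25 and reach 1.0 four pixels in, which is the kind of soft transition that helps edits blend into the preserved area.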

default_region_strength controls mask opacity/weight for generated region masks and latent_with_noise_mask. In single_prompt_fast it does not make the text prompt stronger. For img2img edits, keep it near 1.0 unless you intentionally want a weaker/noisier mask edge effect; use sampler denoise for edit intensity.
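
One way to picture strength as mask opacity is a plain multiplication of the mask values. The helper below is a hypothetical sketch of that reading, assuming strength is clamped to the 0.0 to 1.0 range; it is not taken from the node's source.

```python
def apply_strength(mask, strength):
    """Scale every mask value by a strength clamped to [0.0, 1.0]."""
    s = max(0.0, min(1.0, strength))
    return [[value * s for value in row] for row in mask]


mask = [[1.0, 0.5], [0.0, 1.0]]
full = apply_strength(mask, 1.0)  # unchanged: regions stay fully editable
half = apply_strength(mask, 0.5)  # regions become only partly editable
```

Under this reading, lowering strength weakens how much the sampler is allowed to change a region rather than changing what the prompt asks for, which matches the advice to control edit intensity with sampler denoise instead.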
