Introduction
ComfyUI-nunchaku is a plugin developed by MIT HAN Lab that integrates the Nunchaku inference engine with ComfyUI and is designed to run 4-bit quantized diffusion models (such as FLUX.1) efficiently. Using SVDQuant quantization, it reduces memory usage by 3.6x compared to the BF16 model and speeds up inference by 8.7x over the 16-bit model on a 16GB 4090 GPU. The plugin provides nodes such as Nunchaku Flux DiT Loader and Nunchaku Text Encoder Loader, supports tasks such as text-to-image generation and inpainting (fill), and is compatible with multi-LoRA, ControlNet, and FP16, making it well suited to efficient deep-learning workflows on limited hardware.
https://github.com/mit-han-lab/ComfyUI-nunchaku
https://huggingface.co/mit-han-lab/svdq-int4-flux.1-dev
https://huggingface.co/mit-han-lab/svdq-int4-flux.1-fill-dev
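For reference, the quantized checkpoints linked above can also be loaded outside ComfyUI through the standalone nunchaku Python package. The snippet below is a minimal sketch assuming the upstream nunchaku API (NunchakuFluxTransformer2dModel) together with the diffusers FluxPipeline; exact class names and arguments may differ by version, and inside ComfyUI you simply use the loader nodes instead.

```python
# Minimal sketch (assumes the upstream `nunchaku` package and `diffusers`):
# load the SVDQuant INT4 FLUX.1-dev transformer and run a text-to-image pass.
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel

# 4-bit SVDQuant transformer from the Hugging Face repo linked above
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-int4-flux.1-dev")

pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipeline(
    "A cat holding a sign that says hello world",
    num_inference_steps=25,
    guidance_scale=3.5,
).images[0]
image.save("flux-dev-int4.png")
```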
Recommended machine: Large-PRO
TIP: This plugin supports the full FLUX ecosystem. This page provides three workflows to choose from: basic Text2Img, FLUX & ControlNet, and FLUX fill. The guide below is divided into three corresponding parts.
Part 1 : FLUX fill
Workflow Overview
How to use this workflow
Step 1 : Load Image & Draw the Mask
To add a mask for fill inpainting, right-click the uploaded image and select "Open in MaskEditor". Paint over the area you want to regenerate with the brush tool, then click Save to continue.
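If you prefer to prepare the mask outside ComfyUI, a small script can produce the same kind of black-and-white mask image. The sketch below uses Pillow; the rectangle coordinates are placeholders, and you should check which color your workflow treats as the masked region.

```python
# Illustrative sketch: build an inpainting mask with Pillow instead of drawing it
# in MaskEditor. White marks the region to repaint (verify your workflow's
# convention); the box coordinates below are placeholders.
from PIL import Image, ImageDraw

source = Image.open("input.png")
mask = Image.new("L", source.size, 0)           # start fully black (keep everything)
draw = ImageDraw.Draw(mask)
draw.rectangle((200, 150, 420, 380), fill=255)  # white rectangle = area to inpaint
mask.save("mask.png")
```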
Step 2 : Input the Prompt
There is no need to describe the whole image in detail; just enter the key information, such as the subject's features and colors.
Step 3 : Set canvas size
Keep the canvas as close to the original image size as possible; otherwise the image will be cropped.
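One way to avoid cropping is to derive the canvas size from the input image itself. The sketch below assumes the common constraint that width and height should be multiples of 16 for the latent space; adjust the multiple if your setup requires something different.

```python
# Sketch: pick a canvas size that matches the source image, rounded to a
# multiple of 16 (an assumed latent-size constraint; adjust if your setup differs).
from PIL import Image

def snap(value: int, multiple: int = 16) -> int:
    """Round a dimension to the nearest multiple, never below one multiple."""
    return max(multiple, round(value / multiple) * multiple)

img = Image.open("input.png")
width, height = (snap(d) for d in img.size)
print(f"Suggested canvas size: {width} x {height}")
```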
Step 4 : Get Image
Part 2 : FLUX ControlNet Union Pro
Workflow Overview
How to use this workflow
Step 1 : Load Image
Step 2 : Select a ControlNet preprocessor
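A ControlNet preprocessor converts the reference image into a control map (edges, depth, pose, and so on) that conditions generation. Outside ComfyUI, the same idea looks roughly like the sketch below, which assumes the controlnet_aux package and uses a Canny edge detector as an example; in ComfyUI you simply pick the matching preprocessor node.

```python
# Sketch: what a ControlNet preprocessor does, shown with controlnet_aux's Canny
# detector (package and threshold values here are assumptions for illustration).
from PIL import Image
from controlnet_aux import CannyDetector

image = Image.open("input.png")
canny = CannyDetector()
control_map = canny(image, low_threshold=100, high_threshold=200)  # edge map fed to ControlNet
control_map.save("control_canny.png")
```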
Step 3 : Set canvas size
Keep the canvas as close to the original image size as possible; otherwise the image will be cropped.
Step 4 : Get Image
Part 3 : FLUX Basic Text2Img
Workflow Overview
How to use this workflow
Step 1 : Input the Prompt
Step 2 : Get Image
Summary
After multiple rounds of testing, the results are as follows:
1. With steps = 25, it takes only 0.3s to generate a 512×512 image and only 3s to generate a 1024×1024 image.
2. cache_threshold is the acceleration parameter: the larger the value, the stronger the speedup. A value of 0.12 is recommended; if you notice reduced image quality, lower it. Setting it to 0 (no caching acceleration) is also possible.