Introduction
UNO is a general-purpose customization method from the ByteDance team that supports both single-subject and multi-subject conditional generation. Its core idea is to build a highly consistent data synthesis pipeline that exploits the in-context generation capabilities of diffusion transformers to produce high-quality, consistent multi-subject paired data.
UNO combines progressive cross-modal alignment with universal rotary position embedding, which lets it evolve from a text-to-image model into a model that generates images conditioned on multiple reference images.
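To make the position embedding idea more concrete, the sketch below shows one way position indices could be laid out when several reference images are fed alongside the target image. It assumes that each reference grid is shifted diagonally past the grids before it, so that no two images share a (row, column) index. The function name `assign_position_ids` and the exact offset rule are illustrative assumptions, not the UNO implementation.

```python
def assign_position_ids(target_hw, reference_hws):
    """Assign (row, col) position indices to a target grid and reference grids.

    Assumption for illustration: each reference grid is offset diagonally by
    the accumulated height and width of all grids placed before it.
    """
    target_h, target_w = target_hw
    target_ids = [(i, j) for i in range(target_h) for j in range(target_w)]

    # Start reference offsets just past the target grid.
    row_offset, col_offset = target_h, target_w
    reference_ids = []
    for h, w in reference_hws:
        grid = [(row_offset + i, col_offset + j) for i in range(h) for j in range(w)]
        reference_ids.append(grid)
        # Move the offset past this reference grid for the next one.
        row_offset += h
        col_offset += w
    return target_ids, reference_ids


# Hypothetical sizes: a 2x2 target grid and two reference grids (2x2 and 1x2).
target_ids, ref_ids = assign_position_ids((2, 2), [(2, 2), (1, 2)])
print(target_ids)
print(ref_ids)
```

With this layout every token, whether from the target or from a reference image, receives a distinct index pair, which is the property a rotary embedding needs in order to tell the images apart.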
https://huggingface.co/bytedance-research/UNO
https://github.com/bytedance/UNO
https://github.com/jax-explorer/ComfyUI-UNO