Introduction
Cosmos is an advanced open-source model that has made a significant impact in the realm of AIGC, especially in generating high-quality video from text prompts (text-to-video) and from existing images (image-to-video). These capabilities support both creative and realistic video generation.
Workflow Overview
Text2video workflow
Image2video workflow
Installation of Nodes and Models
Nodes:
- ComfyUI is a popular framework to use with Cosmos, so make sure your installation is up to date. Some Cosmos-specific nodes might be available as custom extensions. These can usually be installed by following the official ComfyUI documentation on adding custom nodes, which often involves downloading the relevant node files and placing them in the appropriate ComfyUI directories.
Models:
- Text Encoder and VAE
These can be sourced from dedicated repositories. For example, the text encoder oldt5_xxl_fp8_e4m3fn_scaled.safetensors and the VAE cosmos_cv8x8x8_1.0.safetensors are available at https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/tree/main. Place the text encoder in the ComfyUI/models/text_encoders directory and the VAE in the ComfyUI/models/vae directory.
- Diffusion Models
The diffusion models, which are pivotal for the video generation process, can be found in safetensors format at https://huggingface.co/mcmonkey/cosmos-1.0/tree/main. They should be placed in the ComfyUI/models/diffusion_models folder. If you prefer the original .pt format, the official links can be found on the Hugging Face repositories, such as those related to text-to-video models like https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Text2World.
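
Once the files are downloaded, it can help to confirm they landed in the folders listed above before opening a workflow. The following is a minimal sketch, not part of ComfyUI itself: the helper names find_missing and has_diffusion_model are hypothetical, and the root path "ComfyUI" is an assumption you should adjust to your own install location. Because no specific diffusion model filename is given here, the check only looks for any .safetensors file in the diffusion_models folder.

```python
from pathlib import Path

# Expected locations, taken from the placement instructions above.
EXPECTED_FILES = {
    "text_encoders": ["oldt5_xxl_fp8_e4m3fn_scaled.safetensors"],
    "vae": ["cosmos_cv8x8x8_1.0.safetensors"],
}


def find_missing(comfy_root):
    """Return the expected model paths that do not exist under comfy_root/models."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for subdir, names in EXPECTED_FILES.items():
        for name in names:
            path = models_dir / subdir / name
            if not path.is_file():
                missing.append(path)
    return missing


def has_diffusion_model(comfy_root):
    """Return True if at least one .safetensors file sits in models/diffusion_models."""
    folder = Path(comfy_root) / "models" / "diffusion_models"
    return folder.is_dir() and any(folder.glob("*.safetensors"))


if __name__ == "__main__":
    root = "ComfyUI"  # assumption: adjust to your install path
    for path in find_missing(root):
        print("missing:", path)
    if not has_diffusion_model(root):
        print("no .safetensors file found in models/diffusion_models")
```

If the script prints nothing, the text encoder, the VAE, and at least one diffusion model are where the loader nodes are expected to look for them.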