Introduction
This workflow is designed to help artists, designers, and content creators effortlessly transform realistic character images into 3D cartoon-style representations. Whether you're focused on character design, animation, or simply exploring creative possibilities, this workflow is the perfect tool to bring your ideas to life.
We've already configured the parameters, and you don’t need to write any prompts—simply upload your image and receive a 3D cartoon-style output.
Florence2
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. It interprets simple text prompts to perform tasks such as captioning, object detection, and segmentation. It was trained on the FLD-5B dataset, which contains 5.4 billion annotations across 126 million images, to learn many tasks at once. Its sequence-to-sequence architecture lets it perform well in both zero-shot and fine-tuned settings, making it a competitive vision foundation model.
Read more: https://github.com/kijai/ComfyUI-Florence2
Download: https://github.com/kijai/ComfyUI-Florence2.git
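As a rough illustration of the prompt-based approach described above, Florence-2 selects a task through a special token placed at the start of the prompt. The tokens below are the ones listed on the public Florence-2 model card; the dictionary keys and the build_task_prompt helper are hypothetical names made up for this sketch, and the ComfyUI node may wrap this step differently.

```python
# Task tokens as listed on the Florence-2 model card.
# The dictionary keys are hypothetical labels chosen for this sketch.
TASK_TOKENS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "more_detailed_caption": "<MORE_DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "referring_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
}


def build_task_prompt(task, text_input=""):
    """Return the text prompt for a task: the task token plus optional extra text."""
    token = TASK_TOKENS[task]  # raises KeyError for an unknown task label
    return token + text_input


# Captioning needs only the token; referring segmentation also needs a phrase.
caption_prompt = build_task_prompt("more_detailed_caption")
segment_prompt = build_task_prompt("referring_segmentation", "the character's face")
```

In this workflow a captioning task is the relevant one, since the generated caption serves as the image prompt.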
Workflow overview
How to use this workflow?
Step 1: Upload Image and Generate Prompt
First, upload an image. Florence2 then automatically generates a prompt that matches the content of the image.
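If you prefer to drive this step from a script instead of the UI, one possible approach is to export the workflow in ComfyUI's API format and point its LoadImage node at your file before queueing it. The sketch below assumes a local server at the default address and an image already present in ComfyUI's input folder; the toy two-node graph, the file names, and the helper names are placeholders, not the real workflow.

```python
import copy
import json
import urllib.request


def set_input_image(workflow, image_name):
    """Return a copy of an API-format workflow whose LoadImage nodes use image_name."""
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    for node in patched.values():
        if node.get("class_type") == "LoadImage":
            node["inputs"]["image"] = image_name
    return patched


def build_queue_request(workflow, server="http://127.0.0.1:8188"):
    """Build, but do not send, the POST request that queues a workflow."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Toy graph for illustration only; the real workflow has many more nodes.
toy_workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "placeholder.png"}},
    "2": {"class_type": "PreviewImage", "inputs": {"images": ["1", 0]}},
}

patched = set_input_image(toy_workflow, "portrait.png")
request = build_queue_request(patched)
# urllib.request.urlopen(request) would submit it to a running ComfyUI server.
```

The request is only constructed here, so the snippet runs without a server; sending it is left as the commented-out last step.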
Step 2: Enhance Character Details with ControlNet
Three ControlNet preprocessors guide the generation. The depth preprocessor provides the overall structure and details; OpenPose keeps the character's posture and movements accurate; Canny edge detection supplies precise contours and edge information, helping to reproduce the character's appearance faithfully.
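To give a feel for the kind of edge information a Canny-style preprocessor passes to ControlNet, here is a deliberately simplified sketch in plain Python. It only thresholds the brightness gradient; real Canny also smooths the image, thins edges, and applies hysteresis, and the workflow's actual preprocessor node is not reproduced here. The function name and the threshold value are illustrative assumptions.

```python
def edge_map(gray, threshold=50):
    """Mark interior pixels whose local brightness change exceeds threshold.

    gray is a list of rows of integer brightness values (0-255).
    Border pixels are left at 0 because they lack a full neighbourhood.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # horizontal central difference
            gy = gray[y + 1][x] - gray[y - 1][x]  # vertical central difference
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 255
    return edges


# Toy 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 0, 200, 200] for _ in range(5)]
edges = edge_map(image)
```

On this toy image the marked pixels form a vertical band where dark meets bright, which is the kind of contour hint the generation step is conditioned on.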
Step 3: Transform into 3D Cartoon Style
This workflow uses the 3D cartoon_XL_V1.0.safetensors model, which renders the character in a 3D cartoon style in the final generated image.