Introduction
This workflow is built around JoyCaption2 and uses it for reverse prompting, the image-to-text step. Given an input picture, JoyCaption2 infers a matching prompt from the image's features, colors, composition and other factors, which then serves as text guidance for later generation. The workflow then hands that prompt to FLUX. FLUX handles natural-language prompts well, so it takes the prompt generated by JoyCaption2 and uses it to generate a second image. For convenience, the workflow also integrates all the functions of the JoyCaption2 ecosystem and presents them as separate groups, so users can enable only the ones they need, whether for creative inspiration, prompt refinement or image conversion.
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.
Key Features:
- Free and Open: It will be released for free, open weights, no restrictions, and just like bigASP, will come with training scripts and lots of juicy details on how it gets built.
- Uncensored: Equal coverage of SFW and NSFW concepts. No "cylindrical shaped object with a white substance coming out on it" here.
- Diversity: All are welcome here. Do you like digital art? Photoreal? Anime? Furry? JoyCaption is for everyone. Pains are being taken to ensure broad coverage of image styles, content, ethnicity, gender, orientation, etc.
- Minimal Filtering: JoyCaption is trained on large swathes of images so that it can understand almost all aspects of our world. almost. Illegal content will never be tolerated in JoyCaption's training.
https://github.com/fpgaminer/joycaption
https://github.com/EvilBT/ComfyUI_SLK_joy_caption_two/blob/main/readme_us.md
Workflow Overview
How to use this workflow
Step 1: JoyCaption 2 & FLUX combined application
The first group is a simple combination of single-image captioning and FLUX text-to-image generation.
1. Load Image
2. Choose LoRA: This workflow has three built-in LoRA styles for reference
CK: Flat cartoon-style sexy LoRA; can generate NSFW content
Erotic: 2.5D portrait sexy LoRA; generates characters with consistent facial features and supports NSFW content
Anime: Two-dimensional (anime) style LoRA, used to enhance the picture
3. Get Image
Step 2: JoyCaption 2 Single image processing
1. Load Image
2. Get Prompt
Step 3: JoyCaption 2 Batch processing
Enter the path of the folder to read images from and the path of the output folder. The workflow does not display the generated tags directly; open the output folder you set to review the txt files. For the detailed procedure, refer to the picture.
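Because the batch results only exist as txt files in the output folder, a small script can gather them into one view. The sketch below is a minimal example and not part of the workflow itself. It assumes each caption is saved as a UTF-8 `.txt` file directly inside the output folder; the function names `collect_captions` and `format_summary` are hypothetical helpers introduced here for illustration.

```python
from pathlib import Path


def collect_captions(output_dir):
    """Return {file stem: caption text} for every .txt file in output_dir.

    Assumption: captions are plain UTF-8 text files placed directly in the
    output folder (not in subfolders).
    """
    captions = {}
    for txt_path in sorted(Path(output_dir).glob("*.txt")):
        captions[txt_path.stem] = txt_path.read_text(encoding="utf-8").strip()
    return captions


def format_summary(captions):
    """Render all captions as one block of text, one entry per file."""
    return "\n\n".join(f"[{name}]\n{text}" for name, text in captions.items())
```

To use it, call `print(format_summary(collect_captions(path)))`, where `path` is the output address you entered in the batch group.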
Step 4: JoyCaption 2 Advanced usage
This group only adds a Prompt correction function, which makes the Prompt more focused. See the picture for details.