Introduction
This workflow combines the text-to-image generation of the FLUX model with the multimodal reasoning of the DeepSeek Janus-Pro-7B model. It can infer a text prompt from an existing image (reverse prompt inference), or caption images for data annotation when preparing a LoRA training set.
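For the data annotation use case, the sketch below shows one way to turn per-image descriptions into caption files outside the workflow. It is a minimal illustration, not part of the workflow itself. It assumes the common LoRA dataset layout in which each image has a sidecar .txt caption with the same base name. The names write_caption_files, describe_image, and trigger_word are hypothetical. describe_image stands in for whatever produces the description (for example, text exported from the Janus-Pro node) and no real model API is called here.

```python
from pathlib import Path
from typing import Callable, List

# Assumption: these are the image types present in the dataset folder.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_caption_files(
    image_dir: str,
    describe_image: Callable[[Path], str],
    trigger_word: str = "",
) -> List[Path]:
    """Write one sidecar .txt caption next to each image in image_dir.

    describe_image is a hypothetical callable that returns a description
    for a given image path; plug in your own captioning step here.
    """
    written = []
    for image_path in sorted(Path(image_dir).iterdir()):
        # Skip anything that is not a recognised image file.
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption = describe_image(image_path).strip()
        # Optionally prepend a trigger word for the LoRA being trained.
        if trigger_word:
            caption = f"{trigger_word}, {caption}"
        caption_path = image_path.with_suffix(".txt")
        caption_path.write_text(caption + "\n", encoding="utf-8")
        written.append(caption_path)
    return written
```

Passing the captioner in as a callable keeps the file handling independent of how the description is produced, so the same loop works with a stub during testing and with a real model later.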
DeepSeek model: Janus-Pro-7B
Janus-Pro is a new autoregressive model that combines multimodal understanding and generation in one framework. It improves on past methods by decoupling visual encoding into separate pathways for understanding and generation, while still using a single transformer for processing. This separation reduces conflicts between the two tasks and makes the model more flexible. Janus-Pro outperforms previous unified models and matches or even beats task-specific models. Its simplicity, flexibility, and strong performance make it a promising choice for the next generation of multimodal AI models.
https://huggingface.co/deepseek-ai/Janus-Pro-7B