Flux Mini 3B

v1.0 - diffusion_models
azimuthalobserver
about 1 year ago

I'm not the original author of this model; it was created by TencentARC and all credit goes to them.

Important Notes:

  • On my 8 GB VRAM GPU, generation times ranged from 12 to 120 seconds for images between 512x512 and 1024x1024, depending on the settings used

  • I highly recommend switching from KSampler to SharkSampler, as it delivers far better results for this model. Original files: https://huggingface.co/TencentARC/flux-mini/tree/main

    diffusion_models version

    • this is the original format as provided by the authors

    • This is a Flux Transformers-format model, which means the Load Checkpoint node in ComfyUI will not work with this version

    • The file must be placed in the diffusion_models directory for it to be found by `Load Diffusion Model` node.

    • To use it, load the model with the Load Diffusion Model node; the rest of the traditional Flux setup then works as usual.

    checkpoints version

    • this is the format that has been converted by me to work with Load Checkpoint node

    • This is still in Flux Transformers format, but the state-dict keys have been prefixed so that Load Checkpoint recognizes them

    • This file must be placed in the checkpoints directory for it to be found by the Load Checkpoint node.

    • To use it, load the model with the Load Checkpoint node. The CLIP and VAE are not baked in, so you must still source those elsewhere

    • Use DualClipLoader and Load VAE to get those into your workflow as usual
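    The key-prefixing conversion described above can be sketched roughly as follows. This is a hypothetical illustration, not the uploader's actual conversion script: it assumes the common ComfyUI convention of prefixing UNet keys with `model.diffusion_model.`, and the file names are placeholders.

    ```python
    # Hypothetical sketch: convert the diffusion_models-format file into a
    # checkpoints-compatible file by prefixing every state-dict key.
    # Assumes ComfyUI's "model.diffusion_model." prefix convention;
    # file paths are placeholders, not the actual release files.
    from safetensors.torch import load_file, save_file

    def prefix_for_checkpoint(src_path, dst_path,
                              prefix="model.diffusion_model."):
        state = load_file(src_path)                       # original Flux keys
        renamed = {prefix + k: v for k, v in state.items()}
        save_file(renamed, dst_path)                      # Load Checkpoint-ready
    ```

    After running something like this, the resulting file goes in the checkpoints directory instead of diffusion_models.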

    q8 version

    • this is the format that has been converted to UNet format and then quantized to Q8 GGUF

    • This file must be placed in the unet directory for it to be found by the UNET Loader node.

    • To use this you must have the third-party extension Unet Loader (GGUF) installed

    • This reduces the file to 3 GB and brings further improvements in speed and total VRAM requirements, with only a 0.1% difference between this and the original file.

    aio version (all in one)

    • this is similar to the checkpoints format, for when you want to use Load Checkpoint without also having to use DualClipLoader and Load VAE separately

    • You can use this with the default workflow; just pick the file and it should work out of the box

    • Baked in T5 Model: t5xxl_fp8_e4m3fn.safetensors

    • Baked in CLIP-L: ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF.safetensors

    • This one should ideally deliver the fastest results, though q8 might still be faster; further testing is needed to determine which.

A 3.2B MMDiT distilled from Flux-dev for efficient text-to-image generation


Nowadays, text-to-image (T2I) models are growing stronger but larger, which limits their practical applicability, especially on consumer-level devices. To bridge this gap, we distilled the 12B Flux-dev model into a 3.2B Flux-mini model, trying to preserve its strong image generation capabilities. Specifically, we prune the original Flux-dev by reducing its depth from 19 + 38 (number of double blocks and single blocks) to 5 + 10. The pruned model is further tuned with denoising and feature alignment objectives on a curated image-text dataset.

We empirically found that different blocks have different impacts on generation quality, so we initialize the student model with the most important blocks. The distillation process consists of three objectives: the denoise loss, the output alignment loss, and the feature alignment loss. The feature alignment loss is designed so that the output of block x in the student model is encouraged to match that of block 4x in the teacher model. Distillation is performed in the first stage on 512x512 LAION images recaptioned with Qwen-VL for 90k steps, and in the second stage on 1024x1024 images generated by Flux from JourneyDB prompts for another 90k steps.
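The block-to-block feature alignment described above (student block x matched to teacher block 4x) can be sketched as follows. This is an illustrative reconstruction, not TencentARC's actual training code; an MSE objective for the alignment term is an assumption.

```python
# Illustrative sketch: student block x is aligned with teacher block 4x
# (e.g. 5 student double blocks vs. 19 teacher double blocks, ratio ~4).
# MSE here is an assumption; the authors' exact loss may differ.
import torch
import torch.nn.functional as F

def feature_alignment_loss(student_feats, teacher_feats, ratio=4):
    """student_feats[i] is matched against teacher_feats[ratio * i]."""
    losses = [F.mse_loss(s, teacher_feats[ratio * i].detach())
              for i, s in enumerate(student_feats)]
    return torch.stack(losses).mean()
```

In training this term would be summed with the denoise and output-alignment losses.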

Github link: https://github.com/TencentARC/flux-toolkits


What is Flux Mini 3B?

Flux Mini 3B is a highly specialized image generation AI model of the Safetensors / Checkpoint type, uploaded by AI community user azimuthalobserver. Derived from the Flux.1 D model, Flux Mini 3B has undergone an extensive fine-tuning process, leveraging a dataset consisting of images generated by other AI models or user-contributed data. This fine-tuning ensures that Flux Mini 3B can generate images highly relevant to the specific use-cases it was designed for, such as serving as a base model.


Can I download Flux Mini 3B?

Yes! You can download the latest version of Flux Mini 3B from here.

How to use Flux Mini 3B?

To use Flux Mini 3B, download the model checkpoint file and set up a UI for running Stable Diffusion models (for example, AUTOMATIC1111). Then, provide the model with a detailed text prompt to generate an image. Experiment with different prompts and settings to achieve the desired results. If this sounds a bit complicated, check out our initial guide to Stable Diffusion – it might be of help. And if you really want to dive deep into AI image generation and understand how to set up AUTOMATIC1111 to use Safetensors / Checkpoint AI models like Flux Mini 3B, check out our crash course in AI image generation.

Download (5.78 GB)


Info

Base model: Flux.1 D

Latest version (v1.0 - diffusion_models): 1 File


About this version: v1.0 - diffusion_models

Original File found here:

https://huggingface.co/TencentARC/flux-mini/tree/main

Requires the Load Diffusion Model node in ComfyUI

Requires being placed in the diffusion_models folder

4 Versions
