Conversation

@pratim4dasude
Contributor

What does this PR do?

This PR adds a new pipeline — FluxFillControlNetInpaintPipeline — located in pipeline_flux_fill_controlnet_inpaint.py.

This pipeline extends FLUX.1-Fill-dev with full ControlNet support for depth, canny, union, and other conditioning models. It enables fill-style inpainting + ControlNet conditioning in a single unified workflow.

We chose FLUX.1-Fill-dev instead of the main FLUX.1-dev model because the regular model does not handle inpainting or masked edits well, especially when combined with styling from Flux Redux.

This variant is specifically designed for mask-based inpainting and produces far more stable and coherent results in these workflows.

How I identified the gap

Existing FLUX pipelines were split:

FluxFillPipeline → fill-style inpainting only, no ControlNet conditioning
FluxControlNetInpaintPipeline → ControlNet-conditioned inpainting, but built on the base model rather than the Fill variant

There was no single pipeline combining all three.

How to Use the New Pipeline

Below is the updated example with the correct pipeline name and file import:

import torch
from diffusers import (
    FluxControlNetModel,
    FluxPriorReduxPipeline,
)
from diffusers.utils import load_image

# NEW PIPELINE (updated name)
from pipeline_flux_fill_controlnet_inpaint import FluxFillControlNetInpaintPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16

# Models
base_model = "black-forest-labs/FLUX.1-Fill-dev"
controlnet_model = "Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0"
prior_model = "black-forest-labs/FLUX.1-Redux-dev"

# Load ControlNet
controlnet = FluxControlNetModel.from_pretrained(
    controlnet_model,
    torch_dtype=dtype,
)

# Load Fill + ControlNet Pipeline
fill_pipe = FluxFillControlNetInpaintPipeline.from_pretrained(
    base_model,
    controlnet=controlnet,
    torch_dtype=dtype,
).to(device)

# OPTIONAL FP8
# fill_pipe.transformer.enable_layerwise_casting(
#     storage_dtype=torch.float8_e4m3fn,
#     compute_dtype=torch.bfloat16
# )

# OPTIONAL Prior Redux
# pipe_prior_redux = FluxPriorReduxPipeline.from_pretrained(
#     prior_model,
#     torch_dtype=dtype,
# ).to(device)

# Inputs
cloth_image = load_image("images.png")
cloth_prompt = "A clean, minimal outfit"

combined_image = load_image("person_input.png")
combined_mask = load_image("mask.png")
control_image_depth = load_image("control_depth.png")

# 1. Prior conditioning
# prior_out = pipe_prior_redux(
#     image=cloth_image,
#     prompt=cloth_prompt,
# )

# 2. Fill Inpaint with ControlNet
result = fill_pipe(
    prompt="A woman wearing the outfit, futuristic and stylish.",
    image=combined_image,
    mask_image=combined_mask,

    control_image=control_image_depth,
    control_mode=[2],  # 2 = depth for the union ControlNet
    control_guidance_start=0.0,
    control_guidance_end=0.5,
    controlnet_conditioning_scale=0.7,

    height=1024,
    width=768,

    strength=1.0,
    guidance_scale=50.0,
    num_inference_steps=60,
    max_sequence_length=512,

    # **prior_out,
)

result.images[0].save("flux_fill_controlnet_inpaint.png")
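
If the full Fill + ControlNet (+ Redux) stack doesn't fit in VRAM, the standard diffusers offloading helpers should apply to this pipeline as well. A minimal sketch, assuming the class inherits the usual DiffusionPipeline memory hooks:

# Instead of .to(device): keep weights on CPU and move each submodule
# to the GPU only while it runs (slower, but much lower peak VRAM).
fill_pipe.enable_model_cpu_offload()

# More aggressive, leaf-module-level offloading:
# fill_pipe.enable_sequential_cpu_offload()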


Who can review

Anyone in the community is free to review the PR once the tests have passed.
I'm new to contributing here, so please feel free to point out mistakes or roast the code if needed - it will help me improve.
@yiyixuxu and @asomoza

@asomoza
Member

asomoza commented Nov 15, 2025

hi @pratim4dasude, can you post a couple of images so we can see the output quality of this pipeline?

@pratim4dasude
Contributor Author

Added a few sample outputs generated using this pipeline so you can get a clear view of the inpainting quality and how the fill + ControlNet combination behaves.

Single Input & Mask Applied to Every Scenario

[input image and mask]

Flux Fill + ControlNet Depth-Guided Inpainting

Prompt: a dog on a bench

[output image]

Prompt: a dog

[output image]

Prompt: a women in yoga

[output image]

Flux Fill + ControlNet Pose-Guided Inpainting

Prompt: a women in yoga

[output image]

Prompt: a man in yoga

[output image]

Flux Fill + ControlNet Canny-Guided Inpainting

Prompt: a dog on a bench

[output image]

Prompt: a cat on a bench

[output image]

Prompt: a dog

[output image]

Prompt: a cat

[output image]

Prompt: a women in yoga

[output image]

Code Used

import torch
from diffusers import (
    FluxControlNetModel,
    FluxPriorReduxPipeline,
)
from diffusers.utils import load_image

# NEW PIPELINE (updated name)
from pipeline_flux_fill_controlnet_inpaint import FluxFillControlNetInpaintPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16

# Models
base_model = "black-forest-labs/FLUX.1-Fill-dev"
controlnet_model = "Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0"
prior_model = "black-forest-labs/FLUX.1-Redux-dev"

# Load ControlNet
controlnet = FluxControlNetModel.from_pretrained(
    controlnet_model,
    torch_dtype=dtype,
)

# Load Fill + ControlNet Pipeline
fill_pipe = FluxFillControlNetInpaintPipeline.from_pretrained(
    base_model,
    controlnet=controlnet,
    torch_dtype=dtype,
).to(device)

# OPTIONAL FP8
# fill_pipe.transformer.enable_layerwise_casting(
#     storage_dtype=torch.float8_e4m3fn,
#     compute_dtype=torch.bfloat16
# )

# OPTIONAL Prior Redux
# pipe_prior_redux = FluxPriorReduxPipeline.from_pretrained(
#     prior_model,
#     torch_dtype=dtype,
# ).to(device)

# Inputs

# combined_image = load_image("person_input.png")


# 1. Prior conditioning
# prior_out = pipe_prior_redux(
#     image=cloth_image,
#     prompt=cloth_prompt,
# )

# 2. Fill Inpaint with ControlNet

# canny (0), tile (1), depth (2), blur (3), pose (4), gray (5), low quality (6).
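
# For readability, the union mode indices above can be kept in a dict.
# (Hypothetical helper; the index mapping is taken from the comment above,
# so verify it against the Union-Pro-2.0 model card.)
CONTROL_MODES = {
    "canny": 0, "tile": 1, "depth": 2, "blur": 3,
    "pose": 4, "gray": 5, "low_quality": 6,
}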

img = load_image(r"imgs/background.jpg")
mask = load_image(r"imgs/mask.png")

control_image_depth = load_image(r"imgs/dog_depth _2.png")

result = fill_pipe(
    prompt="a dog on a bench",
    image=img,
    mask_image=mask,

    control_image=control_image_depth,
    control_mode=[2],  # 2 = depth for the union ControlNet (see mapping above)
    control_guidance_start=0.0,
    control_guidance_end=0.8,
    controlnet_conditioning_scale=0.9,

    height=1024,
    width=1024,

    strength=1.0,
    guidance_scale=50.0,
    num_inference_steps=60,
    max_sequence_length=512,

    # **prior_out,
)

# result.images[0].save("flux_fill_controlnet_inpaint.png")

from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
result.images[0].save(f"flux_fill_controlnet_inpaint_depth{timestamp}.jpg")
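
For completeness, this is roughly how the commented-out Redux prior would plug in. A sketch only, assuming the combined pipeline accepts the FluxPriorReduxPipeline outputs (prompt_embeds, pooled_prompt_embeds) the same way the stock Flux pipelines do; imgs/reference.png is a placeholder path:

pipe_prior_redux = FluxPriorReduxPipeline.from_pretrained(
    prior_model,
    torch_dtype=dtype,
).to(device)

# Redux turns a reference image into prompt embeddings.
prior_out = pipe_prior_redux(image=load_image("imgs/reference.png"))

result = fill_pipe(
    image=img,
    mask_image=mask,
    control_image=control_image_depth,
    control_mode=[2],
    num_inference_steps=60,
    # No `prompt` here: the embeddings come from the prior output.
    **prior_out,
)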

I’ve uploaded the pipeline outputs and the corresponding generation code.

These examples cover each scenario to help evaluate the pipeline more thoroughly.

@asomoza

@asomoza
Member

asomoza commented Nov 19, 2025

thanks @pratim4dasude, looking at the results, they don't seem to be on par with or better than the current ones we have (qwen-image-edit-plus, kontext). I'm guessing some people will have specific use cases where they would want this pipeline; are you interested in moving it to a community pipeline in the meantime to gauge whether people will use it?

Just so you know, qwen-image-edit can work with multiple images and condition images too (depth, pose, etc.).

@pratim4dasude
Contributor Author

Thanks @asomoza for checking it out! Yeah, that makes sense — the results aren’t yet matching qwen-image-edit-plus or kontext, and I get why. This pipeline was more of an experiment around Flux-Fill + separate condition inputs (depth / canny / pose), so it might be useful only for some niche workflows.

I’m open to moving it to a community pipeline so others can try it out. Can you help me with the process, or should I just create a new PR under diffusers/examples/community/? Happy to follow whatever you suggest.
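
For anyone who wants to try it once it lands under examples/community, loading should follow the usual custom-pipeline pattern. A sketch, assuming the community file keeps the name pipeline_flux_fill_controlnet_inpaint (check the community README for the final name):

import torch
from diffusers import DiffusionPipeline, FluxControlNetModel

controlnet = FluxControlNetModel.from_pretrained(
    "Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0",
    torch_dtype=torch.bfloat16,
)

# custom_pipeline points at a file in diffusers/examples/community;
# the exact name here is an assumption until the merged README is checked.
pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev",
    controlnet=controlnet,
    custom_pipeline="pipeline_flux_fill_controlnet_inpaint",
    torch_dtype=torch.bfloat16,
).to("cuda")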

@pratim4dasude
Contributor Author

The Flux Fill ControlNet pipeline has now been moved to the community pipelines folder, and the README.md has been updated as well.

@asomoza

Member

thanks a lot!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@asomoza
Member

asomoza commented Nov 19, 2025

@bot /style

@github-actions
Contributor

github-actions bot commented Nov 19, 2025

Style bot fixed some files and pushed the changes.

@asomoza asomoza merged commit d5da453 into huggingface:main Nov 19, 2025
26 checks passed