
DrUM (Draw Your Mind)

🤗 HuggingFace · Framework: PyTorch · Library: diffusers · License: MIT

This repository hosts the official implementation of:

Hyungjin Kim, Seokho Ahn, and Young-Duk Seo, Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models, ICCV 2025 (paper available on arXiv)

News

  • [2025.08.05]: Pre-trained weights available on 🤗 HuggingFace
  • [2025.08.05]: Repository created

Introduction

DrUM enables personalized text-to-image (T2I) generation by integrating reference prompts into T2I diffusion models. It works with foundation T2I models such as Stable Diffusion v1/v2/XL/v3 and FLUX, without requiring additional fine-tuning. DrUM leverages condition-level modeling in the latent space using a transformer-based adapter, and integrates seamlessly with open-source text encoders such as OpenCLIP and Google T5.

Performance

Architecture of DrUM

Quick Start

This model is designed for easy use with the diffusers library as a custom pipeline.

Setup

pip install torch torchvision diffusers transformers accelerate safetensors huggingface-hub

Pre-trained weights

Pre-trained adapter weights are available at 🤗 HuggingFace.
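
To cache the adapter weights locally before running inference, a minimal sketch with huggingface_hub is shown below. The repository id is a hypothetical placeholder, not the actual repository name; substitute the DrUM weight repository linked above.

from huggingface_hub import snapshot_download

# Hypothetical repository id -- replace with the DrUM weight repository on HuggingFace
local_dir = snapshot_download(repo_id="<drum-weights-repo-id>")

# The downloaded folder contains the adapter weight files listed in the table below
# (e.g. L.safetensors, bigG.safetensors, T5.safetensors)
print(local_dir)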

Usage

import torch

from drum import DrUM
from diffusers import DiffusionPipeline

# Load pipeline and attach DrUM
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype = torch.bfloat16).to("cuda")
drum = DrUM(pipeline)

# Generate personalized images
images = drum(
    prompt = "a photograph of an astronaut riding a horse",
    ref = ["A retro-futuristic space exploration movie poster with bold, vibrant colors"],
    weight = [1.0],
    alpha = 0.3
)

images[0].save("personalized_image.png")

For interactive usage: see inference.ipynb

For command line usage: see inference.py

Key Parameters

| Parameter | Description | Value |
| --- | --- | --- |
| prompt | Target prompt | String |
| ref | Reference prompts | List of strings |
| alpha | Personalization degree | Float (0-1) |
| weight | Reference weights | List of floats |
| sampling | Reference coreset sampling | Boolean |
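
As a worked example of these parameters, the sketch below blends two reference prompts with unequal weights and enables coreset sampling. It assumes the same DrUM constructor and call signature as the Usage example above; the reference prompts, weights, and alpha value are illustrative.

import torch

from drum import DrUM
from diffusers import DiffusionPipeline

# Attach DrUM to a Stable Diffusion v1.5 pipeline, as in the Usage example
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype = torch.bfloat16).to("cuda")
drum = DrUM(pipeline)

# `weight` sets each reference's relative influence, `alpha` sets the overall
# degree of personalization (a float in [0, 1] per the table above), and
# `sampling` enables coreset sampling over the references
images = drum(
    prompt = "a photograph of an astronaut riding a horse",
    ref = [
        "A dreamy watercolor landscape in soft pastel tones",
        "A gritty cyberpunk cityscape at night with neon signs",
    ],
    weight = [0.7, 0.3],
    alpha = 0.5,
    sampling = True
)

images[0].save("personalized_image_multi_ref.png")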

Supported foundation T2I models

DrUM works with a wide variety of foundation T2I models that use text encoders with the same weights:

| Architecture | Pipeline | Text encoder | DrUM weight |
| --- | --- | --- | --- |
| Stable Diffusion v1 | runwayml/stable-diffusion-v1-5, prompthero/openjourney-v4, stablediffusionapi/realistic-vision-v51, stablediffusionapi/deliberate-v2, stablediffusionapi/anything-v5, WarriorMama777/AbyssOrangeMix2, ... | openai/clip-vit-large-patch14 | L.safetensors |
| Stable Diffusion v2 | stabilityai/stable-diffusion-2-1, ... | openai/clip-vit-huge-patch14 | H.safetensors |
| Stable Diffusion XL | stabilityai/stable-diffusion-xl-base-1.0, ... | openai/clip-vit-large-patch14, laion/CLIP-ViT-bigG-14-laion2B-39B-b160k | L.safetensors, bigG.safetensors |
| Stable Diffusion v3 | stabilityai/stable-diffusion-3.5-large, stabilityai/stable-diffusion-3.5-medium, ... | openai/clip-vit-large-patch14, laion/CLIP-ViT-bigG-14-laion2B-39B-b160k, google/t5-v1_1-xxl | L.safetensors, bigG.safetensors, T5.safetensors |
| FLUX | black-forest-labs/FLUX.1-dev, ... | openai/clip-vit-large-patch14, google/t5-v1_1-xxl | L.safetensors, T5.safetensors |
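
For instance, the sketch below attaches DrUM to Stable Diffusion XL, which per the table above uses the CLIP-L and CLIP-bigG text encoders and therefore the L.safetensors and bigG.safetensors adapter weights. It assumes the DrUM class resolves the matching adapter weights for the attached pipeline; see inference.py for the exact weight-loading options.

import torch

from drum import DrUM
from diffusers import DiffusionPipeline

# Attach DrUM to an SDXL pipeline; prompt, weight, and alpha follow the Usage example
pipeline = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype = torch.bfloat16).to("cuda")
drum = DrUM(pipeline)

images = drum(
    prompt = "a photograph of an astronaut riding a horse",
    ref = ["A retro-futuristic space exploration movie poster with bold, vibrant colors"],
    weight = [1.0],
    alpha = 0.3
)

images[0].save("personalized_image_sdxl.png")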

Training

To train your own DrUM: see train.py

Subject transfer

Degree of personalization

Adaptability

Citation

@inproceedings{kim2025drum,
	title={Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models},
	author={Kim, Hyungjin and Ahn, Seokho and Seo, Young-Duk},
	booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
	year={2025}
}

License

This project is licensed under the MIT License.
