This repository hosts the official implementation of:
Hyungjin Kim, Seokho Ahn, and Young-Duk Seo, *Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models* (arXiv)
- [2025.08.05]: Pre-trained weights available on 🤗 HuggingFace
- [2025.08.05]: Repository created
DrUM enables personalized text-to-image (T2I) generation by integrating reference prompts into T2I diffusion models. It works with foundation T2I models such as Stable Diffusion v1/v2/XL/v3 and FLUX, without requiring additional fine-tuning. DrUM leverages condition-level modeling in the latent space using a transformer-based adapter, and integrates seamlessly with open-source text encoders such as OpenCLIP and Google T5.
DrUM is designed for easy use with the `diffusers` library as a custom pipeline.
pip install torch torchvision diffusers transformers accelerate safetensors huggingface-hub
Pre-trained adapter weights are available at 🤗 HuggingFace.
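The adapters are plain `safetensors` files (see the table below for which file matches which text encoder). As a minimal sketch, you can fetch one manually with `huggingface_hub`; the repository id below is a placeholder, and in practice DrUM may handle downloading for you:

```python
# Minimal sketch: manually download one adapter weight file.
# NOTE: the repo_id below is a placeholder, not the actual weight repository.
from huggingface_hub import hf_hub_download

# L.safetensors corresponds to the openai/clip-vit-large-patch14 text encoder
weight_path = hf_hub_download(repo_id="<namespace>/DrUM", filename="L.safetensors")
print(weight_path)
```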
import torch
from drum import DrUM
from diffusers import DiffusionPipeline

# Load a base pipeline and attach DrUM
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.bfloat16
).to("cuda")
drum = DrUM(pipeline)

# Generate personalized images
images = drum(
    prompt="a photograph of an astronaut riding a horse",
    ref=["A retro-futuristic space exploration movie poster with bold, vibrant colors"],
    weight=[1.0],
    alpha=0.3,
)
images[0].save("personalized_image.png")
For interactive usage: see inference.ipynb
For command line usage: see inference.py
| Parameter | Description | Value |
|---|---|---|
| `prompt` | Target prompt | String |
| `ref` | Reference prompts | List of strings |
| `alpha` | Personalization degree | Float in [0, 1] |
| `weight` | Reference weights | List of floats |
| `sampling` | Reference coreset sampling | Boolean |
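For example, here is a sketch that blends two reference prompts with unequal weights; the call mirrors the quick-start above, and the exact default for `sampling` may differ in the actual implementation:

```python
# Sketch: blending two reference prompts with per-reference weights.
# Assumes `drum` was created as in the quick-start above.
images = drum(
    prompt="a watercolor painting of a lighthouse at dawn",
    ref=[
        "soft pastel color palette, dreamy atmosphere",
        "detailed ink linework, architectural sketch style",
    ],
    weight=[0.7, 0.3],   # relative influence of each reference
    alpha=0.4,           # degree of personalization (0 = ignore references)
    sampling=True,       # enable reference coreset sampling
)
images[0].save("personalized_mix.png")
```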
DrUM works with a wide variety of foundation T2I models that use text encoders with the same weights:
| Architecture | Pipeline | Text encoder | DrUM weight |
|---|---|---|---|
| Stable Diffusion v1 | `runwayml/stable-diffusion-v1-5`, `prompthero/openjourney-v4`, `stablediffusionapi/realistic-vision-v51`, `stablediffusionapi/deliberate-v2`, `stablediffusionapi/anything-v5`, `WarriorMama777/AbyssOrangeMix2`, ... | `openai/clip-vit-large-patch14` | `L.safetensors` |
| Stable Diffusion v2 | `stabilityai/stable-diffusion-2-1`, ... | `openai/clip-vit-huge-patch14` | `H.safetensors` |
| Stable Diffusion XL | `stabilityai/stable-diffusion-xl-base-1.0`, ... | `openai/clip-vit-large-patch14`, `laion/CLIP-ViT-bigG-14-laion2B-39B-b160k` | `L.safetensors`, `bigG.safetensors` |
| Stable Diffusion v3 | `stabilityai/stable-diffusion-3.5-large`, `stabilityai/stable-diffusion-3.5-medium`, ... | `openai/clip-vit-large-patch14`, `laion/CLIP-ViT-bigG-14-laion2B-39B-b160k`, `google/t5-v1_1-xxl` | `L.safetensors`, `bigG.safetensors`, `T5.safetensors` |
| FLUX | `black-forest-labs/FLUX.1-dev`, ... | `openai/clip-vit-large-patch14`, `google/t5-v1_1-xxl` | `L.safetensors`, `T5.safetensors` |
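For instance, attaching DrUM to an SDXL pipeline follows the same pattern as the quick-start; per the table above, this combination relies on the `L.safetensors` and `bigG.safetensors` adapters. The snippet below is a sketch and assumes DrUM resolves the correct adapters from the pipeline:

```python
# Sketch: DrUM on Stable Diffusion XL (assumes adapters are resolved automatically).
import torch
from drum import DrUM
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.bfloat16
).to("cuda")
drum = DrUM(pipeline)

images = drum(
    prompt="a cozy cabin in a snowy forest",
    ref=["warm cinematic lighting, golden hour photography"],
    weight=[1.0],
    alpha=0.3,
)
images[0].save("sdxl_personalized.png")
```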
To train your own DrUM: see train.py
@inproceedings{kim2025drum,
title={Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models},
  author={Hyungjin Kim and Seokho Ahn and Young-Duk Seo},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2025}
}
This project is licensed under the MIT License.