BackdoorBenchER: Evaluating & Revisiting the Auxiliary Data in Backdoor Purification

English | 简体中文

Paper | Citation


📢 Announcements

Updated on 2025-04-02: Added support for two new clean-label attacks, COMBAT (AAAI 2024) and Narcissus (CCS 2023), and one new data-free defense, OTBR (AAAI 2025).

Updated on 2025-03-26: Based on the latest reviewer feedback, we have added support for the ImageNette dataset. ImageNette is a subset of ImageNet with significantly larger images than the CIFAR family, Tiny ImageNet, and GTSRB.

Updated on 2025-03-04: Added support for synthetic datasets based on MMGen. Follow the MMGen guidance to train a model or download the pretrained weights, then generate the synthetic dataset.

Updated on 2025-02-12: Initial release. It supports evaluating backdoor purification with auxiliary datasets categorized as Seen (Train), Reserved (Split), and OOD (Transformations & External from ImageNet).


📝 Introduction

Welcome to the official repository for the paper "Revisiting the Auxiliary Data in Backdoor Purification". This project aims to establish a framework for evaluating backdoor purification techniques under practical conditions using diverse auxiliary datasets, moving beyond the assumption of idealized, in-distribution data.


📊 Project Overview

Backdoor attacks exploit vulnerabilities during model training to induce specific behaviors when triggered. To counteract these threats, backdoor purification techniques are employed, often relying on a small clean dataset known as auxiliary data. Despite advancements, the impact of auxiliary data characteristics on purification efficacy remains understudied. This project investigates how different types of auxiliary datasets—ranging from in-distribution to synthetic or externally sourced—affect purification outcomes, providing insights crucial for selecting or constructing effective defense mechanisms.

(Figure: project overview)


🛠️ Getting Started

Follow these steps to set up the project:

  1. Clone Repository

    git clone https://github.com/shawkui/BackdoorBenchER.git
    cd BackdoorBenchER
  2. Install Dependencies

    bash sh/install.sh
  3. Initialize Folders

    bash sh/init_folders.sh

⚙️ Usage Instructions

🧪 Creating Auxiliary Datasets

For example, with CIFAR-10:

  1. Download the dataset into /data.
  2. Split it:
    python dataset/generate_split.py --dataset cifar10 --split_ratio 0.05 --random_seed 0
  3. Generate OOD auxiliary data:
    python dataset/generate_ood.py --dataset cifar10_split_5_seed_0 --ood_type brightness
  4. Create a CIFAR-10-like dataset from ImageNet:
    bash sh/cinic_download.sh
    python dataset/generate_cifar10_from_imagenet.py --dataset cifar10_split_5_seed_0 --ood_type imagenet
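For intuition, the split step reserves a small, class-balanced fraction of the training set as auxiliary data; with CIFAR-10's 50,000 training images, a 5% split reserves 2,500 images. The sketch below is only an illustration of such a stratified split, not the actual `dataset/generate_split.py`:

```python
import numpy as np

def stratified_split(labels, split_ratio=0.05, seed=0):
    """Reserve a class-balanced fraction of sample indices as auxiliary data."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    reserved = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)   # indices of class c
        rng.shuffle(idx)
        reserved.extend(idx[: int(len(idx) * split_ratio)])
    reserved = np.sort(np.array(reserved))
    remaining = np.setdiff1d(np.arange(len(labels)), reserved)
    return remaining, reserved

# CIFAR-10-like labels: 10 classes x 5,000 images -> 5% split reserves 2,500.
labels = np.repeat(np.arange(10), 5000)
train_idx, aux_idx = stratified_split(labels, split_ratio=0.05, seed=0)
print(len(train_idx), len(aux_idx))  # 47500 2500
```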

🛡️ Performing Attacks & Defenses

Simulate an attack:

python attack/badnet.py --save_folder_name badnet_demo --dataset cifar10_split_5_seed_0

Apply a defense:

python defense/ft.py --result_file badnet_demo --dataset cifar10_split_5_seed_0 --reserved_type reserved 

Customize configurations for all methods by editing sh/config_edit.py.
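Purification quality is conventionally summarized by clean accuracy (ACC) on benign inputs and attack success rate (ASR), the fraction of triggered inputs classified as the attacker's target. A minimal sketch of these two metrics, given model predictions as arrays (the function name and signature are illustrative, not the repository's API):

```python
import numpy as np

def acc_and_asr(clean_pred, clean_true, trig_pred, target_label):
    """ACC: agreement with true labels on clean data.
    ASR: fraction of triggered inputs predicted as the attack target."""
    acc = float(np.mean(np.asarray(clean_pred) == np.asarray(clean_true)))
    asr = float(np.mean(np.asarray(trig_pred) == target_label))
    return acc, asr

# Toy example: 4 clean predictions, 4 triggered predictions, target label 0.
acc, asr = acc_and_asr([1, 2, 3, 3], [1, 2, 3, 0], [0, 0, 1, 0], 0)
print(acc, asr)  # 0.75 0.75
```

A successful defense drives ASR down while keeping ACC close to the attacked model's clean accuracy.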

📄 Managing Results

All defense results are saved according to configurations specified in the --yaml_path argument.

For example,

python defense/ft.py --result_file badnet_demo --dataset cifar10_split_5_seed_0 --reserved_type reserved --yaml_path ./config/defense/ft/demo.yaml

will save the results in record/badnet_demo/defense/ft/demo/.
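The layout inferred from this example is record/&lt;result_file&gt;/defense/&lt;method&gt;/&lt;yaml stem&gt;/. A small sketch reconstructing that path from the CLI arguments (the helper is hypothetical, shown only to make the convention explicit):

```python
from pathlib import Path

def defense_record_dir(result_file, defense, yaml_path):
    """Reconstruct the save directory implied by the CLI arguments."""
    return Path("record") / result_file / "defense" / defense / Path(yaml_path).stem

out = defense_record_dir("badnet_demo", "ft", "./config/defense/ft/demo.yaml")
print(out.as_posix())  # record/badnet_demo/defense/ft/demo
```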


📋 TODO

📅 Upcoming Features:

  1. Release Code for Generating Synthetic Data: We will soon provide code to generate synthetic auxiliary datasets, expanding the variety of datasets available for testing and evaluation.

  2. Release Dataset: In addition to the code, we plan to release a curated dataset specifically designed for backdoor purification research.

  3. Release Guided Input Calibration: We plan to release Guided Input Calibration, the first attempt to align auxiliary datasets with in-distribution datasets, facilitating more effective backdoor purification.

  4. More Evaluations: Our long-term roadmap includes a broader evaluation framework, incorporating additional purification techniques, models, and tasks. Contributions from the research community are highly encouraged.

Stay tuned for updates!


📄 Citation

Please cite our work if you use it in your research:

@misc{wei2025revisitingauxiliarydatabackdoor,
      title={Revisiting the Auxiliary Data in Backdoor Purification}, 
      author={Shaokui Wei and Shanchao Yang and Jiayin Liu and Hongyuan Zha},
      year={2025},
      eprint={2502.07231},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2502.07231}, 
}

🎖️ Acknowledgments

Our work builds upon BackdoorBench and other prior works. Consider giving BackdoorBench a star if it is useful to you.


📞 Contact

For inquiries or feedback, open an issue or email [email protected].
