This repository contains the implementation of *SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks* (USENIX Security '25).
## Repository Structure

```
mia_llms_benchmark/
├── README.md                  # This file
├── environment.yml            # Conda environment specification
├── config_finetune.yaml       # Training configuration
├── config_auc_tpr.yaml        # Evaluation configuration
├── finetune.py                # Main fine-tuning script
├── main.py                    # Evaluation script
├── utils.py                   # Utility functions
├── data/
│   ├── obfuscation.py         # Obfuscation implementations
│   └── prepare.py             # Dataset loading and tokenization
├── attacks/                   # MIA attack implementations
│   ├── __init__.py
│   ├── loss.py
│   ├── ratio.py
│   ├── mink.py
│   ├── minkplusplus.py
│   ├── zlib.py
│   ├── lowercase.py
│   ├── recall.py
│   ├── conrecall.py
│   ├── bag_of_words.py
│   ├── ensemble_classifier.py
│   └── utils.py               # Attack utilities
└── output/                    # Evaluation results
```
## Installation

```bash
# Create the Python environment
conda env create -f environment.yml
conda activate mia
```
## Fine-tuning

```bash
# Single-GPU training
python finetune.py --config config_finetune.yaml --select_ratio X

# Multi-GPU training with DeepSpeed
deepspeed --num_gpus=8 finetune.py --config config_finetune.yaml --select_ratio X
```
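The `--select_ratio` flag controls how much of the training data is obfuscated. As a rough mental model, the sketch below swaps a `select_ratio` fraction of original texts for their paraphrased counterparts. It is purely illustrative: `mix_obfuscated` is a hypothetical helper, and it picks samples at random, whereas the actual logic in `finetune.py` selects samples by its own criterion.

```python
import random

def mix_obfuscated(original, paraphrased, select_ratio, seed=0):
    """Illustrative sketch: replace a `select_ratio` fraction of the
    original training texts with their paraphrased counterparts.
    Selection here is random; the real implementation may rank
    samples instead of sampling uniformly."""
    assert len(original) == len(paraphrased)
    n_swap = int(len(original) * select_ratio)
    swap_idx = set(random.Random(seed).sample(range(len(original)), n_swap))
    return [paraphrased[i] if i in swap_idx else original[i]
            for i in range(len(original))]

# Example: obfuscate half of a toy corpus
mixed = mix_obfuscated(["a", "b", "c", "d"], ["A", "B", "C", "D"], select_ratio=0.5)
```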
## Evaluation

The metrics include AUC-ROC and TPR at fixed low false-positive rates (e.g., TPR@1%FPR and TPR@0.1%FPR).
```bash
python main.py \
    -c config_auc_tpr.yaml \
    --run-all \
    --output "./output/" \
    --target-model "checkpoints/Llama-3.2-X/epoch-X" \
    --dataset "arxiv" \
    --split "ngram_13_0.8"
```
## Datasets

**Evaluation benchmark (Mimir)**

- Source: `iamgroot42/mimir`
- Description: Curated subset of The Pile dataset with membership labels
- Splits: Various n-gram and threshold combinations (e.g., `ngram_13_0.8`)
- Domains: ArXiv papers, Wikipedia, GitHub code, PubMed, and more
**Pre-obfuscated training data**

- Source: `LLM-MIA/editing-syn-pr0.5-mimir-arxiv-ngram_13_0.8`
- Description: Paraphrased version of the ArXiv subset, produced with LLM-based text rewriting
- Usage: Ready-to-use obfuscated data for immediate training
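Both datasets are hosted on the Hugging Face Hub and can be loaded with the `datasets` library. A minimal sketch, reusing the config/split names from the evaluation command above (the record schema, and whether `HF_TOKEN` is required, are assumptions):

```python
from datasets import load_dataset

# Mimir evaluation data: ArXiv domain, 13-gram / 0.8-threshold split
mimir = load_dataset("iamgroot42/mimir", "arxiv", split="ngram_13_0.8")

# Pre-paraphrased ArXiv subset for obfuscated fine-tuning
obfuscated = load_dataset("LLM-MIA/editing-syn-pr0.5-mimir-arxiv-ngram_13_0.8")

print(mimir[0])  # inspect the record schema (text plus membership label)
```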
## Data Obfuscation

The `data/obfuscation.py` module provides tools to create obfuscated datasets:
```bash
# Set up environment variables
export OPENAI_API_KEY="your-api-key"
export HF_TOKEN="your-huggingface-token"

# Using OpenAI API for paraphrasing
python data/obfuscation.py
```
The framework supports different prompts for various content types.

**Text Paraphrasing Prompt:**

```python
message = [
    {"role": "system", "content": "You are a helpful text rewriting assistant."},
    {"role": "user", "content":
        f"Rewrite the following paragraph by replacing every word with an alternative term "
        f"that does not share the same root or spelling. Preserve the same meaning and "
        f"sentence structure as much as possible.\n\"\"\"\n{original_text}\n\"\"\""},
]
```
**Code Obfuscation Prompt:**

```python
message = (
    "Rewrite the following code so it preserves the same functionality and flow, "
    "but changes all variable names, function names, and comments. Maintain the "
    "same input-output behavior. Keep it in the same programming language."
    f"\n\"\"\"\n{original_text}\n\"\"\""
)
```
## Supported Attacks

The framework implements 10+ state-of-the-art MIA attacks:

| Attack Method | Description | Key Parameters |
|---|---|---|
| Loss | Basic loss-based attack | - |
| Zlib | Compression-based attack | - |
| Lowercase | Case-sensitivity attack | - |
| Min-K% Prob | Minimum k-probability attack | `k` |
| Min-K%++ | Enhanced Min-K% with calibration | `k` |
| Ratio | Loss ratio with reference model | `reference_model_path` |
| Bag of Words | Feature-based ML attack | - |
| ReCall | Prefix-based recall attack | `n_shots`, `extra_non_member_dataset` |
| CON-ReCall | Conditional recall attack | `n_shots`, `extra_non_member_dataset` |
| Ensemble | Combination of multiple attacks | - |
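As a concrete example, the loss attack, the simplest entry above, scores each candidate by the target model's per-sample loss; members tend to have lower loss. A minimal self-contained sketch with Hugging Face `transformers` (the model name is a stand-in; see `attacks/loss.py` for the actual implementation):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def loss_attack_score(model, tokenizer, text):
    """Membership score for the loss attack: higher = more member-like."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    return -out.loss.item()  # negate the loss so members score higher

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in target model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
print(loss_attack_score(model, tokenizer, "Sample passage to score."))
```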
```bash
# Evaluate specific attacks only
python main.py \
    -c config_auc_tpr.yaml \
    --attacks "loss,ratio,mink" \
    --target-model "path/to/model" \
    --dataset "arxiv" \
    --split "ngram_13_0.8"
```
## Citation

If you use this framework in your research, please cite:
```bibtex
@inproceedings{zhang2025soft,
  title     = {{SOFT}: Selective Data Obfuscation for Protecting {LLM} Fine-tuning against Membership Inference Attacks},
  author    = {Zhang, Kaiyuan and Cheng, Siyuan and Guo, Hanxi and Chen, Yuetian and Su, Zian and An, Shengwei and Du, Yuntao and Fleming, Charles and Kundu, Ashish and Zhang, Xiangyu and Li, Ninghui},
  booktitle = {34th USENIX Security Symposium (USENIX Security 25)},
  year      = {2025},
  address   = {Seattle, WA},
  publisher = {USENIX Association}
}
```
## Acknowledgments

- The Mimir dataset for providing the evaluation benchmark
- The Pile for the underlying text corpus
- Hugging Face for the model and dataset hosting infrastructure