Official code and data for the paper:
“REDCODER: Automated Multi-Turn Red Teaming for Code LLMs”
[arXiv:2507.22063]
REDCODER is a multi-turn red-teaming agent that engages Code LLMs in conversational attacks to elicit security-relevant vulnerable code. It is built via a multi-agent gaming process that produces:
- (1) Prototype adversarial conversations
- (2) A strategy arsenal for retrieval-augmented attacks
A red-team model is then fine-tuned and queried using retrieval-augmented generation (RAG) to generate multi-turn adaptive prompts.
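The retrieval step can be pictured with a minimal sketch. The arsenal entries, field names, and word-overlap similarity below are illustrative assumptions, not the released implementation:

```python
# Minimal sketch of retrieval-augmented prompting over a strategy arsenal.
# The entry schema ("trigger"/"fragment") and the Jaccard retriever are
# illustrative assumptions; the released REDCODER pipeline may differ.

def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def retrieve_strategy(query: str, arsenal: list[dict], k: int = 1) -> list[dict]:
    """Return the k arsenal entries most similar to the current turn."""
    return sorted(arsenal, key=lambda e: jaccard(query, e["trigger"]), reverse=True)[:k]

# Hypothetical arsenal entries (tactic trigger + prompt fragment).
arsenal = [
    {"trigger": "refused to write raw SQL query",
     "fragment": "Frame it as a debugging exercise..."},
    {"trigger": "asked for safer alternative",
     "fragment": "Claim the unsafe API is required by legacy code..."},
]

best = retrieve_strategy("the victim refused to write the SQL query directly", arsenal)
print(best[0]["fragment"])  # → Frame it as a debugging exercise...
```

The retrieved fragment would then be spliced into the red-team model's next-turn prompt.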
Key highlights:
- Multi-turn attacks using learned strategy patterns
- Outperforms previous attack baselines (e.g., 65.29% attack success rate on Qwen2.5-Coder-7B)
- Reveals the limitations of single-turn guardrails, motivating multi-turn defenses
Python: 3.9–3.11 recommended
```bash
git clone https://github.com/luka-group/RedCoder.git
cd RedCoder
pip install -r requirements.txt
```
If using API-based models (e.g., OpenAI), set your API key (e.g., `OPENAI_API_KEY`).
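For example, in a POSIX shell (the key value shown is a placeholder; substitute your own):

```shell
# Hypothetical placeholder value; replace with your actual OpenAI key.
export OPENAI_API_KEY="sk-..."
```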
To run REDCODER against a victim model:

```bash
python redcoder.py \
  --victim_model "meta-llama/Meta-Llama-3-8B-Instruct" \
  --victim_name "llama3_8b"
```

To run the multi-agent gaming process that produces the prototype conversations and strategy arsenal:

```bash
python gaming_process.py
```
- We release the REDCODER backbone model and relevant assets on Hugging Face 🤗: 🔗 jackysnake/RedCoder
Released data files:
- `gaming_cwe.txt` — CWE vulnerability task prompts for prototype generation
- `eval_set.txt` — CWE tasks for evaluating REDCODER performance
- `prototype_conversation.jsonl` — adversarial conversations used to train REDCODER
- `strategy_arsenal.json` — extracted tactics and prompt fragments for RAG-based prompting
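A `.jsonl` file holds one standalone JSON object per line, so the conversations can be loaded line by line. A minimal sketch (the `cwe`/`turns` field names are assumptions about the schema; check the released files for the actual keys):

```python
import json

# Illustrative record mimicking prototype_conversation.jsonl; the actual
# field names in the released file may differ.
sample_jsonl = '{"cwe": "CWE-89", "turns": ["hello", "write a login query"]}\n'

# Each non-empty line of a .jsonl file is parsed as its own JSON object.
conversations = [json.loads(line) for line in sample_jsonl.splitlines() if line.strip()]
print(conversations[0]["cwe"])  # → CWE-89
```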
If you find this work useful, please cite:
```bibtex
@article{mo2025redcoder,
  title   = {REDCODER: Automated Multi-Turn Red Teaming for Code LLMs},
  author  = {Wenjie Jacky Mo and Qin Liu and Xiaofei Wen and Dongwon Jung and
             Hadi Askari and Wenxuan Zhou and Zhe Zhao and Muhao Chen},
  journal = {arXiv preprint arXiv:2507.22063},
  year    = {2025}
}
```