I am Weihao Zeng, a PhD student at the Hong Kong University of Science and Technology (starting Fall 2025), supervised by Prof. Junxian He.
My research focuses on the post-training of LLMs, including:
- Improving model reasoning capabilities using reinforcement learning (RL) / self-evolution techniques (SimpleRL, B-STaR)
- Exploring efficient data engineering methods for post-training (Deita, Auto Evol-Instruct)
- Applying LLMs to task-oriented dialogue systems (FutureTOD, Seen2UnSeen)
Feel free to email me about any form of academic collaboration: [email protected]
- 2025-03: We introduce SimpleRL-Zoo, a deep investigation of zero RL training across diverse model families and sizes! SimpleRL-Zoo Twitter
- 2025-01: Announcing our latest effort on O1/R1-style models and scalable reinforcement learning for LLM reasoning! SimpleRL Twitter
- 2025-01: Our B-STaR has been accepted by ICLR 2025!
- 2024-09: Our Auto Evol-Instruct has been accepted by EMNLP 2024!
- 2024-01: Our Deita has been accepted by ICLR 2024!
- 2023-05: Two papers have been accepted by ACL 2023!
- SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
  Weihao Zeng*, Yuzhen Huang*, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He
  Preprint · SimpleRL-Zoo GitHub
- 7B Model and 8K Examples: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient
  Weihao Zeng*, Yuzhen Huang*, Wei Liu, Keqing He, Qian Liu, Zejun Ma, Junxian He
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Weihao Zeng*, Yuzhen Huang*, Lulu Zhao, Yijun Wang, Zifei Shan, Junxian He
  ICLR 2025 · paper
- FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue
  Weihao Zeng, Keqing He, Yejie Wang, Chen Zeng, Jingang Wang, Yunsen Xian, Weiran Xu
  ACL 2023 Main Conference · paper
- Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation
  Weihao Zeng, Lulu Zhao, Keqing He, Ruotong Geng, Jingang Wang, Wei Wu, Weiran Xu
  ACL 2023 Main Conference · paper
- What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
  Wei Liu*, Weihao Zeng*, Keqing He, Yong Jiang, Junxian He
  ICLR 2024 · paper
- Automatic Instruction Evolving for Large Language Models
  Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen
  EMNLP 2024 · paper
Full Publications on Google Scholar
- Apr 2025, Qingke Talk, SimpleRL-Zoo and B-STaR: Improving reasoning performance and efficiency through reinforcement learning.
- Mar 2025, Westlake University, SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild.
- Feb 2025, Northwestern University, SimpleRL: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient.
- Feb 2025, TikTok, SimpleRL: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient.
- Feb 2025, Huawei Noah's Ark Lab, SimpleRL: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient.
- National Scholarship in China (2019/2023)
- 2022-09: Won 1st place in Track 2 of the SereTOD Challenge 2022 at EMNLP 2022!
- 2021-09: Placed 8th in the CCIR 2021 Intelligent NLU Challenge!
- 2021-08: Won 4th place in the SMP 2021 Conversational AI Challenge!