A novel self-backtracking method for improving language model reasoning.
This repository implements Self-BackTracking, a method that equips LLMs with the ability to backtrack during both training and inference. This mechanism not only enhances reasoning ability but also improves efficiency by transforming slow-thinking processes into fast-thinking ones through self-improvement.
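To make the idea concrete, here is a minimal sketch of backtracking during search. This is not the repository's implementation: in the actual method the model itself learns when to step back, whereas here that decision is mocked by a simple `should_backtrack` rule on a toy problem.

```python
def self_backtrack_search(start, expand, should_backtrack, is_goal, max_steps=1000):
    """Depth-first search over partial solutions that can step back.

    expand(state)         -> candidate continuations of a partial solution
    should_backtrack(state) -> True when the current path looks unpromising
                              (a stand-in for the model's learned backtrack signal)
    is_goal(state)        -> True when the solution is complete
    """
    stack = [(start, iter(expand(start)))]
    for _ in range(max_steps):
        if not stack:
            return None                      # search space exhausted
        state, alts = stack[-1]
        if is_goal(state):
            return state
        if should_backtrack(state):
            stack.pop()                      # step back to the previous partial solution
            continue
        nxt = next(alts, None)
        if nxt is None:
            stack.pop()                      # no untried continuations left here
        else:
            stack.append((nxt, iter(expand(nxt))))
    return None

# Toy instance: extend a tuple with digits from {3, 5, 7}; back off when the
# running sum overshoots the target.
TARGET = 12
def expand(state):
    return [state + (d,) for d in (3, 5, 7)]
def should_backtrack(state):
    return sum(state) > TARGET
def is_goal(state):
    return bool(state) and sum(state) == TARGET

result = self_backtrack_search((), expand, should_backtrack, is_goal)
```

The key design point is that backtracking is a first-class action in the search loop, so a bad partial path is abandoned rather than extended.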
The project utilizes the Countdown dataset, which is pre-constructed and accessible on Hugging Face. Additionally, we have open-sourced our trained model based on Llama-3.2-1B.
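For readers unfamiliar with the task: a Countdown instance asks for an arithmetic expression that combines the given numbers with +, -, *, / to reach a target value. The brute-force solver below illustrates the problem format; it is an illustrative sketch, not code from this repository.

```python
import itertools
import operator

def solve_countdown(nums, target):
    """Brute-force a Countdown instance: combine all given numbers with the
    four basic operations to hit the target. Returns an expression string,
    or None if no combination works."""
    ops = [('+', operator.add), ('-', operator.sub),
           ('*', operator.mul), ('/', operator.truediv)]

    def search(items):  # items: list of (value, expression-string) pairs
        if len(items) == 1:
            val, expr = items[0]
            return expr if abs(val - target) < 1e-6 else None
        # pick an ordered pair (order matters for - and /), combine, recurse
        for i, j in itertools.permutations(range(len(items)), 2):
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            (a, ea), (b, eb) = items[i], items[j]
            for sym, fn in ops:
                if sym == '/' and abs(b) < 1e-6:
                    continue  # avoid division by zero
                found = search(rest + [(fn(a, b), f"({ea}{sym}{eb})")])
                if found:
                    return found
        return None

    return search([(float(n), str(n)) for n in nums])

expr = solve_countdown([3, 5, 7], 36)  # e.g. some expression evaluating to 36
```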
To train the model:
```bash
CUDA_VISIBLE_DEVICES=0 python train.py \
    --config ../configs/sft.conf
```
You can change the parameters in the configs/sft.conf file.
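For orientation, an SFT config of this kind typically sets the base model, optimizer hyperparameters, and batch size. The fragment below is purely illustrative; the actual key names and values are the ones defined in configs/sft.conf.

```
# Hypothetical sft.conf fragment -- key names are illustrative assumptions,
# not the repository's actual options.
model_name = meta-llama/Llama-3.2-1B
learning_rate = 1e-5
num_epochs = 3
per_device_batch_size = 8
```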
If you want to use multiple GPUs:
```bash
accelerate launch \
    --config_file ../configs/accelerate.yaml \
    train.py \
    --config ../configs/sft.conf
```
To run inference with our self-backtracking method, use the following command:
```bash
CUDA_VISIBLE_DEVICES=0 python eval_search.py \
    --num 5000 \
    --ckpt [your_model_ckpt] \
    --data [val/val_new] \
    --decoder self_backtrack \
    --b 1 \
    --n 32
```
The `--ckpt` argument defaults to `yangxw/Llama-3.2-1B-countdown-backtrack`, our trained model available on Hugging Face.
To further improve the model, you can run the following command:
```bash
CUDA_VISIBLE_DEVICES=0 python train_self_improvement.py \
    --num 5000 \
    --past_model [your_model_ckpt] \
    --data [val/val_new]
```
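Conceptually, this self-improvement step samples candidate solutions from the current model, keeps only those a verifier accepts, and reuses them as new fine-tuning data. The sketch below illustrates that data-filtering loop with stand-in functions; it is an assumption-laden simplification, not the logic of train_self_improvement.py.

```python
def build_improvement_set(problems, sample_fn, verify_fn, n_samples=4):
    """Collect verified (problem, solution) pairs for the next training round.

    sample_fn(problem) -> one candidate solution (stand-in for model sampling)
    verify_fn(problem, candidate) -> True if the candidate actually solves it
    """
    new_data = []
    for prob in problems:
        for cand in (sample_fn(prob) for _ in range(n_samples)):
            if verify_fn(prob, cand):
                new_data.append((prob, cand))
                break  # one verified solution per problem is enough here
    return new_data

# Toy usage: the "model" doubles the input, and the verifier checks exactly that,
# so every problem yields one verified training pair.
problems = [2, 5, 9]
data = build_improvement_set(problems,
                             sample_fn=lambda p: p * 2,
                             verify_fn=lambda p, c: c == p * 2)
```

Filtering on verified correctness is what lets the model bootstrap from its own outputs without reinforcing wrong answers.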
If you use this work, please cite it as follows:
```bibtex
@article{selfbacktracking,
  title={Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models},
  author={Xiao-Wen Yang and Xuan-Yi Zhu and Wen-Da Wei and Ding-Chu Zhang and Jie-Jing Shao and Zhi Zhou and Lan-Zhe Guo and Yu-Feng Li},
  journal={arXiv preprint arXiv:2502.04404},
  year={2025}
}
```