Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models


A novel self-backtracking method for improving language model reasoning.

(Figure: overview of the Self-Backtracking method)

Overview

This repository implements Self-Backtracking, a method that equips LLMs with the ability to backtrack during both training and inference. This mechanism improves not only reasoning ability but also efficiency, by transforming slow-thinking processes into fast thinking through self-improvement.
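To give a rough sense of the search involved, the snippet below is a simplified symbolic sketch (not the repository's implementation): it solves a Countdown instance by depth-first search with backtracking, combining two numbers with an arithmetic operation, recursing, and stepping back to try a different operation whenever a branch cannot reach the target. Self-Backtracking trains the language model itself to decide when to take such a step back, rather than relying on an external solver.

from itertools import combinations

def countdown(numbers, target):
    # Depth-first search with backtracking over Countdown steps:
    # combine two numbers, recurse on the reduced list, and undo the
    # choice (backtrack) when the branch cannot reach the target.
    # Here all numbers must be used; the dataset's exact rules may differ.
    if len(numbers) == 1:
        return [] if numbers[0] == target else None
    for (i, a), (j, b) in combinations(enumerate(numbers), 2):
        rest = [x for k, x in enumerate(numbers) if k not in (i, j)]
        candidates = [(a + b, f"{a}+{b}"), (a * b, f"{a}*{b}"),
                      (a - b, f"{a}-{b}"), (b - a, f"{b}-{a}")]
        if b != 0 and a % b == 0:
            candidates.append((a // b, f"{a}/{b}"))
        if a != 0 and b % a == 0:
            candidates.append((b // a, f"{b}/{a}"))
        for value, expr in candidates:
            steps = countdown(rest + [value], target)
            if steps is not None:        # this branch reached the target
                return [f"{expr}={value}"] + steps
            # otherwise: backtrack and try the next operation or pair
    return None

print(countdown([8, 3, 2, 1], 24))  # prints a valid sequence of steps reaching 24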

Dataset and Model

The project uses the Countdown dataset, which is pre-constructed and available on Hugging Face. We have also open-sourced our trained model, which is based on Llama-3.2-1B.

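As a quick way to try the released checkpoint outside of eval_search.py, the minimal sketch below loads it with the standard transformers API (the checkpoint name is the default --ckpt value listed in the Inference section). The prompt format and decoding settings used by our evaluation scripts are not reproduced here, so treat this only as a loading sanity check.

from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "yangxw/Llama-3.2-1B-countdown-backtrack"  # default --ckpt in eval_search.py
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt, device_map="auto")

prompt = "..."  # a Countdown problem formatted as in the training data (format not shown here)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))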

Getting Started

Training

To train the model:

CUDA_VISIBLE_DEVICES=0 python train.py \
    --config ../configs/sft.conf

You can change the parameters in the configs/sft.conf file.

If you want to use multiple GPUs:

accelerate launch \
    --config_file ../configs/accelerate.yaml \
    train.py \
    --config ../configs/sft.conf

Inference

To run inference with our self-backtracking method, use the following command:

CUDA_VISIBLE_DEVICES=0 python eval_search.py \
    --num 5000 \
    --ckpt [your_model_ckpt] \
    --data [val/val_new] \
    --decoder self_backtrack \
    --b 1 \
    --n 32

--ckpt defaults to yangxw/Llama-3.2-1B-countdown-backtrack, our trained model available on Hugging Face.

Self-Improvement

To further improve the model, you can run the following command:

CUDA_VISIBLE_DEVICES=0 python train_self_improvement.py \
    --num 5000 \
    --past_model [your_model_ckpt] \
    --data [val/val_new]

Results

(Figure: experimental results)

Citation

If you use this work, please cite it as follows:

@article{selfbacktracking,
  title={Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models},
  author={Xiao-Wen Yang and Xuan-Yi Zhu and Wen-Da Wei and Ding-Chu Zhang and Jie-Jing Shao and Zhi Zhou and Lan-Zhe Guo and Yu-Feng Li},
  journal={arXiv preprint arXiv:2502.04404},
  year={2025}
}
