This repository provides an implementation for solving the ARC-AGI-2 public evaluation set with Multi-LLM AB-MCTS, leveraging frontier LLMs. It is powered by TreeQuest. See our blog post and our paper for details.
1. Clone the repository with its submodules:

   ```shell
   git clone --recurse-submodules https://github.com/SakanaAI/ab-mcts-arc2.git
   cd ab-mcts-arc2
   ```
2. Install `parallel`, `graphviz`, and `graphviz-dev`:

   For Mac users:

   ```shell
   brew install parallel graphviz
   ```

   For Linux users, use your distribution's package manager, e.g.:

   ```shell
   sudo apt install parallel graphviz graphviz-dev
   ```
3. Install dependencies using `uv`:

   ```shell
   uv sync
   ```

   If you are a Mac user and encounter an error during `pygraphviz` installation, set the environment variables accordingly:

   ```shell
   CFLAGS="-I $(brew --prefix graphviz)/include" LDFLAGS="-L $(brew --prefix graphviz)/lib" uv sync
   ```
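Before running experiments, you may want to sanity-check that the required command-line tools are on your `PATH`. This is a hypothetical check, not part of the repository's scripts; `dot` is the Graphviz executable:

```shell
# Report whether each required tool is installed (informational only).
for tool in parallel dot uv; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```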
The experiments can be run using the `run_experiments.sh` script, which executes the ARC-AGI-2 problems in parallel:

```shell
./scripts/run_experiments.sh
```

Key parameters (you can modify these directly in `run_experiments.sh`):
- `EXP_ID`: Experiment name you can configure to distinguish several experiments
- `MAX_NUM_NODES`: Maximum number of nodes to expand in the search tree
- `ALGO_CLASS_NAME`: Algorithm class to use (default: `ABMCTSA`)
- `DIST_TYPE`: Distribution type for the algorithm (default: `beta`)
- `N_JOBS`: Number of parallel jobs to run
- `INDICES_FILE`: Path to a text file listing the task IDs to be solved in this experiment
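For example, to run only a subset of tasks, you could create a task list and point `INDICES_FILE` at it. This is a hypothetical sketch: the path and task IDs below are placeholders, not real ARC-AGI-2 task IDs, and it assumes the file format is plain text with one ID per line:

```shell
# Create a hypothetical task list (one placeholder task ID per line).
mkdir -p task_lists
printf '%s\n' 0a1b2c3d 4e5f6a7b > task_lists/subset.txt
cat task_lists/subset.txt
```

You would then set `INDICES_FILE=task_lists/subset.txt` in `run_experiments.sh`.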
Each experiment attempts to solve a problem from the ARC-AGI-2 benchmark. It uses LLMs to generate solutions, which are later evaluated to produce the final results. You can see the logs in the `outputs` directory.
After running the experiments, you can evaluate the results using:
```shell
./scripts/run_eval.sh
```
This script:
- Processes the experimental results using `eval/proc_results.py`
- Generates visualization plots using `eval/visualize.py`
The final result plots are generated in the `outputs/plots` directory.
The LLM configuration file is located at `experiments/arc2/configs/config.yaml`. Multi-LLM AB-MCTS uses the LLMs listed in this file, applying the specified temperature setting for each.
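The real schema is defined by the repository; as a purely illustrative sketch, such a file might pair model names with temperatures along these lines (all field names and model names below are assumptions, not the actual contents of `config.yaml`):

```yaml
# Hypothetical sketch only - see experiments/arc2/configs/config.yaml
# for the real schema and the models actually used.
llms:
  - model: gpt-4o          # illustrative model name
    temperature: 0.7
  - model: gemini-1.5-pro  # illustrative model name
    temperature: 0.7
```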
- `experiments/arc2`
  - `run.py` - The main script that leverages TreeQuest to generate answers for ARC-AGI-2 problems.
  - `prompt.py` - Contains the prompts used to instruct LLMs to solve ARC-AGI-2 problems or refine existing answers based on feedback.
- `eval`
  - `proc_results.py` - An evaluation script that processes the generated answers to produce the final results.
  - `visualize.py` - A visualization script for generating result plots.
If you use this code in your research, please cite our paper:
```bibtex
@article{inoue2025wider,
  title={Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search},
  author={Inoue, Yuichi and Misaki, Kou and Imajuku, Yuki and Kuroki, So and Nakamura, Taishi and Akiba, Takuya},
  journal={arXiv preprint arXiv:2503.04412},
  year={2025}
}
```