Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images
Xiangyu Sun*, Haoyi Jiang*, Liu Liu, Seungtae Nam, Gyeongjin Kang, Xinjie Wang,
Wei Sui, Zhizhong Su, Wenyu Liu, Xinggang Wang, Eunbyung Park
-
Clone the repository (with submodules):
git clone --recurse-submodules https://github.com/HorizonRobotics/Uni3R
-
Create and activate conda environment:
conda create -n uni3r python=3.10
conda activate uni3r
-
Install PyTorch and related packages:
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
conda install pytorch-cluster pytorch-scatter pytorch-sparse -c pyg -y
-
Install other Python dependencies:
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
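After installing the dependencies, a quick sanity check can catch a broken environment before training. The sketch below only verifies that the core packages from the steps above are importable; the package list is inferred from the install commands and may not be exhaustive.

```python
import importlib.util

def check_packages(names):
    """Return {package_name: True/False} for importability in the current env."""
    return {n: importlib.util.find_spec(n) is not None for n in names}

# Package names assumed from the conda/pip install steps above.
required = ["torch", "torchvision", "torchaudio",
            "torch_cluster", "torch_scatter", "torch_sparse", "flash_attn"]
for pkg, ok in check_packages(required).items():
    print(f"{pkg}: {'ok' if ok else 'MISSING'}")
```

Any `MISSING` entry means the corresponding install step above did not complete in the active `uni3r` environment.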
-
Install 3D Gaussian Splatting modules:
pip install submodules/3d_gaussian_splatting/diff-gaussian-rasterization
pip install submodules/3d_gaussian_splatting/simple-knn
-
Install OpenAI CLIP:
pip install git+https://github.com/openai/CLIP.git
-
Build croco model:
cd submodules/dust3r/croco/models/curope
python setup.py build_ext --inplace
cd ../../../../..
-
Download pre-trained models:
The following three model weights need to be downloaded:
# 1. Create directory for checkpoints
mkdir -p checkpoints/pretrained_models
# 2. LSEG demo model weights
gdown 1FTuHY1xPUkM-5gaDtMfgCl3D0gR89WV7 -O checkpoints/pretrained_models/demo_e200.ckpt
# 3. Uni3R final checkpoint
TODO
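Downloads via `gdown` can fail silently (e.g., quota errors that leave an empty file), so it may help to confirm the checkpoint landed before training. This is a sketch that only checks the LSEG weights, since the path of the final Uni3R checkpoint is not yet published:

```shell
# Verify that each expected checkpoint exists and is non-empty.
# Paths assumed from the download steps above.
for ckpt in checkpoints/pretrained_models/demo_e200.ckpt; do
  if [ -s "$ckpt" ]; then
    echo "OK: $ckpt"
  else
    echo "MISSING or empty: $ckpt"
  fi
done
```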
-
For training: the model can be trained on the ScanNet and ScanNet++ datasets.
- Both datasets require signing agreements to access.
- Detailed data preparation instructions are available in data_process/data.md.
-
For testing: Refer to data_process/data.md for details on the test dataset.
After preparing the datasets, you can train the model using the following command:
bash scripts/train.sh
The training results will be saved to SAVE_DIR. By default, it is set to checkpoints/output.
Optional parameters in scripts/train.sh:
# Directory to save training outputs
--output_dir "checkpoints/output"
Run the following scripts to evaluate on the ScanNet dataset with 2, 8, and 16 views:
bash scripts/test.sh
bash scripts/test_8views.sh
bash scripts/test_16views.sh
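To sweep all three view settings in one go, the scripts above can be chained; this is a convenience sketch, not part of the repository:

```shell
# Run all three evaluation settings in sequence.
# Script paths assumed from the evaluation steps above.
for s in scripts/test.sh scripts/test_8views.sh scripts/test_16views.sh; do
  echo "=== running $s ==="
  bash "$s" || echo "warning: $s exited nonzero"
done
```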
- Release inference code.
- Release the 2-, 8-, and 16-view checkpoints.
- Release the training code w/ geometric loss.
- Verify the 2-view and multi-view training code.
This work builds on many amazing research works and open-source projects; many thanks to all the authors for sharing!
- Gaussian-Splatting and diff-gaussian-rasterization
- DUSt3R
- Language-Driven Semantic Segmentation (LSeg)
- LSM
If you find our work useful in your research, please consider giving a star ⭐ and citing the following paper 📝.
@misc{sun2025uni3runified3dreconstruction,
title={Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images},
author={Xiangyu Sun and Haoyi Jiang and Liu Liu and Seungtae Nam and Gyeongjin Kang and Xinjie Wang and Wei Sui and Zhizhong Su and Wenyu Liu and Xinggang Wang and Eunbyung Park},
year={2025},
eprint={2508.03643},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.03643},
}