gems-uff/sbcr_study


Replication Package for "LLM-based vs. Search-based Merge Conflict Resolution: An Empirical Study of Competing Paradigms"

This repository contains all the necessary artifacts to replicate the experiments and analyses presented in our paper. It includes the source code for both the SBCR and MergeGen approaches, as well as the scripts required to prepare the datasets, run the experiments, and generate the results.


🔬 Replication Options

We provide two ways to engage with our research artifacts:

  1. Use Pre-processed Data (Recommended for Analysis): If you are primarily interested in analyzing our results or using the final datasets, you can download them directly from our archival repositories. This is the fastest way to get started.
  2. Full Replication (From Scratch): If you wish to replicate our entire experimental process, from data pre-processing to running the tools and analyzing the output, follow the detailed instructions below.

💾 Archival Repositories (Data and Results)

To facilitate reproducibility and further inspection, we have archived our datasets and full experimental results on FigShare.

  • Pre-processed Datasets: This repository contains the final, clean datasets used directly in our experiments. This is ideal for researchers who want to bypass the initial data pre-processing steps.

    Link: https://figshare.com/s/d196f4ccb3ef34d2e770

  • Full Experimental Results Archive: This repository contains a complete snapshot of our experimental run, including all intermediate files, execution logs, every candidate generated for each conflict, and the trained models produced by MergeGen. This is useful for a deep inspection of all generated artifacts.

    Link: https://figshare.com/s/b3cdd351d077a9b08121


⚙️ Full Replication Instructions (From Scratch)

Follow these steps to set up the environment and run the entire experimental pipeline.

Step 1: Environment Setup

First, create and activate a new Conda environment with the required dependencies.

# Create a new conda environment using Python 3.8
conda create -n sbcr_study python=3.8

# Activate the environment
conda activate sbcr_study

# Install the required packages
python -m pip install -r requirements.txt
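Before installing anything heavy, it can help to confirm that the activated environment really exposes Python 3.8. The following is a minimal sketch, not part of the repository: `version_matches` is a hypothetical helper that compares the output of `python --version` against an expected major.minor prefix.

```shell
# Hypothetical helper (not shipped with this repository).
# $1: expected major.minor version, e.g. "3.8"
# $2: full version string as printed by `python --version`, e.g. "Python 3.8.18"
version_matches() {
    case "$2" in
        "Python $1."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the interpreter that is currently first on PATH.
if version_matches 3.8 "$(python --version 2>&1)"; then
    echo "Python 3.8 detected."
else
    echo "The active interpreter is not Python 3.8; is the sbcr_study environment activated?" >&2
fi
```

The trailing dot in the pattern keeps a version such as 3.80 from being mistaken for 3.8.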

Step 2: Data Preparation

Run the following script to download the original datasets and pre-process them into the format required for the experiments.

# This script will download and prepare the datasets
./prepare_dataset.sh
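All pipeline steps are invoked as `./script.sh`, which requires the scripts to exist in the current directory and carry the executable bit. As a sketch under that assumption, the hypothetical helper `check_scripts` below reports any script that is missing or not executable before a long run is started.

```shell
# Hypothetical pre-flight helper (not shipped with this repository).
# Prints every argument that is missing or not executable; returns 1 if any.
check_scripts() {
    rc=0
    for script in "$@"; do
        if [ ! -x "$script" ]; then
            echo "Not executable or missing: $script" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example, run from the repository root:
# check_scripts ./prepare_dataset.sh ./train_all_models.sh ./test_all_models.sh \
#     ./tunning_sbcr.sh ./evaluate_sbcr.sh
```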

Step 3: Run the Experiments

This step involves training the MergeGen models and running both MergeGen and SBCR on the prepared datasets.

# 3.1 Train the MergeGen models for each dataset
# Note: This action can take a very long time, depending on your hardware.
./train_all_models.sh

# 3.2 Run MergeGen to generate resolution candidates for each conflict
./test_all_models.sh

# 3.3 Run the parameter tuning process for SBCR
./tunning_sbcr.sh

# 3.4 Run the final evaluation of SBCR with the tuned parameters
./evaluate_sbcr.sh
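Because each step depends on the output of the previous one and training can run for a long time, it may be convenient to chain the four scripts so that the run stops at the first failure. The following is a minimal sketch; `run_steps` is a hypothetical wrapper, and only the script names in the commented usage line come from this repository.

```shell
# Hypothetical wrapper (not shipped with this repository).
# Runs each argument as a command, in order, and stops at the first failure.
run_steps() {
    for step in "$@"; do
        echo "=== Running $step ==="
        if ! "$step"; then
            echo "Step failed: $step" >&2
            return 1
        fi
    done
    echo "All steps completed."
}

# Example, run from the repository root:
# run_steps ./train_all_models.sh ./test_all_models.sh ./tunning_sbcr.sh ./evaluate_sbcr.sh
```

Stopping early avoids running SBCR tuning or evaluation against incomplete MergeGen output.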

Step 4: Analyze the Results

After the experiments are complete, run the following scripts to extract the similarity scores and generate the statistics presented in the paper.

# 4.1 Extract similarities for the candidates generated by MergeGen
./extract_all_mergeGen_similarities.sh

# 4.2 Collect and summarize statistics for all datasets and results
./collect_dataset_stats.sh

The analysis notebooks are located in the analysis folder. They can be used to generate the figures and tables from the paper.


📜 Citation

If you use the artifacts from this repository in your research, please cite our paper (to appear).
