We work off of the MEMIT codebase, so we'll reference the same installation procedures here:
"We recommend conda
for managing Python, CUDA, and PyTorch; pip
is for everything else. To get started, simply install conda
and run:
CONDA_HOME=$CONDA_HOME ./scripts/setup_conda.sh
$CONDA_HOME
should be the path to your conda
installation, e.g., ~/miniconda3
."
Before ENCORE can be run, some precomputed statistics need to be present in the repository. The statistics for GPT2-XL will be downloaded automatically, but we provide the Llama2-7B and Llama3-8B statistics at this link - Google Drive. Unzip the stats folder and place it inside the /data directory.
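A minimal sketch of placing the downloaded statistics, assuming the Google Drive archive unzips to a folder named stats (the archive name below is a placeholder):

# Unzip the downloaded archive and move the resulting stats folder into data/.
unzip stats.zip
mv stats data/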
To evaluate ENCORE, run the following command:
python experiments/evaluate_unified_editing.py \
--alg_name=ENCORE \
--num_edits=100 \
--model_name=gpt2-xl \
--hparams_fname=gpt2-xl.json \
--ds_name=mcf
The above script can also be used to run ROME and MEMIT, which share a common underlying code-base with ENCORE for calculating the key and value vectors. The update equations for ROME, MEMIT, and EMMET are in the file unified_editing/unified_main.py.
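For instance, switching --alg_name is enough to run MEMIT with the same script (this mirrors the ENCORE command above and assumes the same gpt2-xl hyperparameter file is used):

# Same evaluation script, different editing algorithm.
python experiments/evaluate_unified_editing.py \
--alg_name=MEMIT \
--num_edits=100 \
--model_name=gpt2-xl \
--hparams_fname=gpt2-xl.json \
--ds_name=mcf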
Before running any experiment, you may need to update sys.path.append('/path/to/encore')
in the files 'experiments/evaluate_unified_editing.py', 'experiments/summarize.py' and 'experiments/py/eval_utils_zsre.py' so that it points to your local copy of the repository.
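A quick way to locate the lines that need changing (plain grep, nothing repository-specific):

# Print the line numbers of the sys.path.append calls in the three files.
grep -n "sys.path.append" experiments/evaluate_unified_editing.py experiments/summarize.py experiments/py/eval_utils_zsre.py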
downstream_tasks specifies the downstream tasks to run. Available tasks: nli,rte,mrpc,sentiment_analysis,dialogue,cola,sst
number_of_few_shots is the number of few-shot examples for each downstream task. Specify one value per task, separated by commas; number_of_few_shots must be the same length as downstream_tasks. It defaults to 0 when the flag is not provided.
number_of_tests is the number of test examples used for all downstream tasks. If the flag is not provided, the entire test dataset is used by default.
Example: the following command runs nli, sst, mmlu, mrpc, cola and rte with 4 few-shot examples each, limiting evaluation to 100 test examples:
python experiments/evaluate_unified_editing.py \
--alg_name=ENCORE \
--num_edits=100 \
--model_name=gpt2-xl \
--hparams_fname=gpt2-xl.json \
--ds_name=mcf \
--do_downstream_eval=True \
--downstream_eval_steps=100 \
--downstream_tasks=nli,sst,mmlu,mrpc,cola,rte \
--number_of_few_shots=4,4,4,4,4,4 \
--number_of_tests=100
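Once a run completes, results can be aggregated with experiments/summarize.py. The flags below are assumptions carried over from the upstream MEMIT codebase and the run IDs are illustrative, so check the script's argument parser if they differ:

# Summarize editing runs for the ENCORE algorithm (run IDs are placeholders).
python experiments/summarize.py --dir_name=ENCORE --runs=run_000,run_001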