Why did run_squad.py not reproduce the reported SQuAD 1.1 result? #4301

@yyHaker

📚 Migration

Information

Model I am using (Bert, XLNet ...): Bert (bert-large-uncased-whole-word-masking)

Language I am using the model on (English...): English

The problem arises when using:

  • the official example scripts: (give details below)
    examples/question-answering/run_squad.py
  • my own modified scripts: (give details below)
    '''
    CUDA_VISIBLE_DEVICES=5 python examples/question-answering/run_squad.py \
      --model_type bert \
      --model_name_or_path bert-large-uncased-whole-word-masking \
      --do_train \
      --do_eval \
      --data_dir EKMRC/data/squad1.1 \
      --train_file train-v1.1.json \
      --predict_file dev-v1.1.json \
      --per_gpu_eval_batch_size=4 \
      --per_gpu_train_batch_size=4 \
      --gradient_accumulation_steps=6 \
      --save_steps 3682 \
      --learning_rate 3e-5 \
      --num_train_epochs 2 \
      --max_seq_length 384 \
      --doc_stride 128 \
      --output_dir result/debug_squad/wwm_uncased_bert_large_finetuned_squad/ \
      --overwrite_output_dir
    '''
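For reference, a minimal sketch of the effective training batch size these flags imply. It assumes that only one GPU is visible to the script (because of `CUDA_VISIBLE_DEVICES=5`) and that the per-GPU batch size is multiplied by the GPU count and the accumulation steps; the helper name `effective_batch_size` is hypothetical and not part of `run_squad.py`.

```python
def effective_batch_size(per_gpu_batch_size: int, n_gpu: int, accumulation_steps: int) -> int:
    """Number of examples that contribute to one optimizer update."""
    # max(1, n_gpu) keeps the formula valid for a CPU-only run (n_gpu == 0).
    return per_gpu_batch_size * max(1, n_gpu) * accumulation_steps


# Values taken from the command above; n_gpu=1 is an assumption, see the note above.
print(effective_batch_size(per_gpu_batch_size=4, n_gpu=1, accumulation_steps=6))  # 24
```

If the reported result was trained with a different effective batch size, that would be one setting worth comparing.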

The task I am working on is:

  • an official GLUE/SQuAD task: SQuAD 1.1

Details

I could not reproduce the reported result. The repository reports the following result:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 86.91579943235573, "f1": 93.1532499015869}

My result is below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 81.03, "f1": 88.02}
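To quantify the difference, here is a small sketch that subtracts my scores from the reported ones. The two dictionaries are copied from the outputs above; the helper name `score_gap` is only illustrative.

```python
# Scores copied from the two evaluate-v1.1.py outputs above.
reported = {"exact_match": 86.91579943235573, "f1": 93.1532499015869}
mine = {"exact_match": 81.03, "f1": 88.02}


def score_gap(reference: dict, observed: dict) -> dict:
    """Per-metric difference (reference minus observed), rounded to 2 decimals."""
    return {name: round(reference[name] - observed[name], 2) for name in reference}


print(score_gap(reported, mine))  # {'exact_match': 5.89, 'f1': 5.13}
```

So the run is about 5.9 points lower on exact match and about 5.1 points lower on F1.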

Environment info

  • transformers version:
  • Platform: Linux gpu19 3.10.0-1062.4.1.el7.x86_64 #1 SMP Fri Oct 18 17:15:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Python version: python3.6
  • PyTorch version (GPU?): 1.4.0
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: parallel
  • pytorch-transformers or pytorch-pretrained-bert version (or branch):
    current version of transformers.

Checklist
