NER: some issues in PyTorch Lightning example #3159

@stefan-it

Hi,

I wanted to try out the new NER example script (./ner/run_pl_ner.py) that uses PyTorch Lightning.

Here are some bugs that I've found:

The dataset preparation method is not called. Usually, InputBatch batches or input features are written to and stored in a file. However, the prepare_data() [1] method is never called, so no input features are written. I worked around this by calling prepare_data() from the train_dataloader() [2] function, but I'm not sure that is the right place.

With that change, model training works.
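To illustrate the workaround described above, here is a minimal sketch that does not depend on PyTorch Lightning. The class name ToyNERModule, the cache file name, and the toy feature values are all hypothetical and are not taken from the example script; only the method names prepare_data() and train_dataloader() come from the issue.

```python
import os
import pickle
import tempfile


class ToyNERModule:
    """Hypothetical stand-in for the example's module, not the real class."""

    def __init__(self, data_dir):
        # Assumed cache location; the real script chooses its own file name.
        self.cache_path = os.path.join(data_dir, "cached_train_features.pkl")

    def prepare_data(self):
        # Build the input features once and store them on disk.
        if not os.path.exists(self.cache_path):
            features = [{"input_ids": [101, 7592, 102], "label_ids": [0, 1, 0]}]
            with open(self.cache_path, "wb") as f:
                pickle.dump(features, f)

    def train_dataloader(self):
        # Workaround: make sure the feature file exists before reading it.
        self.prepare_data()
        with open(self.cache_path, "rb") as f:
            return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    module = ToyNERModule(tmp)
    batches = module.train_dataloader()
    cache_written = os.path.exists(module.cache_path)
```

Because prepare_data() checks for the file first, calling it from train_dataloader() is harmless when the features already exist. Whether this is the right hook in the actual script is still an open question.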

Evaluation currently does not work correctly. The checkpoint files are written with these names:

# ls
'checkpointepoch=0.ckpt'  'checkpointepoch=1.ckpt'  'checkpointepoch=2.ckpt'

So the pattern checkpointepoch=<number_epoch>.ckpt is used, whereas the main script expects the checkpoint pattern checkpoint_<number_epoch>.ckpt [3].
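One possible way to make the lookup tolerant of both naming schemes is sketched below. The helper sort_checkpoints and the regular expression are hypothetical and are not part of the example script; they only show that a single pattern can match both checkpointepoch=<number_epoch>.ckpt and checkpoint_<number_epoch>.ckpt and order the files by epoch number.

```python
import re

# Matches both "checkpoint_<n>.ckpt" and "checkpointepoch=<n>.ckpt".
CHECKPOINT_RE = re.compile(r"^checkpoint(?:_|epoch=)(\d+)\.ckpt$")


def sort_checkpoints(filenames):
    """Return the checkpoint file names sorted by epoch number."""
    found = []
    for name in filenames:
        match = CHECKPOINT_RE.match(name)
        if match:
            found.append((int(match.group(1)), name))
    # Sort numerically so that epoch 10 comes after epoch 2.
    return [name for _, name in sorted(found)]


names = [
    "checkpointepoch=2.ckpt",
    "checkpointepoch=0.ckpt",
    "checkpointepoch=1.ckpt",
    "notes.txt",
]
ordered = sort_checkpoints(names)
```

In practice the list of names could come from os.listdir() on the output directory. Sorting on the parsed integer avoids the lexicographic ordering problem a plain glob would have once there are ten or more epochs.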

[1] https://github.com/huggingface/transformers/blob/master/examples/ner/run_pl_ner.py#L56-L80
[2] https://github.com/huggingface/transformers/blob/master/examples/ner/transformer_base.py#L126-L139
[3] https://github.com/huggingface/transformers/blob/master/examples/ner/run_pl_ner.py#L220
