NER: some issues in PyTorch Lightning example #3159

Hi,

I wanted to try out the new NER example script (`./ner/run_pl_ner.py`) that uses PyTorch Lightning.

Here are some bugs that I've found:

The dataset preparation method is not called. Usually, InputBatch batches or input features are written to and stored in a file. However, the `prepare_data()` [1] method is never called, so no input features are written. I worked around this by calling `prepare_data()` from the `train_dataloader()` [2] function, but I'm not sure that's the right place.

With that change, model training works.
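
To illustrate the workaround, here is a minimal sketch in plain Python. It is not the actual example code: the class `NERModuleSketch`, the `features` attribute and the `prepare_calls` counter are hypothetical stand-ins, and the real `prepare_data()` would tokenize the dataset and cache the features to a file rather than fill a list.

```python
class NERModuleSketch:
    """Hypothetical stand-in for the example's model class (not the real API)."""

    def __init__(self):
        self.features = None    # filled in by prepare_data()
        self.prepare_calls = 0  # only here to show how often preparation runs

    def prepare_data(self):
        # Assumption: the real method builds input features and caches them on disk.
        self.prepare_calls += 1
        self.features = ["feature-0", "feature-1"]

    def train_dataloader(self):
        # Workaround: make sure the features exist before building the loader.
        if self.features is None:
            self.prepare_data()
        return list(self.features)


model = NERModuleSketch()
batches = model.train_dataloader()
model.train_dataloader()  # second call reuses the prepared features
```

The guard keeps preparation from running again on every call, which is one reason I'm unsure `train_dataloader()` is the right place for it.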

Evaluation does not currently work correctly. The checkpoint files are written as:

```bash
# ls
'checkpointepoch=0.ckpt'  'checkpointepoch=1.ckpt'  'checkpointepoch=2.ckpt'
```

So the pattern `checkpointepoch=<number_epoch>.ckpt` is used, whereas the main script expects an output checkpoint pattern of `checkpoint_<number_epoch>.ckpt` [3].
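
The mismatch can be reproduced without running any training. The sketch below assumes the script looks for checkpoints with a glob-style pattern such as `checkpoint_*.ckpt`. That pattern string, along with `checkpointepoch=*.ckpt` and the helper `epoch_of`, is an illustration and not copied from the script.

```python
import fnmatch
import re

# File names as they appear in the output directory (from the listing above).
written = ["checkpointepoch=0.ckpt", "checkpointepoch=1.ckpt", "checkpointepoch=2.ckpt"]

# Assumed patterns: what the main script expects vs. what is actually written.
expected_pattern = "checkpoint_*.ckpt"
actual_pattern = "checkpointepoch=*.ckpt"

found_expected = fnmatch.filter(written, expected_pattern)
found_actual = fnmatch.filter(written, actual_pattern)


def epoch_of(name):
    """Extract the epoch number from a 'checkpointepoch=<n>.ckpt' file name."""
    match = re.search(r"epoch=(\d+)\.ckpt$", name)
    return int(match.group(1))


# Pick the checkpoint from the last epoch by numeric value, not string order.
latest = max(found_actual, key=epoch_of)

print(found_expected)
print(latest)
```

With the expected pattern nothing is matched, so either the lookup pattern or the checkpoint file naming would have to change for evaluation to find a checkpoint.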

[1] https://github.com/huggingface/transformers/blob/master/examples/ner/run_pl_ner.py#L56-L80
[2] https://github.com/huggingface/transformers/blob/master/examples/ner/transformer_base.py#L126-L139
[3] https://github.com/huggingface/transformers/blob/master/examples/ner/run_pl_ner.py#L220
