Hi,
I wanted to try out the new NER example script (./ner/run_pl_ner.py) that uses PyTorch Lightning.
Here are some bugs that I've found:
The dataset preparation method is never called. Normally the input features (InputBatch batches) are computed once and stored in a file. However, prepare_data() [1] is never invoked, so no input features are written. I worked around this by calling prepare_data() from the train_dataloader() [2] function, but I'm not sure that is the right place.
With that change, model training works.
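To make the workaround concrete, here is a minimal sketch of the idea. It does not use PyTorch Lightning or the actual classes from the repository: NERModuleSketch, its attributes, and the placeholder batches are all hypothetical names made up for illustration. The guard that avoids preparing the data twice is also my own assumption, not something taken from the example script.

```python
class NERModuleSketch:
    """Hypothetical stand-in for the example's Lightning module (not the real class)."""

    def __init__(self):
        self._features_ready = False  # assumption: tracks whether features were written
        self.prepare_calls = 0        # counter kept only so the behaviour can be checked

    def prepare_data(self):
        # In the real script this step would compute the input features
        # and write them to a cache file.
        self.prepare_calls += 1
        self._features_ready = True

    def train_dataloader(self):
        # Workaround described above: make sure the features exist before
        # trying to load them. The guard prevents repeated preparation.
        if not self._features_ready:
            self.prepare_data()
        return ["batch-0", "batch-1"]  # placeholder for a real DataLoader


module = NERModuleSketch()
module.train_dataloader()
module.train_dataloader()
print(module.prepare_calls)
```

Whether train_dataloader() is the right hook for this is exactly the open question; the sketch only shows that the call happens before any features are read.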
Evaluation does not currently work correctly. The checkpoint files are written with these names:
# ls
'checkpointepoch=0.ckpt' 'checkpointepoch=1.ckpt' 'checkpointepoch=2.ckpt'
So the pattern checkpointepoch=<number_epoch>.ckpt is used, whereas the main script expects an output checkpoint pattern of checkpoint_<number_epoch>.ckpt [3].
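The mismatch can be reproduced without running any training. The sketch below uses the three file names from the listing and checks them against two glob-style patterns. Both patterns are my assumptions: expected_pattern is a glob rendering of the checkpoint_<number_epoch>.ckpt shape the main script is described as expecting, and tolerant_pattern is just one possible looser alternative, not a fix taken from the repository.

```python
import fnmatch

# File names copied from the directory listing in this report.
produced = [
    "checkpointepoch=0.ckpt",
    "checkpointepoch=1.ckpt",
    "checkpointepoch=2.ckpt",
]

# Assumed glob form of the shape the main script looks for.
expected_pattern = "checkpoint_*.ckpt"
# Hypothetical looser pattern that would accept both naming shapes.
tolerant_pattern = "checkpoint*.ckpt"

# fnmatch.filter keeps only the names that match the given pattern.
matched_expected = sorted(fnmatch.filter(produced, expected_pattern))
matched_tolerant = sorted(fnmatch.filter(produced, tolerant_pattern))

print("expected pattern matches:", matched_expected)
print("tolerant pattern matches:", matched_tolerant)
```

The expected pattern matches none of the files because the produced names have no underscore after "checkpoint", so evaluation finds no checkpoints to load.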
[1] https://github.com/huggingface/transformers/blob/master/examples/ner/run_pl_ner.py#L56-L80
[2] https://github.com/huggingface/transformers/blob/master/examples/ner/transformer_base.py#L126-L139
[3] https://github.com/huggingface/transformers/blob/master/examples/ner/run_pl_ner.py#L220