Description
Hi!
In the config definition https://github.com/huggingface/pytorch-pretrained-BERT/blob/21f0196412115876da1c38652d22d1f7a14b36ff/pytorch_pretrained_bert/modeling.py#L848 in the example usage of BertForSequenceClassification in modeling.py, there are things I don't understand:
- vocab_size is not an accepted parameter name, judging by the BertConfig class definition https://github.com/huggingface/pytorch-pretrained-BERT/blob/21f0196412115876da1c38652d22d1f7a14b36ff/pytorch_pretrained_bert/modeling.py#L70
- Even after changing vocab_size to vocab_size_or_config_json_file, with the other params given in the example, i.e.
  vocab_size=32000, hidden_size=512, num_hidden_layers=8, num_attention_heads=6, intermediate_size=1024
  I get:
  ValueError: The hidden size (512) is not a multiple of the number of attention heads (6)
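To illustrate the second point, here is a minimal sketch of the constraint that the error message describes, assuming the check is a plain divisibility test of hidden_size by num_attention_heads. The helper names check_heads and valid_head_counts are hypothetical and are not part of pytorch_pretrained_bert; this does not reproduce the library's code.

```python
def check_heads(hidden_size: int, num_attention_heads: int) -> int:
    """Return the per-head size, or raise if hidden_size does not split evenly.

    Hypothetical helper: it mirrors the wording of the error quoted above,
    assuming the underlying rule is hidden_size % num_attention_heads == 0.
    """
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            "The hidden size (%d) is not a multiple of the number of attention "
            "heads (%d)" % (hidden_size, num_attention_heads)
        )
    return hidden_size // num_attention_heads


def valid_head_counts(hidden_size: int, max_heads: int = 16) -> list:
    """List the head counts up to max_heads that divide hidden_size evenly."""
    return [h for h in range(1, max_heads + 1) if hidden_size % h == 0]


if __name__ == "__main__":
    # Head counts that would satisfy the constraint for hidden_size=512.
    print(valid_head_counts(512))
    # An evenly dividing choice returns the size of each attention head.
    print(check_heads(512, 8))
    # The values from the docstring example trigger the error.
    try:
        check_heads(512, 6)
    except ValueError as err:
        print(err)
```

Under that assumption, 512 leaves a remainder of 2 when divided by 6, so the example values cannot work as written, whereas a head count such as 8 would pass the check.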
I think that something similar may be true for the other classes as well: BertForQuestionAnswering, BertForNextSentencePrediction, etc.
Am I missing something?