Add T5 to docs #3461
Merged: LysandreJik merged 11 commits into huggingface:master from patrickvonplaten:t5_documentation on Mar 27, 2020.
Commits (11):
- 103ccd3 add t5 docs basis (patrickvonplaten)
- cee6d22 improve docs (patrickvonplaten)
- 425320f add t5 docs (patrickvonplaten)
- 46dec1d improve t5 docstring (patrickvonplaten)
- 01f62a4 add t5 tokenizer docstring (patrickvonplaten)
- 5406717 finish docstring (patrickvonplaten)
- c464697 make style (patrickvonplaten)
- 617900c add pretrained models (patrickvonplaten)
- e56cf8c correct typo (patrickvonplaten)
- 2985c79 make examples work (patrickvonplaten)
- 0de7b45 finalize docs (patrickvonplaten)
New file (69 lines added):

@@ -0,0 +1,69 @@
T5
----------------------------------------------------

**DISCLAIMER:** This model is still a work in progress; if you see something strange,
file a `GitHub Issue <https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`_.

Overview
~~~~~~~~

The T5 model was presented in `Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer <https://arxiv.org/pdf/1910.10683.pdf>`_ by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li and Peter J. Liu.
Here is the abstract:
*Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice.
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format.
Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks.
By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.
To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.*

The authors' code can be found `here <https://github.com/google-research/text-to-text-transfer-transformer>`_.

Tips
~~~~

- T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised
  and supervised tasks, each of which is cast as a sequence-to-sequence task.
  T5 therefore works well on a variety of tasks out-of-the-box when a task-specific prefix is prepended to the input, e.g. *translate English to German: ...* for translation or *summarize: ...* for summarization; a minimal usage sketch follows this list.
  For more information about which prefix to use, it is easiest to look into Appendix D of the `paper <https://arxiv.org/pdf/1910.10683.pdf>`_.
- For sequence-to-sequence generation, it is recommended to use ``T5ForConditionalGeneration.generate()``. This method takes care of feeding the encoded input to the decoder via cross-attention layers and of generating the decoder output auto-regressively.
- T5 uses relative scalar embeddings, so encoder input can be padded on the left as well as on the right.
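
Below is a minimal usage sketch of the prefix mechanism and ``generate()``. It assumes the ``t5-small`` checkpoint name and the ``transformers`` API current at the time of this PR, so treat it as illustrative rather than canonical:

.. code-block:: python

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained('t5-small')
    model = T5ForConditionalGeneration.from_pretrained('t5-small')

    # The task is selected purely by the text prefix prepended to the input.
    input_ids = tokenizer.encode(
        "translate English to German: The house is wonderful.", return_tensors='pt'
    )

    # generate() feeds the encoded input to the decoder via cross-attention
    # and decodes auto-regressively.
    outputs = model.generate(input_ids)
    print(tokenizer.decode(outputs[0]))
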
T5Config
~~~~~~~~

.. autoclass:: transformers.T5Config
    :members:


T5Tokenizer
~~~~~~~~~~~

.. autoclass:: transformers.T5Tokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary


T5Model
~~~~~~~

.. autoclass:: transformers.T5Model
    :members:


T5ForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.T5ForConditionalGeneration
    :members:


TFT5Model
~~~~~~~~~

.. autoclass:: transformers.TFT5Model
    :members:


TFT5ForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFT5ForConditionalGeneration
    :members:
@@ -72,6 +72,10 @@
        Mask to avoid performing attention on padding token indices in input_ids.
        Mask values selected in ``[0, 1]``:
        ``1`` for tokens that are NOT MASKED, ``0`` for MASKED tokens.
    encoder_outputs (:obj:`tuple(torch.FloatTensor)`, `optional`, defaults to :obj:`None`):
        Tuple consists of (`last_hidden_state`, `optional`: `hidden_states`, `optional`: `attentions`).
        `last_hidden_state` of shape :obj:`(batch_size, sequence_length, hidden_size)` is a sequence of hidden states at the output of the last layer of the encoder.
        Used in the cross-attention of the decoder.
    decoder_input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`, defaults to :obj:`None`):
        Provide for translation and summarization training. By default, the model will create this tensor by shifting the input_ids to the right, following the paper.
    decoder_attention_mask (:obj:`torch.BoolTensor` of shape :obj:`(batch_size, tgt_seq_len)`, `optional`, defaults to :obj:`None`):

@@ -972,7 +976,7 @@ def forward(
    Returns:
        :obj:`tuple(torch.FloatTensor)` comprising various elements depending on the configuration (:class:`~transformers.BartConfig`) and inputs:
        loss (:obj:`torch.FloatTensor` of shape :obj:`(1,)`, `optional`, returned when :obj:`label` is provided):
            Classification loss (cross entropy)
        logits (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, config.num_labels)`):
            Classification (or regression if config.num_labels==1) scores (before SoftMax).
        hidden_states (:obj:`tuple(torch.FloatTensor)`, `optional`, returned when ``config.output_hidden_states=True``):
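
To illustrate the documented arguments, here is a hypothetical snippet (not part of this PR) showing `decoder_input_ids` being passed explicitly; if omitted, the model creates them by shifting the inputs to the right, as the docstring above states:

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained('t5-small')
    model = T5ForConditionalGeneration.from_pretrained('t5-small')

    input_ids = tokenizer.encode(
        "translate English to German: The house is wonderful.", return_tensors='pt'
    )
    decoder_input_ids = tokenizer.encode("Das Haus ist wunderbar.", return_tensors='pt')

    # Explicit decoder inputs for translation/summarization training; the
    # encoder output is consumed by the decoder's cross-attention layers.
    outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
    logits = outputs[0]  # models returned plain tuples in this API version
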
@thomwolf @LysandreJik - Could you maybe check the Tips?
I like the tips!