
Conversation

@patrickvonplaten (Contributor) commented Mar 29, 2020

  • Make decoder_input_ids optional when supplying lm_labels for T5ForConditionalGeneration (see the sketch after this list)
  • Add test
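
For reference, a minimal sketch of the call pattern this PR enables, assuming the transformers API of that time (lm_labels was renamed to labels in later releases); the model name and example strings are illustrative only:

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    input_ids = tokenizer.encode(
        "translate English to German: The house is wonderful.", return_tensors="pt"
    )
    lm_labels = tokenizer.encode("Das Haus ist wunderbar.", return_tensors="pt")

    # decoder_input_ids is no longer required: the model derives it internally
    # by right-shifting lm_labels.
    outputs = model(input_ids=input_ids, lm_labels=lm_labels)
    loss = outputs[0]  # first element of the returned tuple is the LM loss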

@patrickvonplaten patrickvonplaten changed the title [T5] make decoder input ids optional for t5 training [WIP] [T5] make decoder input ids optional for t5 training Mar 29, 2020
@codecov-io commented Mar 30, 2020

Codecov Report

Merging #3521 into master will increase coverage by 0.00%.
The diff coverage is 95.23%.


@@           Coverage Diff           @@
##           master    #3521   +/-   ##
=======================================
  Coverage   77.80%   77.81%           
=======================================
  Files         100      100           
  Lines       17051    17069   +18     
=======================================
+ Hits        13267    13282   +15     
- Misses       3784     3787    +3     
Impacted Files                           Coverage Δ
src/transformers/modeling_tf_utils.py    88.15% <ø> (-0.18%) ⬇️
src/transformers/modeling_utils.py       91.81% <ø> (-0.14%) ⬇️
src/transformers/modeling_t5.py          81.79% <95.23%> (+0.50%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33ef700...168bed0.

@patrickvonplaten patrickvonplaten changed the title [WIP] [T5] make decoder input ids optional for t5 training [T5] make decoder input ids optional for t5 training Mar 30, 2020
@patrickvonplaten patrickvonplaten merged commit 75ec6c9 into huggingface:master Mar 30, 2020
@patrickvonplaten patrickvonplaten deleted the t5_easier_training branch March 30, 2020 12:21
    # replace possible -100 values in lm_labels by `pad_token_id`
    shifted_input_ids.masked_fill_(shifted_input_ids == -100, pad_token_id)

    assert torch.all(shifted_input_ids >= 0).item(), "Verify that `lm_labels` has only positive values and -100"

Yes!
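
To make the reviewed snippet concrete, here is a minimal sketch (not part of the PR diff; the tensor values are made up) of what the right shift plus the masked_fill_ step does to a batch of labels:

    import torch

    pad_token_id = 0            # T5 uses id 0 for padding
    decoder_start_token_id = 0  # T5 also starts decoding with the pad token

    # lm_labels as a collator typically produces them; -100 marks positions
    # the loss should ignore.
    lm_labels = torch.tensor([[6536, 504, 1, -100, -100]])

    # Right-shift: prepend the decoder start token and drop the last position.
    shifted_input_ids = lm_labels.new_zeros(lm_labels.shape)
    shifted_input_ids[..., 1:] = lm_labels[..., :-1].clone()
    shifted_input_ids[..., 0] = decoder_start_token_id

    # The step quoted above: replace any remaining -100 with pad_token_id,
    # so the assert on non-negative ids passes.
    shifted_input_ids.masked_fill_(shifted_input_ids == -100, pad_token_id)

    print(shifted_input_ids)  # tensor([[0, 6536, 504, 1, 0]])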

@mengyahuUSTC-PU

@patrickvonplaten Hi Patrick! Could you tell me what the difference is between decoder_input_ids and lm_labels for T5ForConditionalGeneration? For context: I am using T5ForConditionalGeneration for paraphrase generation, following this code: https://github.com/ramsrigouthamg/Paraphrase-any-question-with-T5-Text-To-Text-Transfer-Transformer-/blob/master/t5-pretrained-question-paraphraser.ipynb. It uses lm_labels together with decoder_attention_mask. Thanks in advance!

@wmmxk commented Aug 8, 2020

@mengyahuUSTC-PU: when calling forward(), decoder_input_ids is left as None, as in this snippet:

    outputs = self(
        input_ids=batch["source_ids"],
        attention_mask=batch["source_mask"],
        lm_labels=lm_labels,
        decoder_attention_mask=batch["target_mask"],
    )

decoder_input_ids is derived from lm_labels when decoder_input_ids is None: the model right-shifts lm_labels, prepending the decoder start token and replacing any -100 with pad_token_id, exactly as in the diff excerpt above.

I was wondering in what case I need to feed decoder_input_ids explicitly.
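
For illustration, a minimal sketch (reusing model, input_ids, and lm_labels from the first sketch above) of the equivalence this PR establishes:

    # Implicit: the model derives decoder_input_ids from lm_labels internally.
    loss_implicit = model(input_ids=input_ids, lm_labels=lm_labels)[0]

    # Explicit: build the same decoder inputs by hand with the helper added
    # in this PR, then pass them alongside lm_labels.
    decoder_input_ids = model._shift_right(lm_labels)
    loss_explicit = model(
        input_ids=input_ids,
        decoder_input_ids=decoder_input_ids,
        lm_labels=lm_labels,
    )[0]

    # The two losses should match; passing decoder_input_ids yourself only
    # matters when they should differ from the shifted labels.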
