Conversation

@patrickvonplaten (Contributor)

This PR:

  • adds T5 to the summarization pipeline.
  • adds warnings and better defaults to Bart/T5 summarization
  • removes unnecessary assert in generate() function
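
For context, here is a minimal usage sketch of what this enables once merged (the checkpoint name and generation kwargs below are illustrative, not taken from the PR):

from transformers import pipeline

# Summarization pipeline backed by a T5 checkpoint (t5-small used for illustration).
summarizer = pipeline("summarization", model="t5-small", tokenizer="t5-small")

article = "New York (CNN) Some long news article to be summarized ..."
# Extra kwargs are forwarded to model.generate().
print(summarizer(article, max_length=140, min_length=20, do_sample=False))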

@codecov-io commented Mar 24, 2020

Codecov Report

Merging #3413 into master will increase coverage by 0.02%.
The diff coverage is 93.75%.


@@            Coverage Diff             @@
##           master    #3413      +/-   ##
==========================================
+ Coverage   77.56%   77.58%   +0.02%     
==========================================
  Files         100      100              
  Lines       16970    16993      +23     
==========================================
+ Hits        13162    13184      +22     
- Misses       3808     3809       +1     
Impacted Files                           Coverage          Δ
src/transformers/modeling_utils.py       91.71% <ø>        (-0.02%) ⬇️
src/transformers/pipelines.py            73.05% <93.10%>   (+0.52%) ⬆️
src/transformers/modeling_tf_utils.py    84.44% <100.00%>  (+0.52%) ⬆️
src/transformers/tokenization_t5.py      95.89% <100.00%>  (+0.05%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e392ba6...23778d1.

@sshleifer (Contributor) left a comment


From my perspective, looks good. Would love test coverage to avoid breaking it accidentally!

Contributor

In a previous PR, @thomwolf suggested that if a kwarg (like length_penalty) is on the config, it should only be exposed through generate_kwargs.

@patrickvonplaten (Contributor, Author)

I would favor (re-)defining kwargs like max_length, min_length, do_sample, and length_penalty in this function, because the defaults in the config are defaults for generate() in general, not good defaults for summarization. To me, pipelines are about being easy to use, so I think good general defaults for summarization should be defined here. Happy to discuss.
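
A rough sketch of the defaulting pattern being discussed (illustrative only; the helper name and the default values are not from this PR):

def _apply_summarization_defaults(generate_kwargs: dict) -> dict:
    # Fill in task-appropriate defaults without clobbering anything the
    # caller set explicitly; callers can still override every value.
    defaults = {"max_length": 140, "min_length": 20, "do_sample": False, "length_penalty": 2.0}
    for key, value in defaults.items():
        generate_kwargs.setdefault(key, value)
    return generate_kwargs

With setdefault, explicit user arguments always win over the pipeline's summarization defaults.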

@thomwolf (Member) commented Mar 25, 2020

Ok, that's quite an interesting case for T5 versus Bart: in Bart the pretrained models are task-specific, so it makes sense to have the task-specific generation hyper-parameters in the config, while for T5 the pretrained model is generic, so it would make more sense to have a way to specify task-specific generation HP in the config.

In general, I'm not a fan of having model-specific hyper-parameters in general classes like pipelines.

What I would propose is maybe the following for T5 (open to discussion):

  • have generic generation HP in the config, like now
  • also have a dict of task-specific generation HP in the config which can override the generic HP.

Example config dict:

{
    ....
    "max_length": 100,  # generic generation HP
    "length_penalty": 1.0,
    "task_specific_generation_params": {
        "summarization": {  # task id (e.g. name of the pipeline?)
            "max_length": 140,
            "length_penalty": 2.0,
        },
        "translation_en_to_de": {
            "max_length": 160,
            "length_penalty": 3.0,
        },
    },
}

What do you think?
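
For illustration only (not something in this PR), a pipeline could resolve the final generation parameters from such a config roughly like this; the helper name and attribute-access pattern are assumptions:

def resolve_generation_params(config, task: str) -> dict:
    # Start from the generic generation hyper-parameters on the config.
    params = {
        "max_length": getattr(config, "max_length", None),
        "length_penalty": getattr(config, "length_penalty", None),
    }
    # Let the task-specific block override the generic values when present.
    overrides = getattr(config, "task_specific_generation_params", {}) or {}
    params.update(overrides.get(task, {}))
    return {k: v for k, v in params.items() if v is not None}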

@patrickvonplaten (Contributor, Author)

Yeah, I like that idea!
Two questions:

  • Should we enforce that all parameters defined in task_specific_generation_params have to already be defined on the config, so that they can only override existing values?
  • Do we allow "summarization" and "translation" if the config has no task_specific_generation_params defined?

@sshleifer sshleifer changed the title Add t5 to pipelines Add t5 to pipeline(task='summarization') Mar 24, 2020
@patrickvonplaten (Contributor, Author)

Is there a reason why we would not optionally use pad_to_max_length here? I was not sure whether I could add it, but for summarization with batched input it is necessary. @LysandreJik @thomwolf @mfuntowicz
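
For context, this is the kind of batched encoding the comment refers to (a sketch; the checkpoint name is illustrative, and pad_to_max_length was the padding flag in the tokenizer API at the time):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
docs = ["first, fairly long article ...", "second, shorter article"]
# Without padding the two sequences have different lengths and cannot be
# stacked into a single batch tensor; pad_to_max_length pads them to equal length.
inputs = tokenizer.batch_encode_plus(docs, pad_to_max_length=True, return_tensors="pt")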

@thomwolf (Member) left a comment

Nice work. A few tweaks regarding handling task-specific generation HPs if you can do it (see my comments).

Comment on lines +1232 to +1243

Member

Ok for now but we'll probably refactor this in the future to have framework-agnostic ensure_tensor_on_device and get_tensor_length methods (maybe have a base class and framework-specific derived class for instance).
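
A sketch of the kind of split being suggested (class and method names here are hypothetical):

class FrameworkOps:
    # Framework-agnostic interface used by the pipeline.
    def ensure_tensor_on_device(self, tensor, device):
        raise NotImplementedError

    def get_tensor_length(self, tensor) -> int:
        raise NotImplementedError


class TorchOps(FrameworkOps):
    # PyTorch-specific implementation.
    def ensure_tensor_on_device(self, tensor, device):
        return tensor.to(device)

    def get_tensor_length(self, tensor) -> int:
        return tensor.shape[-1]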


      args = ["input_ids", "attention_mask"]

-     if not isinstance(self.model.config, (DistilBertConfig, XLMConfig, RobertaConfig, BartConfig)):
+     if not isinstance(self.model.config, (DistilBertConfig, XLMConfig, RobertaConfig, BartConfig, T5Config)):
Member

Ok, note that we can now remove inputs_for_model from pipelines since #3116.
Let's do that later, in another PR that cleans up pipelines.

@thomwolf thomwolf merged commit 9c683ef into huggingface:master Mar 26, 2020
@patrickvonplaten patrickvonplaten deleted the add_t5_to_pipelines branch March 26, 2020 10:37