Overload pipeline to return the appropriate type for a task #26125

aliabid94 · 2023-09-12T22:42:04Z

Previously, if I used the pipeline function, e.g. transcriber = pipeline("automatic-speech-recognition", model="openai/whisper-base.en"), the type hinting assumed transcriber was a generic Pipeline object. This meant I got no help from my editor with the documentation for the specific arguments I can pass to this type of pipeline, the AutomaticSpeechRecognitionPipeline. I had to open my browser and search the internet to see what inputs the pipeline accepts, in what format, etc.
This PR fixes that by overloading the return type based on the task passed to pipeline. Makes everything much easier to use for our end users!
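
As a rough illustration of the idea (not the PR's actual diff), the sketch below overloads a factory function on a Literal task name so that a type checker can infer the task-specific class. The classes and the _TASK_TO_CLASS registry are simplified stand-ins invented for this example; the real transformers classes and task mapping are more involved.

```python
from typing import Literal, Optional, overload


class Pipeline:
    """Stand-in for the generic base pipeline class."""

    def __init__(self, task: str, model: Optional[str] = None) -> None:
        self.task = task
        self.model = model


class AutomaticSpeechRecognitionPipeline(Pipeline):
    """Stand-in for the speech-recognition pipeline."""


class TextClassificationPipeline(Pipeline):
    """Stand-in for the text-classification pipeline."""


# Hypothetical registry for this sketch only.
_TASK_TO_CLASS = {
    "automatic-speech-recognition": AutomaticSpeechRecognitionPipeline,
    "text-classification": TextClassificationPipeline,
}


# The stubs below exist only for static type checkers and editors.
@overload
def pipeline(
    task: Literal["automatic-speech-recognition"], model: Optional[str] = None
) -> AutomaticSpeechRecognitionPipeline: ...
@overload
def pipeline(
    task: Literal["text-classification"], model: Optional[str] = None
) -> TextClassificationPipeline: ...
@overload
def pipeline(task: str, model: Optional[str] = None) -> Pipeline: ...


def pipeline(task: str, model: Optional[str] = None) -> Pipeline:
    # Single runtime body: the overloads never change which code runs.
    cls = _TASK_TO_CLASS.get(task, Pipeline)
    return cls(task, model)


# A type checker infers AutomaticSpeechRecognitionPipeline here, not Pipeline.
transcriber = pipeline("automatic-speech-recognition", model="openai/whisper-base.en")
```

The final str overload keeps unknown or dynamically chosen task names type-checking as a plain Pipeline.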

HuggingFaceDocBuilderDev · 2023-09-12T23:10:42Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

amyeroberts · 2023-09-13T13:13:49Z

cc @Narsil

Narsil · 2023-09-21T07:28:21Z

I do understand where this is coming from, and I do agree the pipeline object is quite complex; having some help from the LSP would be super nice!

I am a big hater of overload.
How do you know which call you are actually making?
It causes many bugs where you're not calling what you're supposed to and you don't realize.

I would much more strongly advocate having independent functions, or much better yet IMO, simple low-level code that would make everything obvious.

pipeline_kwargs = resolve_pipeline(....)
pipeline = AutomaticSpeechRecognitionPipeline(**pipeline_kwargs)

Wouldn't transcriber: AutomaticSpeechRecognitionPipeline = pipeline("automatic-speech-recognition", model="openai/whisper-base.en")
already solve most of your issues actually?
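
To make the "simple low-level code" alternative concrete, here is a minimal sketch under stated assumptions. resolve_pipeline is the hypothetical helper named in the comment above, not an existing transformers function, and the class is a stand-in; the point is only that constructing the task-specific class directly gives the editor a precise type without any overloads.

```python
from typing import Any, Dict


class AutomaticSpeechRecognitionPipeline:
    """Stand-in for the task-specific pipeline class."""

    def __init__(self, model: str) -> None:
        self.model = model


def resolve_pipeline(task: str, model: str) -> Dict[str, Any]:
    """Hypothetical helper: turn user-facing arguments into constructor kwargs."""
    if task != "automatic-speech-recognition":
        raise ValueError(f"unsupported task in this sketch: {task!r}")
    return {"model": model}


pipeline_kwargs = resolve_pipeline(
    "automatic-speech-recognition", model="openai/whisper-base.en"
)
# The class is named explicitly, so the inferred type is unambiguous.
transcriber = AutomaticSpeechRecognitionPipeline(**pipeline_kwargs)
```

The trade-off is that the caller must already know which class belongs to which task.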

aliabid94 · 2023-09-21T19:56:32Z

> Wouldn't transcriber: AutomaticSpeechRecognitionPipeline = pipeline("automatic-speech-recognition", model="openai/whisper-base.en")
> already solve most of your issues actually?

I had no idea beforehand that pipeline would return an AutomaticSpeechRecognitionPipeline object though. Our users would not know this either. If pipeline is the core object that transformers runs all of its predictions through, I really would like to know through my IDE what arguments I can pass to it, and in what format. Right now it's Any, which means I need to use the internet to find out what I can pass to this function. I really think we should improve this developer experience!

> I am a big hater of overload. How do you know which call you are actually making?

Here we are only changing the signature; there is still only one function body. We can even add a test to make sure none of the overloads diverge from the main implementation. The overloads can also be put in another file or cleaned up. I made this PR to start the discussion.

aliabid94 · 2023-09-21T19:58:08Z

> I would much more strongly advocate having independent functions, or much better yet IMO, simple low-level code that would make everything obvious.

The thing is, throughout our docs and existing code, the format that is used and encouraged is: transcriber = pipeline("automatic-speech-recognition", model="openai/whisper-base.en"). This provides a fix for this specific format that our users are already familiar with.

ArthurZucker · 2023-09-22T00:54:36Z

> I really think we should improve this developer experience!

➕ on this: unleashing the full potential of pipelines would be great, and not requiring devs to go online is good. But yes, let's find a great solution that is easy to maintain!

aliabid94 · 2023-09-22T03:33:06Z

I could add CI tests that ensure that no overloaded pipeline signature ever diverges from the main pipeline signature, other than the task name argument, and that there is only one function body for this method. Happy to hear other ideas though!

aliabid94 · 2023-10-11T16:03:03Z

Are there any other comments/suggestions, @ArthurZucker @Narsil?

ArthurZucker · 2023-10-13T13:15:49Z

I think I am fine with adding a check for this, yes!

amyeroberts · 2023-11-09T12:18:21Z

Commenting here as there's a related open PR regarding #27275 and the use of overloads for types. Previously a similar proposal was very strongly rejected (comment and PR) and I agree with @Narsil's comments. I'm generally against introducing overloads.

However, this does seem to be an issue that is raised by many independent community members, and finding an easy-to-maintain alternative we can agree on would be good. Perhaps we can open an issue to discuss addressing this more generally?

github-actions · 2023-12-29T08:06:25Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Commit: overload pipeline (7adeae9)

huggingface deleted a comment from github-actions bot · Nov 8, 2023

amyeroberts mentioned this pull request · Nov 9, 2023: VSCode pylance auto-completion for HfArgumentParser (limited support) #27275 (Closed, 5 tasks)

huggingface deleted a comment from github-actions bot · Dec 4, 2023

github-actions bot closed this · Jan 6, 2024

ringohoffman mentioned this pull request · Jun 5, 2024: Add overloads for PretrainedModel.from_pretrained #24035 (Closed, 5 tasks)
