feat: standardize devices parameter and device initialization #3062
Conversation
haystack/document_stores/memory.py
Outdated
I would consider changing the docstring to something like `List of torch devices (e.g. cuda, cpu, mps) to limit inference to certain devices ...` since a torch.device does not have to be a GPU. What do you think?
Oh sorry, I just read the docstring for initialize_device_settings and see that devices is only used if use_cuda is True, so I think we can leave this unchanged. Perhaps add a sentence explaining that this list is only used if use_gpu is set to True.
+1, excellent idea
I tried this one, but it doesn't seem like it is going to work out of the box. For Tutorial 3, during the pipeline run I get the following exception:
File "/Users/vblagoje/workspace/haystack/haystack/pipelines/base.py", line 512, in run
node_output, stream_id = self._run_node(node_id, node_input)
File "/Users/vblagoje/workspace/haystack/haystack/pipelines/base.py", line 439, in _run_node
return self.graph.nodes[node_id]["component"]._dispatch_run(**node_input)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/base.py", line 201, in _dispatch_run
return self._dispatch_run_general(self.run, **kwargs)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/base.py", line 245, in _dispatch_run_general
output, stream = run_method(**run_inputs, **run_params)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/reader/base.py", line 84, in run
results = predict(query=query, documents=documents, top_k=top_k)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/reader/base.py", line 145, in wrapper
ret = fn(*args, **kwargs)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/reader/farm.py", line 867, in predict
predictions = self.inferencer.inference_from_objects(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 596, in inference_from_objects
return self.inference_from_dicts(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 578, in inference_from_dicts
return Inferencer.inference_from_dicts(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 349, in inference_from_dicts
return list(predictions)
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 424, in _inference_with_multiprocessing
predictions = self._get_predictions_and_aggregate(dataset, tensor_names, baskets)
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 512, in _get_predictions_and_aggregate
logits = self.model.forward(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/model/adaptive_model.py", line 484, in forward
output_tuple = self.language_model.forward(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/model/language_model.py", line 364, in forward
return self.model(**params, return_dict=return_dict)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py", line 841, in forward
embedding_output = self.embeddings(
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py", line 128, in forward
inputs_embeds = self.word_embeddings(input_ids)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 160, in forward
return F.embedding(
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/functional.py", line 2206, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Placeholder storage has not been allocated on MPS device!
And yes, I have initialised FarmReader with:
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", devices=[torch.device("mps")])
Do you @sjrl have M1/M2 silicon to try it out as well? Use vblagoje:t_fix_devices branch, of course
Hey @vblagoje what version of PyTorch do you have installed? It looks like this issue could be related to the issue described here: https://discuss.pytorch.org/t/torch-embedding-fails-with-runtimeerror-placeholder-storage-has-not-been-allocated-on-mps-device/152124/6
I traced this issue to the QA inferencer class constructor not passing the devices parameter.
Nice, great catch!
I'll add the devices parameter in infer.py, but we'll have to clean it up soon after we integrate this PR.
What cleanup do you mean exactly here?
But even with this fix my kernel dies due to some MPS torch unsupported ops.
Does this mean we need to wait on PyTorch to support more ops with MPS before we can use a FARMReader on MPS?
The cleanup is related to multiprocessing. I'll create an issue about it today. Yes, unfortunately, I was not able to run FARMReader. Here is the error:
NotImplementedError: The operator 'aten::cumsum.out' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
Using CPU as fallback blew up the kernel. Seems risky even if it worked. So it seems like we'll have to wait a bit more.
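For reference, a minimal sketch of how the fallback mentioned in the error message would be enabled (this is only what the error suggests; as noted above it may still crash the kernel):

```python
import os

# The CPU fallback for unimplemented MPS ops is controlled by this environment
# variable; it needs to be set before torch is imported to take effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

device = torch.device("mps")
print(torch.backends.mps.is_available())
```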
We can like/vote for the comments mentioning aten::cumsum.out
Using CPU as fallback blew up the kernel. Seems risky even if it worked. So it seems like we'll have to wait a bit more.
It seems like the fallback only works if you build pytorch from source pytorch/pytorch#77764 (comment)
There is also a separate issue for aten::cumsum.out in PyTorch here: pytorch/pytorch#79112
haystack/modeling/utils.py
Outdated
We could consider replacing this if statement with `any(isinstance(x, str) for x in devices)`, which will check if any of the devices provided are of type string. For example, if the user happens to provide `[torch.device('cuda:0'), 'cuda:1']` then we would still construct a list of only torch.device types.
This will work because `torch.device(torch.device('cuda'))` still returns `torch.device('cuda')` so there is no danger there.
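A minimal sketch of the normalization being suggested (the function name is illustrative, not the actual code in haystack/modeling/utils.py):

```python
from typing import List, Union

import torch


def normalize_devices(devices: List[Union[str, torch.device]]) -> List[torch.device]:
    """Convert a mixed list of strings and torch.device objects to torch.device only.

    torch.device(torch.device("cuda")) simply returns the same device, so wrapping
    every entry is safe even when it is already a torch.device.
    """
    if any(isinstance(device, str) for device in devices):
        return [torch.device(device) for device in devices]
    return devices


# Example: a mixed list is converted to torch.device objects only.
print(normalize_devices([torch.device("cuda:0"), "cuda:1"]))
```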
Great point, thank you @sjrl
haystack/document_stores/memory.py
Outdated
Does it mean the following:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
haystack/document_stores/memory.py
Outdated
Suggested change:
- Unused if `use_gpu` is False.
+ This parameter is not used if you set `use_gpu` to False.
haystack/modeling/infer.py
Outdated
Can we change it to:
Does it mean the following:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
Can we change it to:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
Does it mean the following:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
Does it mean the following:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
haystack/nodes/reader/table.py
Outdated
Does it mean the following:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
Yes @agnieszka-m, but see #3062 (comment) for the wording. We shouldn't say GPUs, as that implies GPUs are the only devices that can be used; any device available on the machine and addressable as a torch device (cuda, cpu, mps, xla, etc.) can be used.
Why not simply keep what it is now: `List of torch devices (e.g. cuda, cpu, mps) to limit inference to certain devices and not use all available ones (e.g. [torch.device('cuda:0')]). Unused if use_gpu is False.` How would you reword it @agnieszka-m?
How about: `List of torch devices (e.g. cuda, cpu, mps) to limit inference to specific devices. A list containing torch device objects and/or strings is supported (e.g. [torch.device('cuda:0'), "mps", "cuda:1"]). Specifying use_gpu=False cancels the devices parameter and falls back to a single cpu device.` @agnieszka-m @sjrl?
This looks good! I would only suggest a few edits:
List of torch devices (e.g. cuda, cpu, mps) to limit inference to specific devices.
A list containing torch device objects and/or strings is supported (For example [torch.device('cuda:0'), "mps", "cuda:1"]).
When specifying `use_gpu=False` the devices parameter is not used and a single cpu device is used for inference.
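For illustration, here is how this wording could look in a component docstring (the class and signature below are made up for the example, not an actual Haystack class):

```python
from typing import List, Optional, Union

import torch


class ExampleComponent:  # illustrative only
    def __init__(self, devices: Optional[List[Union[str, torch.device]]] = None, use_gpu: bool = True):
        """
        :param devices: List of torch devices (e.g. cuda, cpu, mps) to limit inference to specific devices.
                        A list containing torch device objects and/or strings is supported (For example
                        [torch.device('cuda:0'), "mps", "cuda:1"]).
                        When specifying `use_gpu=False` the devices parameter is not used and a single cpu
                        device is used for inference.
        :param use_gpu: Whether to use GPU (if available).
        """
        self.use_gpu = use_gpu
        self.devices = devices
```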
This is perfect, I'll go with @sjrl's version, @agnieszka-m.
Does it mean the following:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
Does it mean the following:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
Does it mean the following:
List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device("cuda:0")].
@sjrl, this one is ready for review now. cc @bogdankostic
Could you add the missing docstring for the progress_bar?
From what I understand here, it looks like only one device can be used to load the _Text2SpeechModel. If that is the case, should we enforce that devices can only have a length of 1 using an assert statement? Otherwise users might be confused as to why only one device is used if they provide multiple.
Thinking more on this I wonder if it would be better to change the parameter from devices to device and make it type Union[str, torch.device] (so removing the List aspect altogether). Then we could just make sure to pass device as a list to initialize_device_settings(). For example,
resolved_devices, _ = initialize_device_settings(devices=[device], use_cuda=use_gpu, multi_gpu=False)
What do you think?
I thought about this as well. But is it possible that in the future there'll be support for multiple devices for components? The best trade-off seems to be to allow a list but to log a warning that if multiple devices are passed, the first one is selected. We can do this in all cases where only one device is currently supported. Thoughts?
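A rough sketch of that trade-off (illustrative only, not the final implementation in this PR):

```python
import logging
from typing import List

import torch

logger = logging.getLogger(__name__)


def select_single_device(resolved_devices: List[torch.device]) -> torch.device:
    """Pick the first resolved device and warn if more than one was provided."""
    if len(resolved_devices) > 1:
        logger.warning(
            "Multiple devices were specified, but this component only supports a single device. "
            "Using the first device: %s",
            resolved_devices[0],
        )
    return resolved_devices[0]
```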
Yeah, that sounds good to me. I like the idea of just logging a warning, but having the code still run.
Does this mean that only cuda or cpu is supported for this document classifier? If so we may want to update the docstring to say that mps isn't supported for this class.
Wow, good catch. There's no reason not to support mps here. I'll remove this.
Just as a heads up, this line looks to be present in quite a few places throughout Haystack.
Well, actually, right now pipelines do not support devices other than cuda or cpu. HF unfortunately uses: `torch.device("cpu" if device < 0 else f"cuda:{device}")`
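To make the limitation concrete, here is roughly how a resolved torch device currently has to be collapsed into the integer index that the pinned transformers pipeline expects (a sketch based on the `device=0 if self.devices[0].type == "cuda" else -1` line discussed below, not the exact Haystack code):

```python
import torch

# Hypothetical device resolved by initialize_device_settings
resolved_device = torch.device("mps")

# The pinned transformers pipeline only accepts an integer index:
# a GPU index for CUDA or -1 for CPU, so there is no way to express mps.
device_index = 0 if resolved_device.type == "cuda" else -1

# transformers then reconstructs the device as
# torch.device("cpu" if device < 0 else f"cuda:{device}"),
# so an mps device silently ends up on cpu.
print(device_index)  # -1 for the mps device above
```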
Oh, that's frustrating. They have a comment related to this issue here: huggingface/transformers#17342 (comment). Do you know if this line from HF, `torch.device("cpu" if device < 0 else f"cuda:{device}")`, has not been updated yet? If so, we could think about opening an issue.
Ah, it appears this has been fixed in the current main branch of HF: https://github.com/huggingface/transformers/blob/d90a36d192e2981a41122c30a765c63158dd0557/src/transformers/pipelines/base.py#L763-L773
I think we will need to wait until they release a new version of the Transformers library before we can add support for mps here.
Update: good news @sjrl, HF will support MPS for pipelines in the next minor release. See huggingface/transformers#18494 for more details.
haystack/nodes/extractor/entity.py
Outdated
Also, consider adding an assert here for devices to be length 1 since we only support a single device for the model to be loaded onto.
Additionally below (on line 58) we have the line `device=0 if self.devices[0].type == "cuda" else -1`, which makes me concerned that mps would not be supported for loading this model.
Yes, it's not supported on mps right now. I'll make notes for the components where only cpu and cuda are supported.
Please add a docstring for use_gpu
Consider adding an assert for devices to be of length 1, since we do not support multi-GPU loading.
+1, but with an actual check rather than an assert, since asserts can be removed during code optimization (e.g. when Python runs with `-O`).
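A minimal sketch of what such a check could look like (the function name is illustrative):

```python
from typing import List, Union

import torch


def ensure_single_device(devices: List[Union[str, torch.device]]) -> None:
    # Use an explicit exception rather than assert, because asserts are stripped
    # when Python runs with the -O optimization flag.
    if len(devices) != 1:
        raise ValueError(
            f"This component supports only a single device, but {len(devices)} devices were provided: {devices}"
        )
```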
Same comment as above about whether mps would be supported here.
Same comment as above about adding an assert for devices being length 1.
Duplicate docstring for devices. It appears again a few lines down. Please delete one.
bogdankostic
left a comment
Thanks Bogdan. It will be supported in 4.22.2, which should be released imminently. After we bump to that HF release we'll integrate this change. Hopefully we can release it in 1.8. We'll see. Anyway, there's no rush to integrate this one until the HF upgrade and a thorough test on mps devices as well.
@sjrl @bogdankostic please have a quick look at the last few commits and the changes made.
Change to `resolved_devices[0]` to grab the first device.
Looks good! One last comment. Also, I would consider adding a task to the description to remind us to pin the HF library to the new version in our
Looks good to me as well! Looking forward to using this soon 🚀
HF released transformers v4.21.2, which includes the pipeline fixes we were waiting for. Can we pin this transformers release for our 1.8 release @julian-risch @masci?
Yes, sounds good. What you could already do now as preparation is create a PR that pins the version to 4.21.2 and check whether any tests are failing, for example due to deprecated/discontinued features or breaking changes. Further, you can already have a look at the changelog and check for any changes that affect Haystack in the jump from the currently pinned version 4.20.1 to 4.21.2. If there are any relevant changes, please add them to the description of that PR.
Although the torch mps backend still does not have some crucial operators implemented and we therefore cannot yet use Haystack on Apple Silicon, I think we should integrate this PR now. We are still early on our path to the 1.9 release and we can catch potential issues sooner. I am mostly concerned about subtle bugs that might not be visible to our test suite. Once torch implements the operators we need, we can try out Haystack on Apple Silicon and hopefully it will work. That way we'll have support in the 1.9 release and not later. LMK what you guys think @masci @julian-risch @ZanSara @brandenchan @bogdankostic
sjrl
left a comment
This looks really good! Thanks for hunting down all the places in Haystack that use GPUs!
@agnieszka-m, I think we addressed the issues you raised. If so, would you please approve as well? We can't merge without your approval.
Related Issues
Proposed Changes:
initialize_device_settings helper function
How did you test it?
Not tested yet - in preparation
Todo
Update to transformers v4.21.2
Run test suite on Apple Silicon using the devices parameter in components where appropriate