liangfu
Contributor

@liangfu liangfu commented Feb 6, 2024

This PR adds setup steps to the documentation, providing a general guide to preparing trn1/inf2 instances for inference with the Neuron SDK.

To perform offline inference with Inferentia, the changes in PR #2569 are required.

@WoosukKwon WoosukKwon added the aws-neuron Related to AWS Inferentia & Trainium label Feb 6, 2024
@liangfu liangfu marked this pull request as ready for review February 6, 2024 05:44
Member

@zhuohan123 zhuohan123 left a comment

LGTM! Thanks for the contribution!

@zhuohan123 zhuohan123 enabled auto-merge (squash) March 3, 2024 23:57
@zhuohan123 zhuohan123 merged commit d0fae88 into vllm-project:main Mar 4, 2024
dtransposed pushed a commit to afeldman-nm/vllm that referenced this pull request Mar 26, 2024

Labels

aws-neuron Related to AWS Inferentia & Trainium

3 participants