Merge Documentation changes to main for Launch #196
… main (#190) * Fix training test (#184) * Fix SDK training test: Add wait time before refresh * Fix training tests in canaries * Update logging information for submitting and deleting training job (#189) Co-authored-by: pintaoz <[email protected]> --------- Co-authored-by: Zhaoqi <[email protected]> Co-authored-by: pintaoz-aws <[email protected]> Co-authored-by: pintaoz <[email protected]>
Co-authored-by: Roja Reddy Sareddy <[email protected]>
* Fix training test (#184) * Fix SDK training test: Add wait time before refresh * Fix training tests in canaries * Update logging information for submitting and deleting training job (#189) Co-authored-by: pintaoz <[email protected]> --------- Co-authored-by: Zhaoqi <[email protected]> Co-authored-by: pintaoz-aws <[email protected]> Co-authored-by: pintaoz <[email protected]>
* Documentation Fixes * Documentation Fixes --------- Co-authored-by: Roja Reddy Sareddy <[email protected]>
/.mypy_cache

/doc/_apidoc/
doc/_build/
Does this need to be `/doc/_build/` here?
This is mainly to make sure `_build` is ignored by Git.
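On the anchoring question: in gitignore syntax, a pattern with a slash at the beginning or in the middle is matched relative to the directory containing the `.gitignore`, so `doc/_build/` and `/doc/_build/` should behave the same. A minimal sketch to check this locally, assuming `git` is on the `PATH`; the helper `is_ignored` and the path `nested/doc/_build/index.html` are made up for illustration and are not part of this PR:

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def is_ignored(pattern, rel_path):
    """Ask `git check-ignore` whether rel_path is ignored under pattern.

    Returns True or False, or None when git is not installed.
    Hypothetical helper for illustration only.
    """
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as repo:
        # Throwaway repository containing only the pattern under test.
        subprocess.run(["git", "init", "-q", repo], check=True)
        Path(repo, ".gitignore").write_text(pattern + "\n")
        target = Path(repo, rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
        # Exit code 0 means the path is ignored, 1 means it is not.
        result = subprocess.run(["git", "check-ignore", "-q", rel_path], cwd=repo)
        return result.returncode == 0


for pattern in ("doc/_build/", "/doc/_build/"):
    print(
        pattern,
        is_ignored(pattern, "doc/_build/index.html"),
        is_ignored(pattern, "nested/doc/_build/index.html"),
    )
```

If both patterns give the same answers, the leading slash is a consistency choice with the neighbouring `/doc/_apidoc/` entry rather than a functional fix.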
source {venv-name}/bin/activate | ||
``` | ||
```{note} | ||
Remember to activate your virtual environment (source {venv-name}/bin/activate) each time you want to use the HyperPod CLI and SDK if you chose the virtual environment installation method. |
Add code formatting around `source {venv-name}/bin/activate`.
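Related to the activation note: a user can confirm that a virtual environment is active before using the CLI or SDK. A minimal sketch, assuming the environment was created with the standard `venv` module; the function name `in_virtualenv` is made up for illustration:

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if not in_virtualenv():
    print("No virtual environment is active; run `source {venv-name}/bin/activate` first.")
```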
--image pytorch/pytorch:latest \ | ||
``` | ||
```` | ||
````{tab-item} SDK |
Is SDK code keeping parity with CLI here?
This will be a fast-follow item
``` | ||
```` | ||
````{tab-item} SDK |
Seems like SDK code here is still using some optional variables
```` | ||
````{tab-item} SDK | ||
```python |
Need to update SDK code here too
# Custom endpoint | ||
hyp list-pods hyp-custom-endpoint | ||
``` | ||
```` |
Missing SDK code here
# Custom endpoint | ||
hyp get-logs hyp-custom-endpoint --pod-name <pod-name> | ||
``` | ||
```` |
Missing SDK code here
List all HyperPod PyTorch jobs in a namespace.

#### Syntax
Seems like `Syntax` is rendered even bigger than `hyp list hyp-pytorch-job`; not sure why the rendering is like that.
Yup, it mainly requires CSS changes. This would be a fast follow as well.
::: | ||
|
||
:::{grid-item-card} HyperPod Developer Guide | ||
:link: https://catalog.workshops.aws/sagemaker-hyperpod-eks/en-US |
Link seems to be the same as the workshop. Maybe needs an update?
yes, checking with Shweta on this.
* Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes --------- Co-authored-by: Roja Reddy Sareddy <[email protected]>
doc/inference.md
Outdated
When creating an inference endpoint, you'll need to specify:

- **endpoint-name**: Unique identifier for your endpoint
- **instance-type**: The EC2 instance type to use
- **model-id** (JumpStart): ID of the pre-trained JumpStart model
- **image-uri** (Custom): Docker image containing your inference code
- **model-name** (Custom): Name of model to create on SageMaker
- **model-source-type** (Custom): Source type: fsx or s3
- **model-volume-mount-name** (Custom): Name of the model volume mount
- **container-port** (Custom): Port on which the model server listens
Can we separate this into two lists?
- Parameters required for JumpStart
- Parameters required for Custom
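One way the suggested split might look, sketched as data rather than as final doc wording. The parameter names come from the list under review; treating `endpoint-name` and `instance-type` as shared, and the names `REQUIRED` and `missing_params`, are assumptions for illustration and not part of the HyperPod CLI or SDK:

```python
# Parameter names are copied from the list in doc/inference.md.
# Treating the two unlabelled parameters as common to both kinds is an assumption.
COMMON = ("endpoint-name", "instance-type")

REQUIRED = {
    "jumpstart": COMMON + ("model-id",),
    "custom": COMMON + (
        "image-uri",
        "model-name",
        "model-source-type",
        "model-volume-mount-name",
        "container-port",
    ),
}


def missing_params(endpoint_kind, provided):
    """Return the required parameter names absent from `provided`, in list order."""
    return [name for name in REQUIRED[endpoint_kind] if name not in provided]


# Hypothetical values, for illustration only.
print(missing_params("jumpstart", {"endpoint-name": "demo", "instance-type": "example-type"}))
```

Grouping the parameters this way keeps the shared ones in a single place, so the two lists in the docs cannot drift apart on them.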
doc/installation.md
Outdated
### Supported ML Frameworks
- PyTorch (version ≥ 1.10)
Nit: `Supported ML Frameworks for Training`, maybe?
def test_set_cluster_context(self, cluster_name):
    """Test setting cluster context."""
    result = execute_command([
        "hyp", "set-cluster-context",
        "--cluster-name", cluster_name
    ])
    assert result.returncode == 0
    context_line = result.stdout.strip().splitlines()[-1]
    assert any(text in context_line for text in ["Updated context", "Added new context"])
Is this change needed?
Looks like this change is from other commits. Can you rebase to main to clean it up?
I merged in the latest changes from main, and this change shows up in the diff. It comes from this PR: https://github.com/aws/sagemaker-hyperpod-cli/pull/184/files
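For reviewers unfamiliar with the hunk from #184, the check it performs on the last line of output can be sketched in isolation. This is only an illustration: `context_was_set` is a made-up helper, `subprocess.CompletedProcess` stands in for whatever `execute_command` returns in the repository, and the sample output strings are invented around the two substrings the test looks for.

```python
import subprocess


def context_was_set(result):
    """Mirror the hunk's assertions: exit code 0 and a known phrase on the last line."""
    if result.returncode != 0:
        return False
    lines = result.stdout.strip().splitlines()
    if not lines:
        # The hunk as written would raise IndexError here; this sketch returns False.
        return False
    return any(text in lines[-1] for text in ("Updated context", "Added new context"))


# Invented stand-in for a command result; not real CLI output.
fake = subprocess.CompletedProcess(
    args=["hyp", "set-cluster-context"],
    returncode=0,
    stdout="earlier output line\nUpdated context\n",
)
print(context_was_set(fake))
```

Checking only the last line keeps the assertion independent of any earlier log lines the command prints.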
* Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes --------- Co-authored-by: Roja Reddy Sareddy <[email protected]>
* Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes --------- Co-authored-by: Roja Reddy Sareddy <[email protected]>
PR to merge all the documentation changes into the main branch for the public launch.
PR Approval Steps
For Requester
For Reviewer
For Requester
section to double check each item.