-
Notifications
You must be signed in to change notification settings - Fork 30
Add readme doc for the paged programs script #132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
cc @JRosenkranz |
scripts/README.md
Outdated
|
|
||
| ## How to run and validate paged programs | ||
|
|
||
| The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`. The script can run tests in a distributed environment, utilizing multiple instances for faster execution. To see the description of various command-line arguments that the script can parse, run it with `--help`. The following examples demonstrate the usage of the script. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to add a separate readme entirely as well for this file, that includes some example output for different test_types (tokens and metrics).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JRosenkranz I was thinking to create a new folder called drive_paged_program and move the drive_paged_programs.py and add newly created dedicated readme but for now I have just added a readme with a specific name README_drive_paged_program.md. Adding folder may give better clarity but not sure if that can break anything with script path. Let me know your thoughts. Thanks!
scripts/README.md
Outdated
|
|
||
| ## How to run and validate paged programs | ||
|
|
||
| The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`. The script can run tests in a distributed environment, utilizing multiple instances for faster execution. To see the description of various command-line arguments that the script can parse, run it with `--help`. The following examples demonstrate the usage of the script. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may also want to point out that you can skip cpu validation (skip_validation) which will make the script much faster, as well as utilize the validation_info_outputs_dir and save_validation_info_outputs which will allow you to re-use saved cpu logits (to avoid re-compute -- significantly reducing time of the script to run)
tharapalanivel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks for putting this doc together @spzala!
scripts/README.md
Outdated
|
|
||
| ```bash | ||
| # Run with 4K context length | ||
| VLLM_DT_MAX_BATCH_SIZE=4 VLLM_DT_MAX_CONTEXT_LEN=4096 HF_HUB_CACHE=/home/senuser/models/huggingface_cache/hub DT_DEEPRT_VERBOSE=-1 DTLOG_LEVEL=error torchrun --nproc-per-node=4 /home/senuser/aiu-fms-testing-utils/scripts/drive_paged_programs.py --max_new_tokens=8 --model_variant=ibm-granite/granite-3.3-8b-instruct --program_criteria_json_path=/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json --dataset_path=/home/senuser/models/ShareGPT_V3_unfiltered_cleaned_split.json --test_type=tokens --distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make the paths generic here please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I kept the path similar to other script examples but good idea to clarify it and will do.
scripts/README.md
Outdated
|
|
||
| ```bash | ||
| # Run with 4K context length | ||
| VLLM_DT_MAX_BATCH_SIZE=4 VLLM_DT_MAX_CONTEXT_LEN=4096 HF_HUB_CACHE=/home/senuser/models/huggingface_cache/hub DT_DEEPRT_VERBOSE=-1 DTLOG_LEVEL=error torchrun --nproc-per-node=4 /home/senuser/aiu-fms-testing-utils/scripts/drive_paged_programs.py --max_new_tokens=8 --model_variant=ibm-granite/granite-3.3-8b-instruct --program_criteria_json_path=/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json --dataset_path=/home/senuser/models/ShareGPT_V3_unfiltered_cleaned_split.json --test_type=tokens --distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do users get access to /home/senuser/models/fms-tests-dpp-programs/dpp-4k.json or does it get generated during the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's a generated file, stored to the provided path. I will clarify it in the doc.
scripts/drive_paged_programs.py
Outdated
| parser.add_argument( | ||
| "--program_criteria_json_path", | ||
| type=str, | ||
| required=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we planning on providing a default program_criteria_json_path example in this repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this gets generated by default now, so I think we can set a default for this and turn off the required flag
|
@JRosenkranz @tharapalanivel thanks so much for the quick feedback. I will work on your suggestions. |
6559fdd to
3309773
Compare
Signed-off-by: Sahdev Zala <[email protected]>
3309773 to
136839b
Compare
| @@ -0,0 +1,76 @@ | |||
| The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. | |||
|
|
|||
| It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should mention that this can also support a custom prompt file and the format for that file. Currently it is for a given file, each line will be one sequence in the batch, the batch size being the number of lines in the file.
| @@ -0,0 +1,76 @@ | |||
| The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. | |||
|
|
|||
| It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have since added a lot of support for the programs argument in the script. We should have a different set of examples of how to use that. For instance, the current format is as follows:
<program_id>:<batch_constraint>,<seq_len_constraint>
<program_id> can be one of an int, *, or ?. If an int, it will choose the exact program id. If *, it will choose all programs that match the batch_constraint and seq_len_constraint criteria. If ?, it will choose one program that matches the batch_constraint and seq_len_constraint criteria
<batch_constraint> can be one of int or conditional expression on the batch size. Int will default to >= expression. Otherwise we can support >, >=, <, <=, == with a val.
<seq_len_constraint> can be one of int or conditional expression on the sequence length. Int will default to >= expression. Otherwise we can support >, >=, <, <=, == with a val.
| The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. | ||
|
|
||
| It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to also explain the enforce_homogeneous_prompt_programs param. This is used to ensure that all sequences in a batch would hit the same prefill program (by default we only ensure the largest prompt hits a specific prefill program)
Signed-off-by: Sukriti-Sharma4 <[email protected]>
Signed-off-by: Sukriti-Sharma4 <[email protected]>
Add readme doc for the paged programs script.