@spzala
Contributor

@spzala spzala commented Sep 17, 2025

Add readme doc for the paged programs script.

@spzala spzala marked this pull request as draft September 17, 2025 16:23
@spzala
Contributor Author

spzala commented Sep 17, 2025

cc @JRosenkranz


## How to run and validate paged programs

The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) script is designed to run and validate paged programs using a specified model variant. It supports different attention types, including `paged` and `paged_fp8`, with `paged` as the default. The supported dataset types are `sharegpt` and `rag_factoid`, with `sharegpt` as the default. The script can run tests in a distributed environment, using multiple instances for faster execution. For a description of the command-line arguments the script accepts, run it with `--help`. The following examples demonstrate how to use the script.
Contributor

We may also want to add a separate README for this file that includes example output for the different `test_type` values (`tokens` and `metrics`).

Contributor Author

@JRosenkranz I was thinking of creating a new folder called drive_paged_program, moving drive_paged_programs.py into it, and adding a dedicated README there. For now, I have just added a README with a specific name, README_drive_paged_program.md. Adding a folder may give better clarity, but I'm not sure whether it would break anything that depends on the script path. Let me know your thoughts. Thanks!


## How to run and validate paged programs

The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) script is designed to run and validate paged programs using a specified model variant. It supports different attention types, including `paged` and `paged_fp8`, with `paged` as the default. The supported dataset types are `sharegpt` and `rag_factoid`, with `sharegpt` as the default. The script can run tests in a distributed environment, using multiple instances for faster execution. For a description of the command-line arguments the script accepts, run it with `--help`. The following examples demonstrate how to use the script.
Contributor

@JRosenkranz JRosenkranz Sep 17, 2025

We may also want to point out that you can skip CPU validation (`skip_validation`), which makes the script much faster. You can also use `validation_info_outputs_dir` and `save_validation_info_outputs` to reuse saved CPU logits, which avoids recomputing them and significantly reduces the script's run time.
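
To illustrate why reusing saved CPU logits helps, here is a minimal sketch of a compute-or-load cache. It is not the script's implementation: the helper `load_or_compute`, the JSON file layout, the hash-based file naming, and the stand-in `fake_cpu_reference` function are all assumptions made for this example.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def load_or_compute(
    prompt: str,
    cache_dir: Path,
    compute: Callable[[str], List[float]],
) -> List[float]:
    """Return the cached reference output for a prompt, computing it only once.

    Hypothetical helper: the keying and file format are illustrative only.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        # Cache hit: skip the expensive reference computation.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    result = compute(prompt)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result


# Stand-in for the slow CPU reference run; records how often it is called.
calls: List[str] = []


def fake_cpu_reference(prompt: str) -> List[float]:
    calls.append(prompt)
    return [float(len(prompt)), 0.5]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "validation_info"
    first = load_or_compute("hello world", cache, fake_cpu_reference)
    second = load_or_compute("hello world", cache, fake_cpu_reference)
```

In this sketch the second call reads the saved result from disk, so the expensive function runs only once for a given prompt.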

@tharapalanivel tharapalanivel self-requested a review September 17, 2025 17:30
Collaborator

@tharapalanivel tharapalanivel left a comment

This is great, thanks for putting this doc together @spzala!


```bash
# Run with 4K context length
VLLM_DT_MAX_BATCH_SIZE=4 \
VLLM_DT_MAX_CONTEXT_LEN=4096 \
HF_HUB_CACHE=/home/senuser/models/huggingface_cache/hub \
DT_DEEPRT_VERBOSE=-1 \
DTLOG_LEVEL=error \
torchrun --nproc-per-node=4 /home/senuser/aiu-fms-testing-utils/scripts/drive_paged_programs.py \
  --max_new_tokens=8 \
  --model_variant=ibm-granite/granite-3.3-8b-instruct \
  --program_criteria_json_path=/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json \
  --dataset_path=/home/senuser/models/ShareGPT_V3_unfiltered_cleaned_split.json \
  --test_type=tokens \
  --distributed
```
Collaborator

Can we make the paths generic here please?

Contributor Author

Yes, I kept the paths similar to those in the other script examples, but making them generic is a good idea. Will do.


```bash
# Run with 4K context length
VLLM_DT_MAX_BATCH_SIZE=4 \
VLLM_DT_MAX_CONTEXT_LEN=4096 \
HF_HUB_CACHE=/home/senuser/models/huggingface_cache/hub \
DT_DEEPRT_VERBOSE=-1 \
DTLOG_LEVEL=error \
torchrun --nproc-per-node=4 /home/senuser/aiu-fms-testing-utils/scripts/drive_paged_programs.py \
  --max_new_tokens=8 \
  --model_variant=ibm-granite/granite-3.3-8b-instruct \
  --program_criteria_json_path=/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json \
  --dataset_path=/home/senuser/models/ShareGPT_V3_unfiltered_cleaned_split.json \
  --test_type=tokens \
  --distributed
```
Collaborator

How do users get access to `/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json`? Or does it get generated during the test?

Contributor Author

Yes, it's a generated file, stored at the provided path. I will clarify this in the doc.

parser.add_argument(
"--program_criteria_json_path",
type=str,
required=True,
Collaborator

Are we planning on providing a default program_criteria_json_path example in this repo?

Contributor

Yes, this gets generated by default now, so I think we can set a default for this argument and drop `required=True`.
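
As a rough sketch of that suggestion, the argument could be given a default in place of `required=True`. The thread does not say what the default path should be, so `DEFAULT_PROGRAM_CRITERIA_PATH` below is a hypothetical placeholder, and the help text is illustrative rather than taken from the script.

```python
import argparse

# Hypothetical placeholder: the thread does not state the actual default path.
DEFAULT_PROGRAM_CRITERIA_PATH = "program_criteria.json"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of an optional path argument")
    parser.add_argument(
        "--program_criteria_json_path",
        type=str,
        # A default replaces required=True, so callers may omit the flag.
        default=DEFAULT_PROGRAM_CRITERIA_PATH,
        help="Path to the program criteria JSON file (illustrative help text).",
    )
    return parser


# Omitting the flag falls back to the default; passing it overrides the default.
default_args = build_parser().parse_args([])
custom_args = build_parser().parse_args(["--program_criteria_json_path=/tmp/custom.json"])
```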

@spzala
Contributor Author

spzala commented Sep 17, 2025

@JRosenkranz @tharapalanivel thanks so much for the quick feedback. I will work on your suggestions.

@spzala spzala marked this pull request as ready for review October 1, 2025 12:49
@@ -0,0 +1,76 @@
The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) script is designed to run and validate paged programs using a specified model variant.

It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`.
Contributor

We should mention that the script also supports a custom prompt file, and describe the format of that file. Currently, each line of the file is one sequence in the batch, so the batch size is the number of lines in the file.
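
The prompt file format described above can be sketched as follows. This is an illustration only: `load_prompt_batch` is a hypothetical helper, not a function from the script, and the example prompts are made up.

```python
import tempfile
from pathlib import Path
from typing import List


def load_prompt_batch(path: Path) -> List[str]:
    """Read a prompt file where each line is one sequence in the batch.

    Hypothetical helper for illustration; the script's own loader may differ.
    """
    # splitlines() does not produce an extra entry for a trailing newline.
    return path.read_text(encoding="utf-8").splitlines()


with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "prompts.txt"
    prompt_file.write_text(
        "What is paged attention?\n"
        "Summarize this paragraph.\n"
        "Translate 'hello' to French.\n",
        encoding="utf-8",
    )
    prompts = load_prompt_batch(prompt_file)
    # The batch size is simply the number of lines in the file.
    batch_size = len(prompts)
```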

@@ -0,0 +1,76 @@
The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) script is designed to run and validate paged programs using a specified model variant.

It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`.
Contributor

We have since added a lot of support for the `programs` argument in the script, so we should have a separate set of examples showing how to use it. The current format is:

`<program_id>:<batch_constraint>,<seq_len_constraint>`

`<program_id>` can be an int, `*`, or `?`. An int selects that exact program id. `*` selects all programs that match the `batch_constraint` and `seq_len_constraint` criteria. `?` selects one program that matches those criteria.

`<batch_constraint>` can be an int or a conditional expression on the batch size. A bare int is treated as a `>=` expression. Otherwise, `>`, `>=`, `<`, `<=`, and `==` are supported with a value.

`<seq_len_constraint>` can be an int or a conditional expression on the sequence length. A bare int is treated as a `>=` expression. Otherwise, `>`, `>=`, `<`, `<=`, and `==` are supported with a value.
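
The format above could be parsed along these lines. This is a sketch under assumptions, not the script's parser: the names `Constraint`, `ProgramSpec`, `parse_constraint`, and `parse_program_spec` are hypothetical, and the sketch assumes that a conditional expression is written as an operator followed directly by a value (for example `>=2`) and that both constraints are always present.

```python
import operator
import re
from dataclasses import dataclass
from typing import Union

# Comparison operators listed in the comment above.
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}
# Assumed syntax: optional operator, then an integer value.
_CONSTRAINT_RE = re.compile(r"^\s*(>=|<=|==|>|<)?\s*(\d+)\s*$")


@dataclass(frozen=True)
class Constraint:
    op: str
    value: int

    def matches(self, actual: int) -> bool:
        return _OPS[self.op](actual, self.value)


@dataclass(frozen=True)
class ProgramSpec:
    program_id: Union[int, str]  # an int, "*" (all matches) or "?" (one match)
    batch: Constraint
    seq_len: Constraint


def parse_constraint(text: str) -> Constraint:
    match = _CONSTRAINT_RE.match(text)
    if match is None:
        raise ValueError(f"invalid constraint: {text!r}")
    # A bare int defaults to a ">=" expression.
    return Constraint(match.group(1) or ">=", int(match.group(2)))


def parse_program_spec(spec: str) -> ProgramSpec:
    program_part, colon, constraints = spec.partition(":")
    batch_text, comma, seq_text = constraints.partition(",")
    if not colon or not comma:
        raise ValueError(f"expected <program_id>:<batch>,<seq_len>, got {spec!r}")
    program_part = program_part.strip()
    program_id = program_part if program_part in ("*", "?") else int(program_part)
    return ProgramSpec(program_id, parse_constraint(batch_text), parse_constraint(seq_text))


wildcard = parse_program_spec("*:>=2,<=256")
exact = parse_program_spec("7:4,64")
```

Here `wildcard` describes every program with a batch size of at least 2 and a sequence length of at most 256, while `exact` names program 7 and treats the bare ints `4` and `64` as `>=` constraints.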

The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) script is designed to run and validate paged programs using a specified model variant.

It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`.

Contributor

We may also want to explain the `enforce_homogeneous_prompt_programs` parameter. It ensures that all sequences in a batch hit the same prefill program. By default, we only ensure that the largest prompt hits a specific prefill program.
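
The difference between the two behaviors can be illustrated with a toy model. The thread does not describe how a prompt is mapped to a prefill program, so the sketch below assumes, purely for illustration, that prompts are bucketed by rounding their length up to a multiple of 64. `prefill_bucket` and `batch_is_homogeneous` are hypothetical names.

```python
from typing import List


def prefill_bucket(prompt_len: int, multiple: int = 64) -> int:
    """Hypothetical mapping: round the prompt length up to the next multiple."""
    return -(-prompt_len // multiple) * multiple


def batch_is_homogeneous(prompt_lens: List[int], multiple: int = 64) -> bool:
    """True when every prompt in the batch lands in the same bucket."""
    return len({prefill_bucket(n, multiple) for n in prompt_lens}) == 1


# Default-style check: only the largest prompt determines the targeted bucket.
mixed_batch = [30, 120]
largest_bucket = prefill_bucket(max(mixed_batch))

# Homogeneous-style check: every prompt must fall in that same bucket.
mixed_ok = batch_is_homogeneous(mixed_batch)
uniform_ok = batch_is_homogeneous([100, 120, 128])
```

Under this toy bucketing, the mixed batch targets the 128 bucket through its largest prompt, yet it is not homogeneous because the 30-token prompt falls in the 64 bucket.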

@Ssukriti Ssukriti marked this pull request as draft November 4, 2025 00:02
Signed-off-by: Sukriti-Sharma4 <[email protected]>