Micro-benchmark inference #1759

jainapurva · 2025-02-21T23:45:19Z

This is the first PR in the benchmarking effort. It provides the outline for setting up inference microbenchmarking for the quantization APIs in torchao.

Inputs such as quantization techniques, matrix sizes, compile settings, and sparsity are passed to the Python script. The options for quantization techniques are predefined in the script, and a developer can add a new quantization technique. The script generates a CSV with performance numbers, which will be used to plot charts and as input to a dashboard. The script performs the following tasks:

- Takes a .yml file as input
- Benchmarks the eval time of the quantize_ APIs for each configuration
- Records all config params and their respective times in a CSV file
- Includes test cases

Future PRs will include more config options and process the generated results.
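The time-and-record steps listed above could look roughly like the sketch below. It is not the code from this PR: `time_callable`, `run_benchmarks`, `write_csv`, `toy_workload`, and the `time_ms` column are hypothetical names, and the workload is a pure-Python stand-in rather than a quantized torchao model, so the sketch runs with the standard library only.

```python
import csv
import io
import time
from statistics import median


def time_callable(fn, warmup=2, iters=10):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):  # warm-up runs are not recorded
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return median(samples)


def run_benchmarks(cases, make_workload):
    """Time every case and return one result row (a dict) per case."""
    rows = []
    for case in cases:
        workload = make_workload(case)
        rows.append({**case, "time_ms": time_callable(workload)})
    return rows


def write_csv(rows, stream):
    """Write config params and their measured time, one row per case."""
    writer = csv.DictWriter(stream, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


def toy_workload(case):
    # Assumption: stand-in for "build a model, quantize it, run a forward pass".
    n = case["m"]
    return lambda: sum(i * i for i in range(n))


cases = [
    {"recipe": "baseline", "m": 1024, "k": 1024, "n": 1024},
    {"recipe": "int8wo", "m": 1024, "k": 1024, "n": 1024},
]
rows = run_benchmarks(cases, toy_workload)
buf = io.StringIO()  # a real run would open a CSV file under the output directory
write_csv(rows, buf)
print(buf.getvalue())
```

For GPU workloads, plain wall-clock timing like this would also need a device synchronization (for example `torch.cuda.synchronize()`) around the timed region, since kernels are launched asynchronously.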

benchmark_config.yml

# Sample configuration for inference kernel benchmarks
quantization_config_recipe_names:
  - "baseline"  # String format is same as llama/generate.py
  - "int8wo"
  - "int4wo-128"
output_dir: "benchmarks/microbenchmarks/test/results"  # Directory for results and plots
model_params:
  matrix_shapes:
    - name: "custom"
      shapes: [
        [1024, 1024, 1024],  # [m, k, n]
        [2048, 4096, 1024],
        [4096, 4096, 1024]
      ]
  high_precision_dtype: "torch.bfloat16"
  use_torch_compile: true
  torch_compile_mode: "max-autotune"
  device: "cuda"  # Change this to "cuda", "mps", "xpu", or "cpu" as needed
  model_type: "linear"
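One plausible reading of this config is that every recipe name is benchmarked against every matrix shape. The sketch below illustrates that expansion under that assumption; `BenchmarkCase` and `expand_cases` are hypothetical names, not the classes in this PR, and the dict stands in for what a YAML loader such as `yaml.safe_load` would return for the sample file.

```python
from dataclasses import dataclass
from itertools import product

# The sample config above, as it would look after YAML parsing.
config = {
    "quantization_config_recipe_names": ["baseline", "int8wo", "int4wo-128"],
    "output_dir": "benchmarks/microbenchmarks/test/results",
    "model_params": {
        "matrix_shapes": [
            {
                "name": "custom",
                "shapes": [[1024, 1024, 1024], [2048, 4096, 1024], [4096, 4096, 1024]],
            }
        ],
        "high_precision_dtype": "torch.bfloat16",
        "use_torch_compile": True,
        "torch_compile_mode": "max-autotune",
        "device": "cuda",
        "model_type": "linear",
    },
}


@dataclass(frozen=True)
class BenchmarkCase:
    """Hypothetical container for one (recipe, shape) benchmark run."""
    recipe: str
    shape_name: str
    m: int
    k: int
    n: int
    dtype: str
    use_torch_compile: bool
    torch_compile_mode: str
    device: str
    model_type: str


def expand_cases(cfg):
    """Return one BenchmarkCase per recipe per [m, k, n] shape."""
    params = cfg["model_params"]
    shapes = [
        (group["name"], tuple(shape))
        for group in params["matrix_shapes"]
        for shape in group["shapes"]
    ]
    return [
        BenchmarkCase(
            recipe=recipe,
            shape_name=name,
            m=m,
            k=k,
            n=n,
            dtype=params["high_precision_dtype"],
            use_torch_compile=params["use_torch_compile"],
            torch_compile_mode=params["torch_compile_mode"],
            device=params["device"],
            model_type=params["model_type"],
        )
        for recipe, (name, (m, k, n)) in product(
            cfg["quantization_config_recipe_names"], shapes
        )
    ]


cases = expand_cases(config)
print(len(cases))  # 3 recipes x 3 shapes = 9
```

Keeping each case as a flat, immutable record makes it straightforward to write one CSV row per case later.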

Run command:

python benchmarks/microbenchmarks/benchmark_runner.py --config benchmarks/microbenchmarks/benchmark_config.yml

Output will be stored in

benchmarks/microbenchmarks/results/results.csv

benchmarks/microbenchmarks/test/benchmark_config.yml

benchmarks/microbenchmarks/benchmark_inference.py

benchmarks/microbenchmarks/test/test_benchmark_inference.py

HDCharles

lgtm

benchmarks/microbenchmarks/README.md

benchmarks/microbenchmarks/utils.py

Add files

7a07885

facebook-github-bot added the CLA Signed label Feb 21, 2025

jainapurva and others added 3 commits February 24, 2025 21:30

Add basic benchmarks for inference

c8ddfdb

Update new quantize_ api

a56f3ab

Updates

24bea42

jainapurva force-pushed the bench_structure branch from 0fc7212 to 24bea42 February 25, 2025 20:17

jainapurva added the topic: new feature and topic: performance labels Feb 25, 2025

jainapurva requested a review from HDCharles February 25, 2025 20:32

jainapurva marked this pull request as ready for review February 25, 2025 22:02

jainapurva and others added 3 commits February 25, 2025 15:31

New test folder

97cea12

Added test cases

35b2840

Lint fixes

8b7291c

jainapurva force-pushed the bench_structure branch from e9c3a10 to a750f7c February 26, 2025 05:23

jainapurva requested review from drisspg, vkuzo and jerryzh168 February 26, 2025 17:27

Merge remote-tracking branch 'origin/main' into bench_structure

a828f5b

jainapurva force-pushed the bench_structure branch from a750f7c to a828f5b February 26, 2025 17:28

vkuzo reviewed Feb 26, 2025

benchmarks/microbenchmarks/test/benchmark_config.yml (outdated)

vkuzo reviewed Feb 26, 2025

benchmarks/microbenchmarks/test/benchmark_config.yml (outdated)

vkuzo reviewed Feb 26, 2025

benchmarks/microbenchmarks/test/benchmark_config.yml (outdated)

vkuzo reviewed Feb 26, 2025

benchmarks/microbenchmarks/test/benchmark_config.yml (outdated)

vkuzo reviewed Feb 26, 2025

benchmarks/microbenchmarks/benchmark_inference.py (outdated)

jainapurva requested a review from vkuzo February 27, 2025 22:29

jainapurva force-pushed the bench_structure branch from 9ad0123 to 7a436db February 27, 2025 22:31

drisspg reviewed Feb 27, 2025

benchmarks/microbenchmarks/benchmark_inference.py

drisspg reviewed Feb 27, 2025

benchmarks/microbenchmarks/benchmark_inference.py (outdated)

drisspg reviewed Feb 27, 2025

benchmarks/microbenchmarks/test/test_benchmark_inference.py (outdated)

Merge remote-tracking branch 'origin/main' into bench_structure

ada00a3

jainapurva force-pushed the bench_structure branch from 87e16ca to 58d897e March 11, 2025 07:17

Updates

cfe2688

jainapurva force-pushed the bench_structure branch from 58d897e to cfe2688 March 11, 2025 18:14

jainapurva requested review from HDCharles, vkuzo and drisspg March 11, 2025 18:40

HDCharles reviewed Mar 12, 2025

benchmarks/microbenchmarks/test/benchmark_config.yml (outdated)

HDCharles approved these changes Mar 12, 2025

Minor fix

22b3ddd

jainapurva force-pushed the bench_structure branch from 6cfd1db to 22b3ddd March 12, 2025 04:08