benchmarks/microbenchmarks/README.md
14 additions & 0 deletions
@@ -130,6 +130,18 @@ Currently, quantization string is in same format as the one being passed in llam
       max_power: 11
   ```
 
+- `small_sweep`: Generate a small sweep of shapes with increasing powers of 2 for M, K, N dimensions
+  - Parameters:
+    - `min_power`: Minimum power of 2 (default: 10, which is 1024)
+    - `max_power`: Maximum power of 2 (default: 14, which is 16,384)
+  - Note: This generates shapes where M <= K <= N (ensuring increasing order), which produces fewer combinations than the full sweep, and could be good to use for plots like heatmap
+  ```yaml
+  matrix_shapes:
+    - name: "small_sweep"
+      min_power: 10  # 2^10 = 1024
+      max_power: 15  # 2^15 = 32,768
+  ```
+
 - `sweep`: Generate a sweep of shapes with different powers of 2 for M, K, N dimensions
   - Parameters:
     - `min_power`: Minimum power of 2 (default: 8, which is 256)
@@ -142,6 +154,8 @@ Currently, quantization string is in same format as the one being passed in llam
       max_power: 9  # 2^9 = 512
   ```
 
+
+
 ## Output
 
 Results are saved to a CSV file in the specified output directory
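
To make the `small_sweep` note in the README hunk concrete, here is a minimal sketch of how shapes with M <= K <= N could be enumerated. It is not taken from the repository: the helper names `small_sweep_shapes` and `full_sweep_count` are hypothetical, and treating `max_power` as inclusive is an assumption based on the `# 2^15 = 32,768` comment in the YAML example.

```python
from itertools import combinations_with_replacement


def small_sweep_shapes(min_power=10, max_power=14):
    """Hypothetical helper: (M, K, N) tuples of powers of 2 with M <= K <= N.

    Assumes max_power is inclusive; the real benchmark code may differ.
    """
    sizes = [2**p for p in range(min_power, max_power + 1)]
    # Drawing 3 items with replacement from an ascending list yields
    # non-decreasing tuples, which is exactly the M <= K <= N constraint.
    return list(combinations_with_replacement(sizes, 3))


def full_sweep_count(min_power, max_power):
    """Number of shapes if M, K, N each ranged over every power independently."""
    n = max_power - min_power + 1
    return n**3


shapes = small_sweep_shapes(10, 15)
print(len(shapes), "ordered shapes vs", full_sweep_count(10, 15), "in a full sweep")
```

Under these assumptions the ordering constraint shrinks the grid considerably, which is consistent with the README's remark that `small_sweep` produces fewer combinations than `sweep`.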
-f"Running: {config.name} for Quantization: {config.quantization} and Sparsity: {config.sparsity}"
+f"Running: {config.name} for Quantization: {config.quantization} and Sparsity: {config.sparsity} for {config.shape_name}: {config.m, config.k, config.n}"
 )
 result=run_inference(config) # Pass the config object directly
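
The new log line puts `config.m, config.k, config.n` inside a single replacement field, which Python evaluates as a tuple. The sketch below shows the resulting text. The `Config` dataclass and its field values are hypothetical stand-ins, since the real config class is not shown in this diff.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical stand-in for the benchmark config object."""

    name: str
    quantization: str
    sparsity: str
    shape_name: str
    m: int
    k: int
    n: int


# Illustrative values only; they are not taken from the repository.
config = Config("demo", "int8wo", "none", "small_sweep", 1024, 2048, 4096)

# A comma-separated expression inside {} is formatted as a tuple.
message = (
    f"Running: {config.name} for Quantization: {config.quantization} "
    f"and Sparsity: {config.sparsity} "
    f"for {config.shape_name}: {config.m, config.k, config.n}"
)
print(message)
# Running: demo for Quantization: int8wo and Sparsity: none for small_sweep: (1024, 2048, 4096)
```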