Data-Aware Calibration

The current calibration algorithm uses `global_scale`, a user-configured value, to estimate the scale of intermediate results. To reduce accuracy loss, we need to implement a data-aware calibration algorithm: given a small calibration dataset (around 100 samples), it should produce better scale estimates.

## Implementation
### Collecting Intermediate Results
To collect the intermediate result of every operation in the graph, we may need to implement a visitor that returns every `Call` as an output. The API would be: `stats = collect_stats(graph, data)`
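
As a rough illustration of the idea, here is a minimal sketch on a toy expression tree rather than the real IR. `Var`, `Call`, `collect_calls`, and `evaluate` are hypothetical stand-ins invented for this example; only the `collect_stats(graph, data)` shape comes from the proposal above, and the return format (one list of values per `Call`) is an assumption.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class Var:
    """Hypothetical graph input, looked up by name."""
    name: str


@dataclass(frozen=True)
class Call:
    """Hypothetical operator application on scalar values."""
    op: Callable[..., float]
    args: Tuple["Expr", ...]


Expr = Union[Var, Call]


def collect_calls(expr: Expr, out: List[Call]) -> List[Call]:
    """Post-order visitor: gather each Call node once, inputs before users."""
    if isinstance(expr, Call):
        for arg in expr.args:
            collect_calls(arg, out)
        if not any(expr is seen for seen in out):
            out.append(expr)
    return out


def evaluate(expr: Expr, env: Dict[str, float], cache: Dict[int, float]) -> float:
    """Evaluate the tree, remembering the result of every Call by node id."""
    if isinstance(expr, Var):
        return env[expr.name]
    if id(expr) not in cache:
        cache[id(expr)] = expr.op(*(evaluate(a, env, cache) for a in expr.args))
    return cache[id(expr)]


def collect_stats(graph: Expr, data: List[Dict[str, float]]) -> List[List[float]]:
    """Return, for every Call in the graph, its output on each sample."""
    calls = collect_calls(graph, [])
    stats: List[List[float]] = [[] for _ in calls]
    for env in data:
        cache: Dict[int, float] = {}
        evaluate(graph, env, cache)
        for i, call in enumerate(calls):
            stats[i].append(cache[id(call)])
    return stats


# Toy graph: (x + y) * x, evaluated on two calibration samples.
x, y = Var("x"), Var("y")
graph = Call(operator.mul, (Call(operator.add, (x, y)), x))
data = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": -1.0}]
stats = collect_stats(graph, data)
```

In this sketch `stats[0]` holds the outputs of the inner addition and `stats[1]` those of the multiplication, one entry per sample.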

### Modification for SimulatedQuantize
It is kind of hard to save the mapping from each original operator to its counterpart after annotation. So we may need to add an option to `simulated_quantize`, such as `mode='origin'`, under which `sq` does not simulate the rounding or saturation error and instead returns its input directly. With this, we can collect the intermediate results of the original graph.
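
The behaviour of such an option could look roughly like the following plain-Python sketch. It is not the actual operator: the signature, the `'simulate'` mode name, and the symmetric signed quantization scheme (step size `scale / qmax`) are all assumptions made for illustration; only `mode='origin'` returning the input unchanged comes from the description above.

```python
from typing import List


def simulated_quantize(x: List[float], scale: float, nbit: int = 8,
                       mode: str = "simulate") -> List[float]:
    """Toy stand-in for `sq` operating on a list of floats.

    mode='origin'   : pass the input through untouched (no quantization error).
    mode='simulate' : round to a signed nbit grid covering [-scale, scale]
                      and saturate anything outside that range.
    """
    if mode == "origin":
        return list(x)
    if mode != "simulate":
        raise ValueError(f"unknown mode: {mode!r}")
    qmax = 2 ** (nbit - 1) - 1          # e.g. 127 for 8 bits
    out = []
    for v in x:
        q = round(v * qmax / scale)     # rounding error
        q = max(-qmax, min(qmax, q))    # saturation error
        out.append(q * scale / qmax)    # back to the real-valued domain
    return out


values = [0.5, 2.0, -3.0]
original = simulated_quantize(values, scale=1.0, mode="origin")
simulated = simulated_quantize(values, scale=1.0, mode="simulate")
```

Running the annotated graph once with every `sq` in the pass-through mode then yields the original graph's intermediate values at exactly the points where scales are needed, without keeping a separate operator mapping.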

### Calibration Procedure 
Given the calibration data, we can adjust the scales according to the output of the annotated graph. This is actually an optimization problem. The objective can be the KL divergence between the outputs of the original graph and the annotated graph, and the adjustment method can be a simple search or learning-based. There should be lots of room for exploration.
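
One possible search-based variant, sketched for a single tensor only: quantize the collected values under each candidate scale, histogram the result, and keep the scale whose histogram has the smallest KL divergence from the original histogram. Everything here (`quantize`, `histogram`, `kl_divergence`, `search_scale`, the bin count, the smoothing constant, and the candidate list) is a hypothetical choice for illustration, not a decided design.

```python
import math
from typing import List, Sequence


def quantize(x: Sequence[float], scale: float, nbit: int = 8) -> List[float]:
    """Assumed symmetric signed quantization with rounding and saturation."""
    qmax = 2 ** (nbit - 1) - 1
    return [max(-qmax, min(qmax, round(v * qmax / scale))) * scale / qmax
            for v in x]


def histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[int]:
    """Fixed-range histogram; out-of-range values land in the edge bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(bins - 1, idx))] += 1
    return counts


def kl_divergence(p_counts: Sequence[int], q_counts: Sequence[int],
                  eps: float = 1e-6) -> float:
    """KL(P || Q) over two histograms, smoothed so empty bins stay finite."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        kl += p * math.log(p / q)
    return kl


def search_scale(samples: Sequence[float], candidates: Sequence[float],
                 nbit: int = 8, bins: int = 32) -> float:
    """Grid search: pick the candidate scale with the lowest KL divergence."""
    lo, hi = min(samples), max(samples)
    if lo == hi:                      # degenerate input, avoid zero bin width
        hi = lo + 1.0
    reference = histogram(samples, lo, hi, bins)

    def cost(scale: float) -> float:
        quantized = quantize(samples, scale, nbit)
        return kl_divergence(reference, histogram(quantized, lo, hi, bins))

    return min(candidates, key=cost)


# Deterministic stand-in for collected intermediate results in [-1, 1].
samples = [math.sin(i) for i in range(200)]
best = search_scale(samples, candidates=[0.1, 1.0, 10.0, 100.0])
```

A scale that is too small saturates most values, while one that is too large collapses them onto a few grid points; both distort the histogram and raise the divergence. A real implementation would compare graph outputs rather than one tensor in isolation, and could replace the grid search with a learning-based update.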

Data-Aware Calibration #2651
