NotImplementedError is raised in static INT8 Quantization with PT2E Backend default recipe #1984

@haitamhawa

Description

Following example from neural-compressor/docs/source/3x/PT_StaticQuant.md:

I get a crash in MinMaxObserver when calling prepare(model):
"NotImplementedError: MinMaxObserver's qscheme only support torch.per_tensor_symmetric and torch.per_tensor_affine."
The qscheme requested prior to the error is per_channel_symmetric.
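The failure can be reproduced at the observer level, independent of neural_compressor: a minimal sketch (assuming only PyTorch's public MinMaxObserver class, the same class named in the traceback) showing that it rejects any per-channel qscheme at construction time:

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

# MinMaxObserver only supports per-tensor qschemes; passing a
# per-channel qscheme raises NotImplementedError in __init__.
try:
    MinMaxObserver(qscheme=torch.per_channel_symmetric)
except NotImplementedError as err:
    print(type(err).__name__, "-", err)
```

So whatever recipe the default StaticQuantConfig expands to appears to pair a per-channel qscheme with the per-tensor observer.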

I am using the default StaticQuantConfig.

I think per_channel_symmetric quantization is applied to weights only.
And with the minmax algorithm it is simply the absolute max per channel, so I'm not sure why an observer is needed in this case.
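To illustrate the point above: a minimal sketch in plain Python (the helper name and qmax default are hypothetical, not neural_compressor API) of why per-channel symmetric minmax for static weights needs no running observer state, the scale being just the per-output-channel absolute max over the fixed weight tensor:

```python
# Hypothetical helper: per-channel symmetric int8 scale is the
# per-output-channel absolute maximum divided by the integer bound.
def per_channel_symmetric_scales(weight, qmax=127):
    """weight: list of output-channel rows, each a list of floats."""
    return [max(abs(v) for v in row) / qmax for row in weight]

w = [[0.5, -1.27, 0.3],   # channel 0: absmax 1.27
     [2.54, 0.1, -0.2]]   # channel 1: absmax 2.54
print(per_channel_symmetric_scales(w))
```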

Environment:

  • Linux Ubuntu 22.04.1 LTS
  • Python 3.10
  • torch 2.4.0+cpu
  • neural_compressor 3.0

To reproduce:

import torch
from torch import nn
from neural_compressor.torch.export import export
from neural_compressor.torch.quantization import StaticQuantConfig, prepare, convert

def main():
    # Prepare the float model and example inputs for export model
    model = nn.Linear(5, 5)
    example_inputs = (torch.rand(size=(1, 5)),)

    # Export eager model into FX graph model
    exported_model = export(model=model, example_inputs=example_inputs)
    # Quantize the model
    quant_config = StaticQuantConfig()
    prepared_model = prepare(exported_model, quant_config=quant_config)
    # Calibrate
    for _ in range(100):
        prepared_model(torch.rand_like(example_inputs[0]))

    q_model = convert(prepared_model)
    # Compile the quantized model and replace the Q/DQ pattern with Q-operator
    from torch._inductor import config

    config.freezing = True
    opt_model = torch.compile(q_model)

if __name__ == "__main__":
    main()
