
[Doc]: Supported Hardware for Quantization Kernels #6979

@YangwdX

Description


📚 The doc issue

I'm confused about what "the quantization method is supported" means. According to Nvidia, the Ampere architecture doesn't support FP8. So does this mean the FP8 operation is supported on A100/A800 GPUs, or just that we can convert the weight parameters from FP16 to FP8?
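For context, the distinction the question is getting at can be sketched as follows. Native FP8 tensor-core compute is only available from compute capability 8.9 upward (Ada/Hopper); on Ampere (SM 8.0, e.g. A100/A800) FP8 can still be used as a storage format, with weights converted from FP16 to FP8 and dequantized back before the matmul. The helper below is hypothetical (not a vLLM API), just an illustration of that branching:

```python
# Hypothetical sketch: map a CUDA compute capability to the kind of FP8
# support one could expect. The thresholds assume native FP8 tensor-core
# instructions exist from SM 8.9 (Ada) / 9.0 (Hopper) onward, while
# Ampere (SM 8.0) can only hold FP8 weights and must dequantize to
# FP16 for the actual compute.

def fp8_support(compute_capability: tuple[int, int]) -> str:
    major, minor = compute_capability
    if (major, minor) >= (8, 9):
        # Ada (8, 9) and Hopper (9, 0)+: hardware FP8 matmul
        return "native FP8 compute"
    if (major, minor) >= (8, 0):
        # Ampere, e.g. A100/A800: FP8 as a weight storage format only
        return "weight-only FP8 (dequantized to FP16 for compute)"
    return "unsupported"

print(fp8_support((9, 0)))  # Hopper H100
print(fp8_support((8, 0)))  # Ampere A100
```

On a real system the capability tuple would come from something like `torch.cuda.get_device_capability()`; the answer to the original question hinges on which of these two branches "supported" refers to in the doc.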

Suggest a potential alternative/fix

No response


Labels: documentation (Improvements or additions to documentation)
