Add Simulated Quantized Linear Operator for Debugging #5862

@Fridah-nv

Description

Similar to the torch reference ops we have for attention and RoPE, create a pure Python implementation of each quantized op to make our pipeline easier to debug.
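To make the request concrete, here is a minimal sketch of what a simulated quantized linear reference could look like. It is an illustration only, not the AutoDeploy implementation: the function names `quantize_per_tensor` and `simulated_quant_linear` are hypothetical, the scheme (symmetric per-tensor int8) is an assumption, and it uses plain Python lists so it runs without any dependencies. An actual reference op would presumably operate on torch tensors and follow whichever quantization formats the pipeline supports.

```python
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def quantize_per_tensor(w: Matrix, num_bits: int = 8) -> Tuple[List[List[int]], float]:
    """Symmetric per-tensor quantization (assumed scheme, for illustration).

    Returns integer weights clamped to [-qmax, qmax] and the float scale.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    amax = max(abs(v) for row in w for v in row)
    scale = amax / qmax if amax > 0 else 1.0
    w_q = [[max(-qmax, min(qmax, round(v / scale))) for v in row] for row in w]
    return w_q, scale


def simulated_quant_linear(
    x: Matrix,
    w_q: List[List[int]],
    scale: float,
    bias: Optional[List[float]] = None,
) -> Matrix:
    """Reference y = x @ dequant(w_q)^T + bias, computed entirely in float.

    w_q has shape (out_features, in_features), the same layout as
    torch.nn.Linear weights.
    """
    # Dequantize first, then run an ordinary float matmul. This is slow but
    # easy to step through when comparing against a fused kernel.
    w_deq = [[q * scale for q in row] for row in w_q]
    out = []
    for x_row in x:
        y_row = [sum(a * b for a, b in zip(x_row, w_row)) for w_row in w_deq]
        if bias is not None:
            y_row = [y + b for y, b in zip(y_row, bias)]
        out.append(y_row)
    return out


# Usage: quantize a small weight matrix and compare with the float result.
w = [[0.5, -1.0], [0.25, 1.0]]
x = [[1.0, 2.0]]
w_q, scale = quantize_per_tensor(w)
y = simulated_quant_linear(x, w_q, scale)
```

Because each weight is off by at most half a quantization step, the output for each row should differ from the unquantized result by no more than `0.5 * scale * sum(|x_i|)`, which gives a simple tolerance to check an optimized kernel against.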

Metadata

Assignees

Labels

AutoDeploy (<NV> AutoDeploy Backend)

Type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests