fp8

Here are 9 public repositories matching this topic...

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.

python machine-learning deep-learning gpu cuda pytorch jax fp8

Updated Aug 1, 2025
Python

Azure / MS-AMP

Star

Microsoft Automatic Mixed Precision Library

deep-learning gpu amp pytorch transformer mixed-precision fp8

Updated Sep 29, 2024
Python

intel / neural-speed

Star

An innovative library for efficient LLM inference via low-bit quantization

Updated Aug 30, 2024
C++

aredden / flux-fp8-api

Star

Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x faster on consumer devices.

flux pytorch quantization diffusion fast-inference fp8

Updated Oct 12, 2024
Python

graphcore-research / jax-scalify

Star

JAX Scalify: end-to-end scaled arithmetics

jax low-precision llm fp8

Updated Oct 30, 2024
Python

klessydra / spike-with-minifloat-fp8-support

Star

Spike, a RISC-V ISA Simulator with added 8-bit vector floating point support

spike riscv minifloat fp8

Updated Jul 11, 2025
C

zsxkib / cog-step-video-t2v

Star

Cog Single GPU Quantized Implementation of Step-Video-T2V

replicate single-gpu fp8 h100 step-video-t2v diffsynth

Updated Feb 25, 2025
Python

zerfoo / zerfoo

Star

A modular, accelerator-ready machine learning framework built in Go that speaks float8/16/32/64. Designed with clean architecture, strong typing, and native concurrency for scalable, production-ready AI systems. Ideal for engineers who value simplicity, speed, and maintainability.

go golang machine-learning deep-learning neural-network transformer fp16 autodiff distributed-training float16 onnx graph-ml ml-framework fp8 float8

Updated Aug 3, 2025
Go

umangyadav / py_fp8

Star

FP8 dtypes enumeration in python

fp8 fp8e4m3fnuz fp8e4m3 fp8e5m2 fp8e5m2fnuz

Updated Nov 16, 2023
C++

Improve this page

Add a description, image, and links to the fp8 topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the fp8 topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fp8

Here are 9 public repositories matching this topic...

NVIDIA / TransformerEngine

Azure / MS-AMP

intel / neural-speed

aredden / flux-fp8-api

graphcore-research / jax-scalify

klessydra / spike-with-minifloat-fp8-support

zsxkib / cog-step-video-t2v

zerfoo / zerfoo

umangyadav / py_fp8

Improve this page

Add this topic to your repo