FYI: there is an ongoing PR to integrate this great work into vLLM (vllm-project/vllm#3905).
I ran into a couple of correctness issues with a few shapes when running it through our CI.
To repro:
git clone https://github.com/neuralmagic/nm-vllm.git
cd nm-vllm
git checkout fused-moe
pip install -e .
pip install -r requirements-dev.txt
pytest -v tests/kernels/test_moe.py::test_fused_moe
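
For anyone trying to narrow down which shapes fail, here is a rough sketch of the kind of unfused reference a fused MoE kernel can be compared against. It is not the code from tests/kernels/test_moe.py. The function name `reference_moe`, the weight layouts, the SiLU-and-mul activation, and the renormalized top-k routing are all assumptions made for illustration, and it handles a single token in pure Python so it runs without a GPU.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(w, x):
    # w is a list of rows; returns the matrix-vector product w @ x.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def silu(v):
    return v / (1.0 + math.exp(-v))

def reference_moe(x, w1, w2, gate_logits, topk, renormalize=True):
    """Hypothetical unfused top-k MoE for one token.

    Assumed layouts: x is [k], w1 is [e][2n][k], w2 is [e][k][n],
    gate_logits is [e].
    """
    probs = softmax(gate_logits)
    # Pick the topk experts with the highest routing probability.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:topk]
    weights = [probs[i] for i in chosen]
    if renormalize:
        s = sum(weights)
        weights = [w / s for w in weights]
    out = [0.0] * len(x)
    for e, wt in zip(chosen, weights):
        h = matvec(w1[e], x)                                # [2n]
        n = len(h) // 2
        act = [silu(g) * u for g, u in zip(h[:n], h[n:])]   # SiLU-and-mul, [n]
        y = matvec(w2[e], act)                              # [k]
        out = [o + wt * yi for o, yi in zip(out, y)]
    return out

def max_abs_diff(a, b):
    # Largest elementwise difference, for comparing against a fused output.
    return max(abs(ai - bi) for ai, bi in zip(a, b))
```

Looping this over the (m, n, k, e, topk) combinations from CI and checking `max_abs_diff` against the fused output with a tolerance should show which shapes diverge.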