😼
When softmax attention is sus
Researcher at Etched.
I attempt to force machines to not be dumb.
Pinned
- Stable-Diffusion-3-From-Scratch: A repo that attempts to train Stable Diffusion 3 from scratch.
- Cottention_Transformer: Code for the paper "Cottention: Linear Transformers With Cosine Attention" (CUDA; see the sketch after this list).
- On-the-Expressiveness-of-Softmax-Attention-A-Recurrent-Neural-Network-Perspective (Jupyter Notebook).
- Diffusion_models_from_scratch: Creating a diffusion model from scratch in PyTorch to learn exactly how they work.
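
The Cottention paper's title points at the core idea: replacing softmax with cosine similarity lets attention be computed in linear time. Here is a minimal PyTorch sketch of that idea, assuming "cosine attention" means L2-normalizing queries and keys and then exploiting associativity to compute Q(KᵀV) instead of (QKᵀ)V; the function name and exact normalization are illustrative assumptions, not the repo's actual API.

```python
import torch
import torch.nn.functional as F

def cosine_linear_attention(q, k, v):
    # L2-normalize queries and keys so their dot products are cosine
    # similarities, standing in for the softmax (assumption from the title).
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    # Without softmax, attention is associative: compute K^T V first,
    # a (dim x dim) matrix, so cost is O(n * d^2) instead of O(n^2 * d).
    kv = torch.einsum("bnd,bne->bde", k, v)     # (batch, dim, dim)
    return torch.einsum("bnd,bde->bne", q, kv)  # (batch, seq, dim)

# Quick shape check on random tensors.
q = torch.randn(2, 128, 64)
k = torch.randn(2, 128, 64)
v = torch.randn(2, 128, 64)
print(cosine_linear_attention(q, k, v).shape)  # torch.Size([2, 128, 64])
```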