Lightweight & fast AI inference proxy for self-hosted LLM backends such as Ollama, LM Studio, and others. Designed for speed, simplicity, and local-first deployments.
Updated Aug 9, 2025 - Go
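For a rough idea of what such a proxy does, the sketch below forwards requests to a local backend. This is a minimal illustration, not the project's actual implementation; the port 11434 is Ollama's default, and the listen address is an assumption.

```go
// Minimal sketch of an LLM inference reverse proxy in Go (illustrative only).
// It assumes an Ollama backend on its default port (11434); swap the target
// URL for LM Studio or another local server.
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Assumed backend address; Ollama listens on 127.0.0.1:11434 by default.
	backend, err := url.Parse("http://127.0.0.1:11434")
	if err != nil {
		log.Fatal(err)
	}

	// Forward requests unchanged, which keeps chunked/streaming
	// completion responses working end to end.
	proxy := httputil.NewSingleHostReverseProxy(backend)

	log.Println("proxying :8080 ->", backend)
	log.Fatal(http.ListenAndServe(":8080", proxy))
}
```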
A self-hosted, open-source (Apache 2.0) proxy for LLMs with Prometheus metrics.
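A hedged sketch of how a proxy like this might expose Prometheus metrics, using the standard client_golang library. The metric name, label, and ports are illustrative assumptions, not the project's actual instrumentation.

```go
// Illustrative Prometheus instrumentation for an LLM proxy (not the
// project's real metric names or handler layout).
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Hypothetical counter tracking forwarded requests, labeled by path.
var requestsTotal = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "llm_proxy_requests_total",
		Help: "Requests forwarded to the upstream LLM backend.",
	},
	[]string{"path"},
)

func main() {
	prometheus.MustRegister(requestsTotal)

	// Assumed Ollama default address.
	backend, err := url.Parse("http://127.0.0.1:11434")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(backend)

	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.Handler()) // Prometheus scrape endpoint
	mux.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		requestsTotal.WithLabelValues(r.URL.Path).Inc()
		proxy.ServeHTTP(w, r)
	}))

	log.Fatal(http.ListenAndServe(":8080", mux))
}
```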