noiji

Pinned

  1. hiyouga/LLaMA-Factory Public

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    Python, 57.9k stars, 7.1k forks

  2. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python, 57.7k stars, 10.1k forks

  3. lm-sys/FastChat Public

    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

    Python, 39.1k stars, 4.7k forks

  4. BerriAI/litellm Public

    Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

    Python, 28.7k stars, 4.1k forks

  5. NVIDIA/TensorRT-LLM Public

    TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

    C++, 11.6k stars, 1.7k forks

  6. microsoft/LLMLingua Public

    [EMNLP'23, ACL'24] To speed up LLM inference and enhance the LLM's perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

    Python, 5.4k stars, 324 forks