
llm-sdk

llm-sdk is an open-source suite for building production LLM workflows with a consistent developer experience across languages. It ships two libraries:

  • LLM SDK – cross-language clients (JavaScript, Rust, Go) that talk to multiple LLM providers through one LanguageModel interface.
  • LLM Agent – a minimal, transparent agent runtime that orchestrates models, tools, and structured output using the SDK under the hood.
| Package   | Language              | Version      | Link   |
| --------- | --------------------- | ------------ | ------ |
| llm-sdk   | JavaScript/TypeScript | npm version  | GitHub |
| llm-sdk   | Rust                  |              | GitHub |
| llm-sdk   | Go                    | Go Reference | GitHub |
| llm-agent | JavaScript/TypeScript |              | GitHub |
| llm-agent | Rust                  | crates.io    | GitHub |
| llm-agent | Go                    | Go Reference | GitHub |

The accompanying Console app demonstrates the libraries end-to-end.

Console Chat Application screenshot

Status: both libraries are currently v0. The SDK surface is largely stable; the Agent API may evolve. Feedback and contributions are welcome.

Why use llm-sdk

  • Supports multiple LLM providers with a unified API.
  • Handles multiple modalities: Text, Image, and Audio.
  • Supports streaming, including image and audio streaming.
  • Supports citations and reasoning on models that support them.
  • Reports token usage and calculates the cost of a request when provided with the model’s pricing information.
  • Unified serialization across programming languages (systems in different languages can work together).
  • Integrates OpenTelemetry for tracing.
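
As a quick illustration, here is a minimal TypeScript sketch of the unified API. The import path, constructor options, and field names are assumptions based on the interfaces described below, not verbatim library code:

```ts
// Minimal sketch, not verbatim library code: import path and option names are assumed.
import { OpenAIModel } from "@hoangvvo/llm-sdk/openai";

const model = new OpenAIModel({
  apiKey: process.env.OPENAI_API_KEY!,
  modelId: "gpt-4o",
});

// The same LanguageModelInput shape is adapted to every provider.
const response = await model.generate({
  messages: [
    {
      role: "user",
      content: [{ type: "text", text: "Summarize llm-sdk in one sentence." }],
    },
  ],
  temperature: 0.3,
});

console.log(response.content); // unified parts: text, image, audio, tool calls, ...
console.log(response.usage); // token usage; cost appears when pricing info is supplied
```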

LLM SDKs

Choose the language that fits your service (JavaScript/TypeScript, Rust, or Go) and get the same capabilities.

Each implements the TypeScript reference specification in schema/sdk.ts. Request/response payloads (LanguageModelInput, ModelResponse, tool events, etc.) keep identical field names when serialized to JSON so services can interoperate across languages.
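
For example, a tool-call part serializes with the same field names regardless of which language produced it. The exact names below are illustrative; schema/sdk.ts is authoritative:

```ts
// Illustrative serialized part; consult schema/sdk.ts for the authoritative field names.
const toolCallPart = {
  type: "tool-call",
  toolCallId: "call_123",
  toolName: "get_weather",
  args: { city: "Paris" },
};

// A Rust or Go service can deserialize this JSON into its own ToolCallPart type unchanged.
console.log(JSON.stringify(toolCallPart));
```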

Supported providers

Each provider is tracked across Sampling Params, Function Calling, Structured Output, Text/Image/Audio Input, Citation 1, Text/Image/Audio Output, and Reasoning. Sampling-parameter coverage:

| Provider                 | Sampling Params                                           |
| ------------------------ | --------------------------------------------------------- |
| OpenAI (Responses)       | ✅ except top_k, frequency_penalty, presence_penalty, seed |
| OpenAI (Chat Completion) | ✅ except top_k                                            |
| Anthropic                | ✅ except frequency_penalty, presence_penalty, seed        |
| Google                   |                                                           |
| Cohere                   |                                                           |
| Mistral                  | ✅ except top_k 🚧                                          |

Keys: ✅ supported · 🚧 planned · ➖ not exposed by the provider.

Core interfaces

  • LanguageModel: supplies provider metadata plus generate and stream methods that accept a LanguageModelInput and return unified responses.
  • LanguageModelInput: captures conversation history, sampling parameters, tool definitions, response-format hints, and modality toggles. The SDK adapts this shape to each provider’s API.
  • ModelResponse / PartialModelResponse: normalized outputs (with usage/cost when available) that you can forward directly to other services.
  • Message: building blocks for conversations. Messages represent user, assistant, or tool turns, with a list of parts, each representing a chunk of content in a specific modality:
    • Part: TextPart, ImagePart, AudioPart, SourcePart (for citation), ToolCallPart, ToolResultPart, and ReasoningPart.
  • Tool semantics: function calling and tool-result envelopes share the same schema across providers. The SDK normalizes call IDs, arguments, and error flags so agent runtimes can hydrate rich tool events without per-provider branching.
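
Paraphrased in TypeScript, the core shapes look roughly like this; the field names approximate schema/sdk.ts, which remains the source of truth:

```ts
// Approximate shapes paraphrased from the reference spec; schema/sdk.ts is authoritative.
type Part =
  | { type: "text"; text: string }
  | { type: "image"; mimeType: string; imageData: string }
  | { type: "audio"; mimeType: string; audioData: string }
  | { type: "source"; title: string; content: Part[] } // SourcePart, used for citations
  | { type: "tool-call"; toolCallId: string; toolName: string; args: unknown }
  | { type: "tool-result"; toolCallId: string; toolName: string; result: unknown; isError?: boolean }
  | { type: "reasoning"; text: string };

interface Message {
  role: "user" | "assistant" | "tool";
  content: Part[];
}

interface LanguageModelInput {
  messages: Message[];
  temperature?: number; // ...and the other sampling parameters
  tools?: unknown[]; // tool definitions
  responseFormat?: unknown; // structured-output hints
}

interface ModelResponse {
  content: Part[];
  usage?: { inputTokens: number; outputTokens: number };
  cost?: number; // present when pricing information was supplied
}

// Streaming yields incremental deltas that accumulate into a ModelResponse.
interface PartialModelResponse {
  delta: { index: number; part: Part };
}

interface LanguageModel {
  generate(input: LanguageModelInput): Promise<ModelResponse>;
  stream(input: LanguageModelInput): AsyncIterable<PartialModelResponse>;
}
```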

LLM Agent

llm-agent wraps the SDK to provide a lightweight agent runtime:

  • Agent objects are stateless blueprints that declare instructions, tools, toolkits, and default model settings.
  • Run sessions bind an agent to a specific context value. Sessions resolve dynamic instructions once, initialize toolkit state, and stream model/tool events back to you. You decide whether to invoke agent.run for one-off calls or manage the session lifecycle yourself for multi-turn workflows.
  • Agent items capture every turn: user/assistant messages, model responses (with usage metadata), and rich tool-call records. Append the output list to the next run’s input to continue a conversation.
  • Streaming mirrors non-streaming responses but emits partial deltas and tool events for real-time UX.
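
A hypothetical end-to-end sketch; the constructor and option names are assumptions, not the exact API:

```ts
// Hypothetical usage sketch; option and helper names are assumptions, not the exact API.
import { Agent } from "@hoangvvo/llm-agent";

const agent = new Agent({
  name: "assistant",
  model, // a LanguageModel instance from llm-sdk (see the SDK sketch above)
  instructions: ["You are a concise, helpful assistant."],
  tools: [],
});

// Items from one run feed the next run to continue the conversation.
let items: unknown[] = [
  { type: "message", role: "user", content: [{ type: "text", text: "Hi there!" }] },
];
const result = await agent.run({ input: items, context: {} });
items = [...items, ...result.output]; // append output items for the next turn
```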

The Agent library intentionally stays small (~500 LOC) and avoids hidden prompt templates. Patterns such as memory, planner–executor, and delegation live in the examples/ folders so you can adapt them without fighting framework defaults.

Getting started

Read the full documentation on llm-sdk.hoangvvo.com or start from the guides there.

Also check out the example agent implementations, such as the memory, planner–executor, and delegation patterns in the examples/ folders.

Note: To run the examples, create a .env file in the root folder (the folder containing this README) with your API keys.
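
For example (the variable names follow each provider's convention; check the individual examples for the exact keys they read):

```
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...
GOOGLE_API_KEY=...
```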

Agent Patterns

This agent library (not framework) is designed for transparency and control. Unlike many “agentic” frameworks, it ships with no hidden prompt templates or secret parsing rules—and that’s on purpose:

  • Nothing hidden – What you write is what runs. No secret prompts or “special sauce” behind the scenes, so your instructions aren’t quietly overridden.
  • Works in any setting – Many frameworks bake in English-only prompts. Here, the model sees only your words, in whatever language or format you choose.
  • Easy to tweak – Change prompts, parsing, or flow without fighting built-in defaults.
  • Less to debug – Fewer layers mean you can trace exactly where things break.
  • No complex abstraction – Don't waste time learning new concepts or APIs (e.g., “chains”, “graphs”, syntax with special meanings, etc.). Just plain functions and data structures.

LLMs in the past were not as powerful as they are today, so frameworks had to do a lot of heavy lifting to get decent results. With modern LLMs, much of that complexity is no longer necessary.

Because we keep the core minimal (only ~500 LOC!) and do not want to introduce such hidden magic, the library doesn't bundle heavy agent patterns like hand-off, memory, or planners. Instead, the examples/ folders show clean, working references you can copy or adapt, demonstrating that the minimal core can still power complex use cases.
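
As an illustration of how such a pattern stays plain, delegation can be just a tool whose handler runs another agent. The names below are hypothetical, not part of the library:

```ts
// Hypothetical delegation pattern: a tool on a triage agent whose handler simply runs
// a specialist agent and returns its output. Plain functions and data, no framework hooks.
const askBillingAgent = {
  name: "ask_billing_agent",
  description: "Hand a billing question to the billing specialist agent.",
  parameters: {
    type: "object",
    properties: { question: { type: "string" } },
    required: ["question"],
  },
  async execute({ question }: { question: string }) {
    // billingAgent is another Agent, configured as in the LLM Agent section above.
    const result = await billingAgent.run({
      input: [
        { type: "message", role: "user", content: [{ type: "text", text: question }] },
      ],
      context: {},
    });
    return result.output; // forward the specialist's answer as the tool result
  },
};
```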

This philosophy is inspired by this blog post.

Comparison with other libraries

The initial version of llm-sdk was developed internally at my company, before similar libraries such as the Vercel AI SDK or OpenAI Swarm existed or came to our attention. As a result, it was never intended to compete with those libraries or address their limitations. As they matured, llm-sdk continued to evolve independently, focusing on the features and use cases it was designed for.

This section outlines the differences for those considering a migration to or from llm-sdk, or assessing compatibility.

TBD.

License

MIT

Footnotes

  1. Source Input (citation) is not supported by all providers and may be converted to compatible inputs instead.
