An Elixir library for integrating and orchestrating large language models (LLMs) via HTTP, supporting OpenAI, Ollama, and other future backends.

doofinder/llm_composer

LlmComposer

LlmComposer is an Elixir library that simplifies the interaction with large language models (LLMs) such as OpenAI's GPT, providing a streamlined way to build and execute LLM-based applications or chatbots. It currently supports multiple model providers, including OpenAI, OpenRouter, Ollama, Bedrock, and Google (Gemini), with features like auto-execution of functions and customizable prompts to cater to different use cases.

Installation

If available in Hex, the package can be installed by adding llm_composer to your list of dependencies in mix.exs:

def deps do
  [
    {:llm_composer, "~> 0.8.0"}
  ]
end

Provider Compatibility

The following table shows which features are supported by each provider:

Feature OpenAI OpenRouter Ollama Bedrock Google
Basic Chat ✅ ✅ ✅ ✅ ✅
Streaming ✅ ✅ ✅ ❌ ✅
Function Calls ✅ ✅ ❌ ❌ ✅
Auto Function Execution ✅ ✅ ❌ ❌ ✅
Structured Outputs ❌ ✅ ❌ ❌ ✅
Fallback Models ❌ ✅ ❌ ❌ ❌
Provider Routing ❌ ✅ ❌ ❌ ❌

Notes:

  • OpenRouter offers the most comprehensive feature set, including unique capabilities like fallback models and provider routing
  • Google provides full feature support including function calls, structured outputs, and streaming with Gemini models
  • Bedrock support is provided through ExAws and requires proper AWS configuration
  • Ollama requires a running Ollama server instance
  • Function Calls require the provider to support OpenAI-compatible function calling format
  • Streaming is not compatible with Tesla retries.
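
If you want to check feature support in code before building settings, the table can be encoded as plain data. This is only a sketch: `ProviderFeatures` and the atoms it uses are names invented here for illustration, not part of LlmComposer, and the lists simply mirror the table above.

```elixir
defmodule ProviderFeatures do
  # Feature support copied from the compatibility table above.
  # Provider and feature atoms are hypothetical names chosen for this sketch.
  @features %{
    openai: [:basic_chat, :streaming, :function_calls, :auto_function_execution],
    open_router: [
      :basic_chat,
      :streaming,
      :function_calls,
      :auto_function_execution,
      :structured_outputs,
      :fallback_models,
      :provider_routing
    ],
    ollama: [:basic_chat, :streaming],
    bedrock: [:basic_chat],
    google: [
      :basic_chat,
      :streaming,
      :function_calls,
      :auto_function_execution,
      :structured_outputs
    ]
  }

  # Returns true when the table lists the feature for the provider;
  # unknown providers support nothing.
  def supports?(provider, feature) do
    feature in Map.get(@features, provider, [])
  end
end

IO.inspect(ProviderFeatures.supports?(:open_router, :fallback_models))
IO.inspect(ProviderFeatures.supports?(:bedrock, :streaming))
```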

Usage

Simple Bot Definition

To create a basic chatbot using LlmComposer, define a module that holds an %LlmComposer.Settings{} struct and passes it to LlmComposer.simple_chat/2. The example below demonstrates a simple configuration with OpenAI as the model provider:

Application.put_env(:llm_composer, :openai_key, "<your api key>")

defmodule MyChat do

  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.OpenAI,
    provider_opts: [model: "gpt-4o-mini"],
    system_prompt: "You are a helpful assistant."
  }

  def simple_chat(msg) do
    LlmComposer.simple_chat(@settings, msg)
  end
end

{:ok, res} = MyChat.simple_chat("hi")

IO.inspect(res.main_response)

Example of execution:

mix run sample.ex

16:41:07.594 [debug] input_tokens=18, output_tokens=9
LlmComposer.Message.new(
  :assistant,
  "Hello! How can I assist you today?"
)

This will trigger a conversation with the assistant based on the provided system prompt.

Using old messages

For more control over the interaction, such as sending the message history so the model keeps the context, you can call the run_completion/3 function directly.

Here’s an example that demonstrates how to use run_completion with a custom message flow:

Application.put_env(:llm_composer, :openai_key, "<your api key>")

defmodule MyCustomChat do

  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.OpenAI,
    provider_opts: [model: "gpt-4o-mini"],
    system_prompt: "You are an assistant specialized in history.",
    functions: []
  }

  def run_custom_chat() do
    # Define a conversation history with user and assistant messages
    messages = [
      LlmComposer.Message.new(:user, "What is the Roman Empire?"),
      LlmComposer.Message.new(:assistant, "The Roman Empire was a period of ancient Roman civilization with an autocratic government."),
      LlmComposer.Message.new(:user, "When did it begin?")
    ]

    {:ok, res} = LlmComposer.run_completion(@settings, messages)

    res.main_response
  end
end

IO.inspect(MyCustomChat.run_custom_chat())

Example of execution:

mix run custom_chat.ex

16:45:10.123 [debug] input_tokens=85, output_tokens=47
LlmComposer.Message.new(
  :assistant,
  "The Roman Empire began in 27 B.C. after the end of the Roman Republic, and it continued until 476 A.D. in the West."
)
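
To keep a conversation going across several turns, the history has to be extended with each user message and each assistant reply before the next call. The sketch below shows that bookkeeping in isolation. `HistorySketch`, the `{role, text}` tuples, and `fake_complete` are stand-ins made up for this example; in a real bot the messages would be LlmComposer.Message structs and the completion function would wrap LlmComposer.run_completion.

```elixir
defmodule HistorySketch do
  # Appends the user's message, asks `complete` for a reply, and returns the
  # reply together with the updated history. Messages are {role, text} tuples
  # here (a simplification for this sketch).
  def ask(history, user_text, complete) when is_function(complete, 1) do
    history = history ++ [{:user, user_text}]
    reply = complete.(history)
    {reply, history ++ [{:assistant, reply}]}
  end
end

# Stand-in for a real completion call: reports how many messages it received.
fake_complete = fn messages -> "I have seen #{length(messages)} message(s)." end

{reply1, history} = HistorySketch.ask([], "What is the Roman Empire?", fake_complete)
{reply2, history} = HistorySketch.ask(history, "When did it begin?", fake_complete)

IO.puts(reply1)
IO.puts(reply2)
IO.inspect(length(history))
```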

Using Ollama Backend

LlmComposer also supports the Ollama backend, allowing interaction with models hosted on Ollama.

Make sure to start the Ollama server first.

# Set the Ollama URI in the application environment if not already configured
# Application.put_env(:llm_composer, :ollama_uri, "http://localhost:11434")

defmodule MyChat do

  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.Ollama,
    provider_opts: [model: "llama3.1"],
    system_prompt: "You are a helpful assistant."
  }

  def simple_chat(msg) do
    LlmComposer.simple_chat(@settings, msg)
  end
end

{:ok, res} = MyChat.simple_chat("hi")

IO.inspect(res.main_response)

Example of execution:

mix run sample_ollama.ex

17:08:34.271 [debug] input_tokens=, output_tokens=
LlmComposer.Message.new(
  :assistant,
  "How can I assist you today?",
  %{
    original: %{
      "content" => "How can I assist you today?",
      "role" => "assistant"
    }
  }
)

Note: Ollama does not provide token usage information, so input_tokens and output_tokens will always be empty in debug logs and response metadata. Function calls are also not supported with Ollama.

Streaming Responses

LlmComposer supports streaming responses for real-time output, which is particularly useful for long-form content generation. This feature works with providers that support streaming (like OpenRouter, OpenAI, and Google).

Note: The stream_response: true setting enables streaming mode. When using streaming, LlmComposer does not track input/output/cache/thinking tokens. There are two approaches to handle token counting in this mode:

  1. Calculate tokens yourself with a library such as tiktoken (for the OpenAI provider).
  2. Read token data from the last stream object if the provider supplies it (currently only OpenRouter supports this).
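
As a rough illustration of the second approach, the sketch below folds a stream of already-parsed chunks into the full text and picks up usage data from the final chunk. The chunk shape (maps with `:text` and an optional `:usage` key) and the `StreamSketch` module are hypothetical stand-ins, not LlmComposer's actual stream format; adapt the field names to whatever your provider emits.

```elixir
defmodule StreamSketch do
  # Folds an enumerable of chunk maps into {full_text, usage}.
  # Hypothetical chunk shape: %{text: binary} with an optional :usage map.
  def collect(chunks) do
    {parts, usage} =
      Enum.reduce(chunks, {[], nil}, fn chunk, {parts, usage} ->
        # Keep the most recent usage map seen so far; if only the last chunk
        # carries one, that is the value returned.
        {[Map.get(chunk, :text, "") | parts], Map.get(chunk, :usage, usage)}
      end)

    {parts |> Enum.reverse() |> IO.iodata_to_binary(), usage}
  end
end

# A lazy stream standing in for the provider's response stream.
chunks =
  Stream.map(
    [
      %{text: "Hello"},
      %{text: ", world"},
      %{text: "!", usage: %{input_tokens: 12, output_tokens: 3}}
    ],
    & &1
  )

{text, usage} = StreamSketch.collect(chunks)
IO.puts(text)
IO.inspect(usage)
```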

Using OpenRouter

LlmComposer supports integration with OpenRouter, giving you access to a variety of LLM models through a single API compatible with OpenAI's interface. It also supports OpenRouter's fallback models feature.

To use OpenRouter with LlmComposer, you'll need to:

  1. Sign up for an API key from OpenRouter
  2. Configure your application to use OpenRouter's endpoint

Here's a complete example:

# Configure the OpenRouter API key
Application.put_env(:llm_composer, :open_router_key, "<your openrouter api key>")

defmodule MyOpenRouterChat do
  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.OpenRouter,
    # Use any model available on OpenRouter
    provider_opts: [
      model: "anthropic/claude-3-sonnet",
      models: ["openai/gpt-4o", "fallback-model2"],
      provider_routing: %{
        order: ["openai", "azure"]
      }
    ],
    system_prompt: "You are a SAAS consultant"
  }

  def simple_chat(msg) do
    LlmComposer.simple_chat(@settings, msg)
  end
end

{:ok, res} = MyOpenRouterChat.simple_chat("Why doofinder is so awesome?")

IO.inspect(res.main_response)

Example of execution:

mix run openrouter_sample.ex

17:12:45.124 [debug] input_tokens=42, output_tokens=156
LlmComposer.Message.new(
  :assistant,
  "Doofinder is an excellent site search solution for ecommerce websites. Here are some reasons why Doofinder is considered awesome:...
)

Structured Outputs

OpenRouter, Google, and OpenAI support structured outputs through a response_format entry in the provider options. This lets the model return responses that conform to a defined JSON schema, which is helpful for applications that require strict formatting and validation of the output.

To use structured outputs, include the response_format key inside your provider_opts in the settings, like this:

settings = %LlmComposer.Settings{
  provider: LlmComposer.Providers.OpenRouter,
  provider_opts: [
    model: "google/gemini-2.5-flash",
    response_format: %{
      type: "json_schema",
      json_schema: %{
        name: "my_response",
        strict: true,
        schema: %{
          "type" => "object",
          "properties" => %{
            "answer" => %{"type" => "string"},
            "confidence" => %{"type" => "number"}
          },
          "required" => ["answer"]
        }
      }
    }
  ]
}

The model will then produce responses that adhere to the specified JSON schema, making it easier to parse and handle results programmatically.
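
Even with a strict schema, it can be worth checking the result on your side. Once the JSON content has been decoded into a map (for example with Jason), a light check against the schema shown earlier might look like the sketch below. `StructuredCheck` is a hypothetical helper, not part of LlmComposer, and it covers only the two fields from that schema.

```elixir
defmodule StructuredCheck do
  # Validates a decoded response against the example schema:
  # "answer" is a required string, "confidence" an optional number.
  def validate(%{"answer" => answer} = decoded) when is_binary(answer) do
    case Map.fetch(decoded, "confidence") do
      :error -> {:ok, decoded}
      {:ok, confidence} when is_number(confidence) -> {:ok, decoded}
      {:ok, _other} -> {:error, :invalid_confidence}
    end
  end

  # Anything without a string "answer" is rejected.
  def validate(_other), do: {:error, :missing_or_invalid_answer}
end

IO.inspect(StructuredCheck.validate(%{"answer" => "Paris", "confidence" => 0.9}))
IO.inspect(StructuredCheck.validate(%{"confidence" => 0.9}))
```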

Note: This feature is currently supported only on the OpenRouter and Google providers in llm_composer.

Using AWS Bedrock

LlmComposer also integrates with Bedrock via its Converse API. This allows you to use Bedrock as a provider with any of its supported models.

Currently, function execution is not supported with Bedrock.

To integrate with Bedrock, LlmComposer uses ex_aws to perform its requests. If you plan to use Bedrock, make sure you have configured ex_aws as described in that library's official documentation.

Here's a complete example:

# In your config files:
config :ex_aws,
  access_key_id: "your key",
  secret_access_key: "your secret"
---

defmodule MyBedrockChat do
  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.Bedrock,
    # Use any model available on Bedrock
    provider_opts: [model: "eu.amazon.nova-lite-v1:0"],
    system_prompt: "You are an expert in Quantum Field Theory."
  }

  def simple_chat(msg) do
    LlmComposer.simple_chat(@settings, msg)
  end
end

{:ok, res} = MyBedrockChat.simple_chat("What is the wave function collapse? Just a few sentences")

IO.inspect(res.main_response)

Example of execution:

%LlmComposer.Message{
  type: :assistant,
  content: "Wave function collapse is a concept in quantum mechanics that describes the transition of a quantum system from a superposition of states to a single definite state upon measurement. This phenomenon is often associated with the interpretation of quantum mechanics, particularly the Copenhagen interpretation, and it remains a topic of ongoing debate and research in the field."
}

Using Google (Gemini)

LlmComposer supports Google's Gemini models through the Google AI API. This provider offers comprehensive features including function calls, streaming responses, auto function execution, and structured outputs.

To use Google with LlmComposer, you'll need to:

  1. Get an API key from Google AI Studio
  2. Configure your application with the Google API key

Basic Google Chat Example

# Configure the Google API key
Application.put_env(:llm_composer, :google_key, "<your google api key>")

defmodule MyGoogleChat do
  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.Google,
    provider_opts: [model: "gemini-2.5-flash"],
    system_prompt: "You are a helpful assistant."
  }

  def simple_chat(msg) do
    LlmComposer.simple_chat(@settings, msg)
  end
end

{:ok, res} = MyGoogleChat.simple_chat("What is quantum computing?")

IO.inspect(res.main_response)

Note: Google provider supports all major LlmComposer features including function calls, structured outputs, and streaming. The provider uses Google's Gemini models and requires a Google AI API key.

Using Vertex AI

LlmComposer also supports Google's Vertex AI platform, which provides enterprise-grade AI capabilities with enhanced security and compliance features. Vertex AI requires OAuth 2.0 authentication via the Goth library.

Dependencies

Add Goth to your dependencies for Vertex AI authentication:

def deps do
  [
    {:llm_composer, "~> 0.8.0"},
    {:goth, "~> 1.4"}  # Required for Vertex AI
  ]
end

Service Account Setup

  1. Create a service account in Google Cloud Console
  2. Grant the following IAM roles:
    • Vertex AI User or Vertex AI Service Agent
    • Service Account Token Creator (if using impersonation)
  3. Download the JSON credentials file

Basic Vertex AI Example

# Read service account credentials
google_json = File.read!(Path.expand("~/path/to/service-account.json"))
credentials = Jason.decode!(google_json)

# Optional: Configure HTTP client for Goth with retries
http_client = fn opts ->
  client = Tesla.client([
    {Tesla.Middleware.Retry,
     delay: 500,
     max_retries: 2,
     max_delay: 1_000,
     should_retry: fn
       {:ok, %{status: status}}, _env, _context when status in [400, 500] -> true
       {:ok, _reason}, _env, _context -> false
       {:error, _reason}, %Tesla.Env{method: :post}, _context -> false
       {:error, _reason}, %Tesla.Env{method: :put}, %{retries: 2} -> false
       {:error, _reason}, _env, _context -> true
     end
   }
  ])
  Tesla.request(client, opts)
end

# Start Goth process
{:ok, _pid} = Goth.start_link([
  source: {:service_account, credentials},
  http_client: http_client,  # Optional: improves reliability
  name: MyApp.Goth
])

# Configure LlmComposer to use your Goth process
Application.put_env(:llm_composer, :google_goth, MyApp.Goth)

defmodule MyVertexChat do
  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.Google,
    provider_opts: [
      model: "gemini-2.5-flash",
      vertex: %{
        project_id: "my-gcp-project",
        location_id: "global"  # or specific region like "us-central1"
      }
    ],
    system_prompt: "You are a helpful assistant."
  }

  def simple_chat(msg) do
    LlmComposer.simple_chat(@settings, msg)
  end
end

{:ok, res} = MyVertexChat.simple_chat("What are the benefits of Vertex AI?")

IO.inspect(res.main_response)

Production Setup with Supervision Tree

For production applications, add Goth to your supervision tree:

# In your application.ex
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    google_json = File.read!(Application.get_env(:my_app, :google_credentials_path))
    credentials = Jason.decode!(google_json)

    children = [
      # Other children...
      {Goth, name: MyApp.Goth, source: {:service_account, credentials}},
      # ... rest of your children
    ]

    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

# Configure in config.exs
config :llm_composer, :google_goth, MyApp.Goth
config :my_app, :google_credentials_path, "/path/to/service-account.json"

Vertex AI Configuration Options

The :vertex map accepts the following options:

  • :project_id (required) - Your Google Cloud project ID
  • :location_id (required) - The location/region for your Vertex AI endpoint (e.g., "us-central1", "global")
  • :api_endpoint (optional) - Custom API endpoint (overrides default regional endpoint)

Note: Vertex AI provides the same feature set as Google AI API but with enterprise security, audit logging, and VPC support. All LlmComposer features including function calls, streaming, and structured outputs are fully supported.

Bot with external function call

You can enhance the bot's capabilities by adding support for external function execution. This example demonstrates how to add a simple calculator that evaluates basic math expressions:

Application.put_env(:llm_composer, :openai_key, "<your api key>")

defmodule MyChat do

  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.OpenAI,
    provider_opts: [model: "gpt-4o-mini"],
    system_prompt: "You are a helpful math assistant that assists with calculations.",
    auto_exec_functions: true,
    functions: [
      %LlmComposer.Function{
        mf: {__MODULE__, :calculator},
        name: "calculator",
        description: "A calculator that accepts math expressions as strings, e.g., '1 * (2 + 3) / 4', supporting the operators ['+', '-', '*', '/'].",
        schema: %{
          type: "object",
          properties: %{
            expression: %{
              type: "string",
              description: "A math expression to evaluate, using '+', '-', '*', '/'.",
              example: "1 * (2 + 3) / 4"
            }
          },
          required: ["expression"]
        }
      }
    ]
  }

  def simple_chat(msg) do
    LlmComposer.simple_chat(@settings, msg)
  end

  @spec calculator(map()) :: number() | {:error, String.t()}
  def calculator(%{"expression" => expression}) do
    # Basic validation pattern to prevent arbitrary code execution
    pattern = ~r/^[0-9\.\s\+\-\*\/\(\)]+$/

    if Regex.match?(pattern, expression) do
      try do
        {result, _binding} = Code.eval_string(expression)
        result
      rescue
        _ -> {:error, "Invalid expression"}
      end
    else
      {:error, "Invalid expression format"}
    end
  end
end

{:ok, res} = MyChat.simple_chat("hi, how much is 1 + 2?")

IO.inspect(res.main_response)

Example of execution:

mix run functions_sample.ex

16:38:28.338 [debug] input_tokens=111, output_tokens=17

16:38:28.935 [debug] input_tokens=136, output_tokens=9
LlmComposer.Message.new(
  :assistant,
  "1 + 2 is 3."
)

In this example, the bot first calls OpenAI to understand the user's intent and determine that a function (the calculator) should be executed. The function is then executed locally, and the result is sent back to the user in a second API call.
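
Because the calculator is an ordinary function, it can be exercised without any API call. The sketch below copies it into a standalone module (`CalculatorSketch`, a name chosen here for illustration) to show what the string-keyed arguments map looks like and how invalid input is rejected.

```elixir
defmodule CalculatorSketch do
  # Same logic as MyChat.calculator/1 above, isolated for local testing.
  def calculator(%{"expression" => expression}) do
    # Allow only digits, dots, whitespace, parentheses and + - * /
    pattern = ~r/^[0-9\.\s\+\-\*\/\(\)]+$/

    if Regex.match?(pattern, expression) do
      try do
        {result, _binding} = Code.eval_string(expression)
        result
      rescue
        # Syntax errors and arithmetic errors both end up here
        _ -> {:error, "Invalid expression"}
      end
    else
      {:error, "Invalid expression format"}
    end
  end
end

# Arguments arrive as a map with string keys, as in the function schema.
IO.inspect(CalculatorSketch.calculator(%{"expression" => "1 * (2 + 3) / 4"}))
IO.inspect(CalculatorSketch.calculator(%{"expression" => "1 / 0"}))
IO.inspect(CalculatorSketch.calculator(%{"expression" => "System.halt()"}))
```

The regex check runs before Code.eval_string, so anything containing letters never reaches the evaluator.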

Cost Tracking

LlmComposer provides built-in cost tracking (currently for the OpenRouter backend only) to monitor token usage and the associated costs. This feature helps you keep track of API expenses and optimize your usage.

Requirements

To use cost tracking, you need:

  1. Decimal package: Add {:decimal, "~> 2.0"} to your dependencies in mix.exs
  2. Cache backend: A cache implementation for storing cost data (LlmComposer provides an ETS-based cache by default, or you can implement a custom one using LlmComposer.Cache.Behaviour)

Basic Cost Tracking Example

Application.put_env(:llm_composer, :open_router_key, "<your openrouter api key>")

defmodule MyCostTrackingChat do
  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.OpenRouter,
    provider_opts: [model: "meta-llama/llama-3.2-3b-instruct"],
    system_prompt: "You are a helpful assistant.",
    track_costs: true
  }

  def run_chat_with_costs() do
    messages = [
      %LlmComposer.Message{type: :user, content: "How much is 1 + 1?"}
    ]
    
    {:ok, res} = LlmComposer.run_completion(@settings, messages)
    
    # Access cost information from the response
    IO.puts("Input tokens: #{res.input_tokens}")
    IO.puts("Output tokens: #{res.output_tokens}")
    IO.puts("Total cost: #{Decimal.to_string(res.metadata.total_cost, :normal)}$")
    
    res
  end
end

# Start the cache backend (required for cost tracking)
# The default ETS cache can be overridden by configuring a custom cache module:
#
# config :llm_composer, cache_mod: MyCustomCache
#
# Your custom cache module must implement the LlmComposer.Cache.Behaviour
# which defines callbacks for get/1, put/3, and delete/1 operations.
{:ok, _} = LlmComposer.Cache.Ets.start_link()

MyCostTrackingChat.run_chat_with_costs()
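
For the custom cache mentioned in the comments above, a minimal Agent-backed module exposing get/1, put/3, and delete/1 could look like the sketch below. Only the function names and arities come from the description above. The return values (`{:ok, value}` and `:miss`) and the meaning of the third argument of put/3 (treated here as a TTL in seconds) are assumptions, so check LlmComposer.Cache.Behaviour before relying on them. In a real project you would also declare `@behaviour LlmComposer.Cache.Behaviour`.

```elixir
defmodule MyCustomCache do
  # Sketch of a cache with get/1, put/3 and delete/1, backed by an Agent.
  # Return shapes and the TTL argument are assumptions made for this example.
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Returns {:ok, value} for a live entry, :miss otherwise.
  def get(key) do
    now = System.monotonic_time(:second)

    case Agent.get(__MODULE__, &Map.get(&1, key)) do
      {value, expires_at} when expires_at > now -> {:ok, value}
      _missing_or_expired -> :miss
    end
  end

  # Stores the value together with its expiry time.
  def put(key, value, ttl_seconds) do
    expires_at = System.monotonic_time(:second) + ttl_seconds
    Agent.update(__MODULE__, &Map.put(&1, key, {value, expires_at}))
  end

  def delete(key) do
    Agent.update(__MODULE__, &Map.delete(&1, key))
  end
end

{:ok, _pid} = MyCustomCache.start_link()
:ok = MyCustomCache.put("pricing:example-model", %{input: "0.10"}, 60)
IO.inspect(MyCustomCache.get("pricing:example-model"))
```

The key name and the cached value in the usage lines are made up; what LlmComposer actually stores in the cache is not specified here.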

Starting Cache in a Supervision Tree

For production applications, you should start the cache as part of your application's supervision tree:

# In your application.ex file
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    children = [
      # Other supervisors/workers...
      LlmComposer.Cache.Ets,
      # ... rest of your children
    ]

    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

Dependencies Setup

Add the decimal dependency to your mix.exs:

def deps do
  [
    {:llm_composer, "~> 0.8.0"},
    {:decimal, "~> 2.3"}  # Required for cost tracking
  ]
end

Note: Cost tracking calculates expenses based on the provider's pricing model and token usage. The cache backend stores pricing information to avoid repeated lookups and improve performance.

Additional Features

  • Auto Function Execution: Automatically executes predefined functions, reducing manual intervention.
  • System Prompts: Customize the assistant's behavior by modifying the system prompt (e.g., creating different personalities or roles for your bot).

Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/llm_composer.
