[BUG] on_new_message callback fires after first chunk received instead of before API request #367

@andreaslillebo

Description

Basic checks

  • I searched existing issues - this hasn't been reported
  • I can reproduce this consistently
  • This is a RubyLLM bug, not my application code

What's broken?

The documentation states that the `on_new_message` callback is called "just before the API request". In practice this does not appear to be the case, though it may go unnoticed unless you are using a reasoning model like GPT-5.

As a result, during the thinking/reasoning phase there is no empty assistant message we can use to show the user a thinking state. Creating one in the controller is not a good workaround either: it would be included in the messages sent to the LLM when `chat.complete` runs in the background job, and `chat.complete` would then create another empty message when it receives the first chunk of data.
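To make the workaround's failure mode concrete, here is an illustration-only sketch of the double-message problem described above. `messages` stands in for `chat.messages`; none of the names below are RubyLLM API.

```ruby
# Illustration only: why pre-creating an empty assistant message in the
# controller (to show a "thinking" state) backfires, per the report.
messages = [{ role: "user", content: "Hello" }]

# Workaround: controller creates an empty assistant placeholder...
messages << { role: "assistant", content: "" }

# ...but the background job's chat.complete would send the current
# message list, placeholder included, to the LLM:
payload = messages.dup

# ...and then create a second empty assistant message itself when the
# first chunk arrives:
messages << { role: "assistant", content: "" }

empty_assistant_count = messages.count { |m| m[:role] == "assistant" && m[:content].empty? }
# empty_assistant_count => 2 (one stale placeholder, one created by complete)
```

So the placeholder both pollutes the prompt and ends up duplicated, which is why the fix needs to live in `chat.complete` itself.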

How to reproduce

Note: This appears to only really be an issue for reasoning models like GPT-5, since they may think for a while before emitting the first chunk.

```ruby
# In controller:

@chat = Chat.find(params[:chat_id])

# Create and persist the user message immediately
@chat.create_user_message(params[:content])

# Process AI response in background
ChatStreamJob.perform_later(@chat.id)
```

```ruby
# In background job:

def perform(chat_id)
  chat = Chat.find(chat_id)

  chat.on_new_message do
    puts "Assistant is typing..."
  end

  # Expected to see "Assistant is typing..." in the console right after `chat.complete` executes
  chat.complete do |chunk|
    # GPT-5 has been thinking for 30 seconds. When the first chunk is received, I finally see "Assistant is typing..."
    assistant_message = chat.messages.last
    if chunk.content && assistant_message
      assistant_message.broadcast_append_chunk(chunk.content)
    end
  end
end
```
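The two callback timings can be simulated without RubyLLM at all. The `FakeChat` class below is a hypothetical stand-in (not RubyLLM's implementation) that records event order for both the documented behavior and the behavior observed above:

```ruby
# Hypothetical stand-in for RubyLLM's Chat, illustrating the two callback
# timings described in this report. FakeChat is NOT part of RubyLLM.
class FakeChat
  attr_reader :events

  def initialize(fire_before_request:)
    @fire_before_request = fire_before_request
    @events = []
  end

  def on_new_message(&block)
    @on_new_message = block
  end

  def complete
    # Documented behavior: callback runs just before the API request.
    run_callback if @fire_before_request
    @events << :api_request_started
    @events << :first_chunk_received # with GPT-5 this can arrive 30s+ later
    # Observed behavior: callback runs only once the first chunk arrives.
    run_callback unless @fire_before_request
  end

  private

  def run_callback
    @on_new_message&.call
    @events << :on_new_message_fired
  end
end

documented = FakeChat.new(fire_before_request: true)
documented.on_new_message { }
documented.complete
documented.events
# => [:on_new_message_fired, :api_request_started, :first_chunk_received]

observed = FakeChat.new(fire_before_request: false)
observed.on_new_message { }
observed.complete
observed.events
# => [:api_request_started, :first_chunk_received, :on_new_message_fired]
```

With the observed ordering, everything the callback does (like creating the empty assistant message) is stalled behind the model's entire thinking phase.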

Expected behavior

`chat.complete` calls the `on_new_message` callback before the API request, so a new empty assistant message is created immediately.

What actually happened

`chat.complete` calls the `on_new_message` callback only after the first chunk has been received from the API, potentially resulting in a very long delay (30+ seconds) before the message is created.

Environment

  • Ruby version: 3.4.5
  • RubyLLM version: 1.6.4
  • Provider: OpenAI
  • Model: GPT-5
  • OS: macOS
