yashv6655
Description

Fixes a bug where trim_messages with strategy="last" could create invalid message histories by orphaning ToolMessages when their corresponding AIMessage with tool_calls was trimmed away.

Issue

When trimming message history, if a ToolMessage was included in the trimmed result but its corresponding AIMessage (containing the tool call that the ToolMessage responds to) was removed, this created an orphaned ToolMessage with a tool_call_id that references a non-existent tool call. This invalid message history would be rejected by most LLM APIs.

Fix

Added a _remove_orphaned_tool_messages() helper function that:

  1. Scans the trimmed messages for all valid tool_call_ids from AIMessages
  2. Filters out any ToolMessages whose tool_call_id doesn't match a valid tool call
  3. Returns a cleaned message list with orphaned ToolMessages removed

This function is called in _first_max_tokens() before returning, which fixes both strategy="first" and strategy="last" (since "last" internally uses "first" with reversed messages).
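
The three steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the message classes are hypothetical dataclass stand-ins for the langchain-core types (so the snippet runs without the library), and the function name drops the leading underscore to signal that it is not the real helper.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for langchain-core's message classes, so the
# sketch runs without the library installed.
@dataclass
class HumanMessage:
    content: str


@dataclass
class AIMessage:
    content: str = ""
    tool_calls: list = field(default_factory=list)


@dataclass
class ToolMessage:
    content: str
    tool_call_id: str


def remove_orphaned_tool_messages(messages: list) -> list:
    """Drop ToolMessages whose tool_call_id has no matching AIMessage tool call."""
    # Step 1: collect every tool call ID announced by an AIMessage.
    valid_ids = {
        call["id"]
        for msg in messages
        if isinstance(msg, AIMessage)
        for call in msg.tool_calls
        if call.get("id") is not None
    }
    # Steps 2 and 3: keep everything except ToolMessages with unknown IDs.
    return [
        msg
        for msg in messages
        if not isinstance(msg, ToolMessage) or msg.tool_call_id in valid_ids
    ]


trimmed = [
    ToolMessage("42", tool_call_id="abc123"),  # its AIMessage was trimmed away
    HumanMessage("thanks"),
    AIMessage(tool_calls=[{"id": "def456", "name": "lookup", "args": {}}]),
    ToolMessage("ok", tool_call_id="def456"),
]
cleaned = remove_orphaned_tool_messages(trimmed)
```

In this sketch the filter preserves message order and only ever removes ToolMessages, so a history that was already valid passes through unchanged.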

Example

Before (broken):

trimmed_messages = trim_messages(messages, strategy="last", token_counter=len, max_tokens=5)
# Returns: [ToolMessage(tool_call_id="abc123"), HumanMessage(...), ...]
# Invalid! ToolMessage references a tool call that's not in the trimmed history

After (fixed):

trimmed_messages = trim_messages(messages, strategy="last", token_counter=len, max_tokens=5)
# Returns: [HumanMessage(...), AIMessage(...), ...]
# Valid! Orphaned ToolMessage was automatically removed

Issue

Resolves #33245

Dependencies

None - this is a pure bug fix with no new dependencies.


Testing

  • Added 5 comprehensive unit tests covering various orphaning scenarios

@yashv6655 yashv6655 requested a review from eyurtsev as a code owner October 4, 2025 01:13
@github-actions github-actions bot added core Related to the package `langchain-core` and removed core Related to the package `langchain-core` labels Oct 4, 2025
@yashv6655 yashv6655 changed the title Fix(core): remove orphaned ToolMessages in trim_message fix(core): remove orphaned ToolMessages in trim_message Oct 4, 2025
@github-actions github-actions bot added the core Related to the package `langchain-core` label Oct 4, 2025
codspeed-hq bot commented Oct 4, 2025

CodSpeed WallTime Performance Report

Merging #33265 will not alter performance

Comparing yashv6655:fix/core/trim-messages-tool-call-orphaning (891903a) with master (7f5be6b) [1]

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 13 untouched

Footnotes

  1. No successful run was found on master (46b87e4) during the generation of this report, so 7f5be6b was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@eyurtsev
Collaborator

eyurtsev commented Oct 4, 2025

Hi @yashv6655! Thank you for the PR. I haven't reviewed it in detail yet, but I noticed that the example in the issue is missing a parameter.

If you look at the how-to docs:

https://python.langchain.com/docs/how_to/trim_messages/#trimming-based-on-message-count

We recommend setting end_on explicitly so that only valid chat histories are produced.

Could you confirm whether this resolves the issue for you?

I'm basically wondering whether this is a bug vs. a devx issue (i.e., the API isn't intuitive).

@eyurtsev eyurtsev self-assigned this Oct 4, 2025
@yashv6655
Author

@eyurtsev Thanks for the feedback! You're right that end_on=("human", "tool") prevents the specific issue in the bug report.

However, I believe the fix is still needed:

1. The parameter is optional

The bug report used:

trim_messages(messages, strategy="last", token_counter=len, max_tokens=5)

The docs recommend end_on=("human", "tool"), but it's optional, and users may not realize it's necessary to prevent invalid histories.

2. end_on doesn't prevent all orphaning cases

Even with proper usage, orphaned ToolMessages can still occur:

messages = [
    HumanMessage("start"),
    AIMessage(tool_calls=[{"id": "tool1", ...}]),
    ToolMessage(tool_call_id="tool1"),
    AIMessage(tool_calls=[{"id": "tool2", ...}]),
    ToolMessage(tool_call_id="tool2"),
    HumanMessage("end"),
]

trim_messages(
    messages,
    max_tokens=4,
    token_counter=len,
    strategy="last",
    end_on=("human", "tool"),
)
# Result: [ToolMessage(tool1), AIMessage(tool2), ToolMessage(tool2), HumanMessage]
# ToolMessage(tool1) is orphaned

end_on controls the final message type, but doesn't prevent orphaning in the middle of the trimmed sequence.
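
To make the point concrete, here is a small, hedged sketch of a position-aware check run against the four-message result listed above. The dict shape (`role`, `tool_call_id`, `tool_call_ids`) and the function name are hypothetical simplifications for illustration, not langchain-core's message format or API.

```python
def find_orphaned_tool_call_ids(history):
    """Return IDs of tool messages that have no earlier matching AI tool call."""
    seen = set()
    orphans = []
    for msg in history:
        if msg["role"] == "ai":
            # Remember every tool call this AI message announces.
            seen.update(msg.get("tool_call_ids", []))
        elif msg["role"] == "tool" and msg["tool_call_id"] not in seen:
            orphans.append(msg["tool_call_id"])
    return orphans


# The result shown above, in simplified (hypothetical) dict form.
result = [
    {"role": "tool", "tool_call_id": "tool1"},
    {"role": "ai", "tool_call_ids": ["tool2"]},
    {"role": "tool", "tool_call_id": "tool2"},
    {"role": "human"},
]

orphans = find_orphaned_tool_call_ids(result)
# What an end_on-style check looks at: only the type of the last message.
ends_on_allowed_type = result[-1]["role"] in ("human", "tool")
```

Here the last message has an allowed type, yet `orphans` still contains `"tool1"`, because the orphan is not the final message and a check on the final message type alone never inspects it.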

Successfully merging this pull request may close these issues.

trim_messages returning invalid messages history