Labels: Feature request
Initial Checks
- I confirm that I'm using the latest version of Pydantic AI
- I confirm that I searched for my issue in https://github.com/pydantic/pydantic-ai/issues before opening this issue
Description
The request_tokens_limit check happens after the model returns a response, by which time the tokens have already been spent and a response has already been produced; the UsageLimitExceeded exception then prevents that response from being returned.
Ideally this check would happen before the request is made, which would require estimating tokens on the client side, e.g. using tiktoken for OpenAI models or similar (a sketch follows the traceback below).
If a pre-request check is deemed infeasible, perhaps the response could be returned nonetheless, since it has already been produced, and only subsequent requests would raise the exception.
e.g.:
Traceback (most recent call last):
File "/Users/mcbob/.venv/lib/python3.11/site-packages/opentelemetry/trace/__init__.py", line 587, in use_span
yield span
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_graph/graph.py", line 261, in iter
yield GraphRun[StateT, DepsT, RunEndT](
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_ai/agent.py", line 683, in iter
yield agent_run
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_ai/agent.py", line 451, in run
async for _ in agent_run:
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_ai/agent.py", line 1798, in __anext__
next_node = await self._graph_run.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_graph/graph.py", line 810, in __anext__
return await self.next(self._next_node)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_graph/graph.py", line 783, in next
self._next_node = await node.run(ctx)
^^^^^^^^^^^^^^^^^^^
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_ai/_agent_graph.py", line 270, in run
return await self._make_request(ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_ai/_agent_graph.py", line 329, in _make_request
return self._finish_handling(ctx, model_response, request_usage)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_ai/_agent_graph.py", line 356, in _finish_handling
ctx.deps.usage_limits.check_tokens(ctx.state.usage)
File "/Users/mcbob/.venv/lib/python3.11/site-packages/pydantic_ai/usage.py", line 112, in check_tokens
raise UsageLimitExceeded(
pydantic_ai.exceptions.UsageLimitExceeded: Exceeded the request_tokens_limit of 5000 (request_tokens=11725)
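As a rough illustration of the proposed pre-request check, here is a minimal sketch using tiktoken. The estimate_request_tokens and check_tokens_before_request helpers are hypothetical (not part of Pydantic AI), it assumes an OpenAI model whose name tiktoken can map to an encoding, and it ignores per-message overhead and tool schemas:
import tiktoken

from pydantic_ai.exceptions import UsageLimitExceeded


def estimate_request_tokens(texts: list[str], model: str = 'gpt-4o') -> int:
    # Rough client-side estimate of prompt tokens; 'gpt-4o' requires a
    # tiktoken version that knows the model-to-encoding mapping.
    encoding = tiktoken.encoding_for_model(model)
    return sum(len(encoding.encode(text)) for text in texts)


def check_tokens_before_request(texts: list[str], request_tokens_limit: int) -> None:
    estimated = estimate_request_tokens(texts)
    if estimated > request_tokens_limit:
        # Raised *before* any tokens are spent, unlike the current check
        # in pydantic_ai/usage.py, which runs after the response arrives.
        raise UsageLimitExceeded(
            f'Estimated request_tokens={estimated} would exceed '
            f'the request_tokens_limit of {request_tokens_limit}'
        )
The estimate would only ever be approximate (provider-side tokenization of tool definitions and message framing differs), so it might make sense as an opt-in check rather than a replacement for the server-reported usage.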
Example Code
from pydantic_ai.usage import UsageLimits

agent_response = await self.agent.run(
    user_prompt=last_user_message,
    message_history=history,
    usage_limits=UsageLimits(request_tokens_limit=5000),
)
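Under the current behavior the best one can do is catch the exception, though the already-produced response is still lost; a minimal sketch, assuming the same self.agent, last_user_message, and history as above:
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

try:
    agent_response = await self.agent.run(
        user_prompt=last_user_message,
        message_history=history,
        usage_limits=UsageLimits(request_tokens_limit=5000),
    )
except UsageLimitExceeded:
    # The provider already consumed the tokens, but the response is not
    # accessible here, which is the behavior this issue asks to change.
    agent_response = None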
Python, Pydantic AI & LLM client version
0.2.6