Closed
Description
For Gemini models accessed via VertexAI, the completion_tokens field of the UsageInfo object seems to contain the number of input tokens, but I think it should be the number of output tokens.
Here is an example that demonstrates the problem:
    val modelId = "gemini-2.0-flash"

    val completionService: OpenAIChatCompletionService =
      VertexAIServiceFactory.asOpenAI(
        projectId = ...,
        location = ...
      )

    val response = Await.result(
      completionService.createChatCompletion(
        Seq(domain.UserMessage("Please output the exact string 'ABC' excluding the quotes and nothing else")),
        CreateChatCompletionSettings(modelId)
      ),
      Duration.Inf
    )

    val usage = response.usage.get
    val inputTokens = usage.prompt_tokens // returns 14 (OK)
    val outputTokens = usage.completion_tokens.get // returns 14 (EXPECTED: 2)
    val totalTokens = usage.total_tokens // returns 16 (OK)

This seems to be a regression, as this piece of code was returning correct numbers in version 1.1.0.
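The numbers above can also be checked against each other. This is a minimal sketch, assuming that total_tokens is simply prompt_tokens plus completion_tokens for this model, which may not hold if the provider counts other token types. The `Usage` case class and the `UsageCheck` helpers are hypothetical stand-ins written for this illustration and are not part of the library.

```scala
// Hypothetical stand-in for the library's UsageInfo, holding only the
// three fields quoted in the report (names adapted to camelCase).
final case class Usage(promptTokens: Int, completionTokens: Option[Int], totalTokens: Int)

object UsageCheck {
  // True when the reported counts add up: prompt + completion == total.
  // A missing completion count is treated as consistent (nothing to compare).
  // Assumption: no other token types contribute to the total.
  def isConsistent(u: Usage): Boolean =
    u.completionTokens.forall(c => u.promptTokens + c == u.totalTokens)

  // Output-token count derived from the two fields the report marks as correct.
  def derivedCompletionTokens(u: Usage): Int =
    u.totalTokens - u.promptTokens

  def main(args: Array[String]): Unit = {
    // Values taken from the report: 14 prompt, 14 completion, 16 total.
    val reported = Usage(promptTokens = 14, completionTokens = Some(14), totalTokens = 16)
    println(isConsistent(reported))            // false
    println(derivedCompletionTokens(reported)) // 2
  }
}
```

Under that assumption, the reported values fail the check (14 + 14 is not 16), and the derived output count matches the expected value of 2, which suggests the prompt count is being copied into the completion field.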