Incorrect completion tokens in UsageInfo for Gemini via VertexAI #96

For Gemini models accessed via VertexAI, the `completion_tokens` field of the `UsageInfo` object appears to contain the number of _input_ tokens. It should contain the number of _output_ tokens.

Here is a minimal example that demonstrates the problem:
```scala
import scala.concurrent.Await
import scala.concurrent.duration.Duration

val modelId = "gemini-2.0-flash"

val completionService: OpenAIChatCompletionService =
  VertexAIServiceFactory.asOpenAI(
    projectId = ...,
    location = ...
  )

// Block until the chat completion finishes so the usage can be inspected
val response = Await.result(
  completionService.createChatCompletion(
    Seq(domain.UserMessage("Please output the exact string 'ABC' excluding the quotes and nothing else")),
    CreateChatCompletionSettings(modelId)
  ),
  Duration.Inf
)

val usage = response.usage.get

val inputTokens = usage.prompt_tokens           // returns 14 (OK)
val outputTokens = usage.completion_tokens.get  // returns 14 (EXPECTED: 2)
val totalTokens = usage.total_tokens            // returns 16 (OK)
```

This appears to be a regression: the same code returned the correct numbers in version 1.1.0.
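
As a stopgap, the numbers above suggest a simple consistency check and workaround: `prompt_tokens` and `total_tokens` look correct, so the output count can be derived as their difference. The sketch below is only an illustration under that assumption. `Usage` is a hypothetical stand-in for the library's `UsageInfo` (it is not the real class), and `isConsistent` and `derivedCompletionTokens` are made-up helper names.

```scala
object UsageCheck {
  // Hypothetical stand-in for UsageInfo; only the three fields used in this issue.
  final case class Usage(
    prompt_tokens: Int,
    total_tokens: Int,
    completion_tokens: Option[Int]
  )

  // True when the reported counts add up: prompt + completion == total.
  // A missing completion count is treated as consistent (nothing to contradict).
  def isConsistent(u: Usage): Boolean =
    u.completion_tokens.forall(c => u.prompt_tokens + c == u.total_tokens)

  // Workaround: derive the output token count from the two fields that look correct.
  // Assumes total_tokens and prompt_tokens are reported correctly.
  def derivedCompletionTokens(u: Usage): Int =
    u.total_tokens - u.prompt_tokens

  // The values observed in the example above.
  val reported: Usage =
    Usage(prompt_tokens = 14, total_tokens = 16, completion_tokens = Some(14))

  def main(args: Array[String]): Unit = {
    println(isConsistent(reported))            // false: 14 + 14 != 16
    println(derivedCompletionTokens(reported)) // 2
  }
}
```

With the reported values the check fails and the derived count is 2, which matches the expected number of output tokens.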
