fix: add cache reporting support for OpenAI-Native provider #7602
Conversation
- Add `normalizeUsage` method to properly extract cache tokens from Responses API
- Support both detailed token shapes (`input_tokens_details`) and legacy fields
- Calculate cache read/write tokens with proper fallbacks
- Include reasoning tokens when available in `output_tokens_details`
- Ensure accurate cost calculation using uncached input tokens

This fixes the issue where caching information was not being reported when using the OpenAI-Native provider with the Responses API.
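The normalization described above can be sketched as follows. This is a simplified illustration, not the PR's actual implementation: the `NormalizedUsage` shape and fallback ordering are assumptions, while the snake_case field names follow the OpenAI Responses API and legacy Chat Completions usage payloads.

```typescript
// Illustrative sketch of usage normalization with legacy fallbacks.
// NormalizedUsage is a hypothetical output shape, not the provider's real type.
interface NormalizedUsage {
	inputTokens: number
	outputTokens: number
	cacheReadTokens?: number
	reasoningTokens?: number
}

function normalizeUsage(usage: any): NormalizedUsage | undefined {
	if (!usage) return undefined

	// Prefer Responses API field names, fall back to legacy Chat Completions names.
	const inputTokens = usage.input_tokens ?? usage.prompt_tokens ?? 0
	const outputTokens = usage.output_tokens ?? usage.completion_tokens ?? 0

	// Cache reads: detailed token shape first, then a legacy flat field.
	const cacheReadTokens =
		usage.input_tokens_details?.cached_tokens ?? usage.cache_read_input_tokens

	// Reasoning tokens only appear in the detailed output shape.
	const reasoningTokens = usage.output_tokens_details?.reasoning_tokens

	return { inputTokens, outputTokens, cacheReadTokens, reasoningTokens }
}
```

The key property is that each field degrades gracefully: a detailed payload, a legacy payload, or a missing one all produce a usable result.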
Pull Request Overview
This PR fixes cache reporting functionality for the OpenAI-Native provider by implementing proper extraction and normalization of cache-related token usage from the Responses API.
- Added `normalizeUsage` method to properly extract cache token information from various response formats
- Enhanced fallback handling for compatibility across different API versions and transport methods
- Improved cost calculation to use uncached input tokens and avoid double-counting
Thank you for your contribution! I've reviewed the changes and found that the implementation correctly addresses the cache reporting issue. The fallback patterns and backward compatibility are well handled. I have a few suggestions inline that could improve the implementation.
- Add fallback to derive total input tokens from details when totals are missing
- Remove unused `convertToOpenAiMessages` import
- Add comment explaining cost calculation alignment with Gemini provider
- Add comprehensive test coverage for `normalizeUsage` method covering:
  - Detailed token shapes with cached/miss tokens
  - Legacy field names and SSE-only events
  - Edge cases including missing totals with details-only
  - Cost calculation with uncached input tokens
- Remove incorrect fallback to `missFromDetails` for cache write tokens
- Fix cost calculation to pass total input tokens (`calculateApiCostOpenAI` handles subtraction)
- Improve readability by extracting cache detail checks to intermediate variables
- Remove redundant `?? undefined`
- Update tests to reflect correct behavior (miss tokens are not cache writes)
- Add clarifying comments about cache miss vs cache write tokens
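The "pass total input tokens" fix above hinges on where the cached-token subtraction happens. A hedged sketch of that division of responsibility (the pricing shape and function body are assumptions for illustration; only the name `calculateApiCostOpenAI` comes from the PR):

```typescript
// Assumed pricing shape, prices in dollars per 1M tokens.
interface ModelPricing {
	inputPrice: number // uncached input tokens
	outputPrice: number // output tokens
	cacheReadsPrice: number // cache-read tokens (discounted)
}

// Sketch: the cost function itself subtracts cached tokens, so callers pass
// TOTAL input tokens. Pre-subtracting at the call site would double-count
// the cache discount, which is the bug the review comment fixed.
function calculateApiCostOpenAI(
	pricing: ModelPricing,
	inputTokens: number, // total, including cached
	outputTokens: number,
	cacheReadTokens = 0,
): number {
	const uncachedInput = Math.max(0, inputTokens - cacheReadTokens)
	return (
		(uncachedInput * pricing.inputPrice +
			cacheReadTokens * pricing.cacheReadsPrice +
			outputTokens * pricing.outputPrice) /
		1_000_000
	)
}
```

For example, with 1M total input tokens of which 400k were cache reads, only 600k are billed at the full input rate.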
daniel-lxs left a comment
Looks good!
Description
This PR fixes an issue where caching information was not being reported when using the OpenAI-Native provider with the Responses API.
Problem
The OpenAI-Native provider was not properly extracting and reporting cache token usage from the Responses API, which meant users couldn't see when their requests were using cached content or how many tokens were being cached.
Solution
Added a `normalizeUsage` method that properly extracts cache-related token information from the Responses API response, with support for both detailed token shapes and legacy field names for compatibility.

Key Changes:
- `normalizeUsage` method to extract and normalize usage data from various response formats
- Support for detailed token shapes (`input_tokens_details`, `output_tokens_details`) when available

Why Fallbacks Are Needed
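To make "detailed token shapes" vs. "legacy field names" concrete, here is an illustrative pair of payload shapes the normalizer must accept, with a small type guard. The interface names and the guard are assumptions for illustration; the field names are the ones the PR description references.

```typescript
// Detailed Responses API usage shape.
interface DetailedUsage {
	input_tokens: number
	output_tokens: number
	input_tokens_details?: { cached_tokens?: number }
	output_tokens_details?: { reasoning_tokens?: number }
}

// Older flat shape seen from legacy fields / SSE-only events.
interface LegacyUsage {
	prompt_tokens: number
	completion_tokens: number
	cache_read_input_tokens?: number
}

// Hypothetical type guard: distinguish the shapes before normalizing.
function isDetailed(u: DetailedUsage | LegacyUsage): u is DetailedUsage {
	return "input_tokens" in u
}
```

A normalizer branching on such a guard can then read each field from the correct location instead of relying on a chain of `??` fallbacks.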
Even though we exclusively use the Responses API, fallbacks are necessary because:
Testing
Impact
Users will now be able to see:
Fixes the previously unreported cache-reporting issue in the OpenAI-Native provider.
Important
Adds cache reporting support for the OpenAI-Native provider by implementing a `normalizeUsage` method to handle various response formats and cache-related fields.

- `normalizeUsage` method in `openai-native.ts` to extract and normalize cache-related token information from the Responses API.
- Support for both detailed token shapes (`input_tokens_details`, `output_tokens_details`) and legacy field names.
- Tests in `openai-native-usage.spec.ts` to verify handling of detailed token shapes, legacy fields, SSE events, and edge cases.

This description was created by
for 3e073f3.