@MasterJH5574
Contributor

This PR updates PagedKVCache to initialize one more page than the number specified via the constructor. The reason is that applications usually depend on the number of free pages (returned from `GetNumAvailablePages`) to decide the KV cache operation policy. Without this extra page, the KV cache reports no available pages even when the last allocated pages are not full, which may give applications the false impression that the KV cache is already completely full and cause further issues.
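
To illustrate the situation described above, here is a minimal toy model in Python. It is only a sketch and not the actual PagedKVCache code: the class `ToyPagedCache`, its methods, and the `extra_page` flag are hypothetical names made up for this example, and the model tracks a single sequence only.

```python
class ToyPagedCache:
    """Hypothetical toy model of a paged cache (NOT TVM's PagedKVCache)."""

    def __init__(self, num_pages, page_size, extra_page=False):
        self.page_size = page_size
        # Assumption for illustration: extra_page=True mimics this PR by
        # reserving one page beyond what the caller asked for.
        self.free_pages = num_pages + (1 if extra_page else 0)
        self.has_page = False          # whether any page has been allocated yet
        self.tokens_in_last_page = 0   # fill level of the newest page

    def append_token(self):
        # Take a new page only when there is none yet or the newest one is full.
        if not self.has_page or self.tokens_in_last_page == self.page_size:
            if self.free_pages == 0:
                raise RuntimeError("out of pages")
            self.free_pages -= 1
            self.has_page = True
            self.tokens_in_last_page = 0
        self.tokens_in_last_page += 1

    def num_available_pages(self):
        # Toy counterpart of the free-page count an application would query.
        return self.free_pages

    def free_slots_in_last_page(self):
        return self.page_size - self.tokens_in_last_page if self.has_page else 0


def fill(cache, num_tokens):
    for _ in range(num_tokens):
        cache.append_token()
    return cache


# Request 2 pages of 4 slots each, then append 5 tokens.
strict = fill(ToyPagedCache(num_pages=2, page_size=4), 5)
padded = fill(ToyPagedCache(num_pages=2, page_size=4, extra_page=True), 5)

print(strict.num_available_pages(), strict.free_slots_in_last_page())
print(padded.num_available_pages(), padded.free_slots_in_last_page())
```

In this toy setup, the cache without the extra page reports zero free pages although its last page still has room, so an application that looks only at the free-page count would treat it as full. With the extra page, the free-page count stays positive in the same state.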

@tqchen tqchen merged commit a7be540 into apache:main Apr 7, 2024