[Feature]: Chunked prefill + LoRA #4995

@rkooo567

🚀 The feature, motivation and pitch

Currently, LoRA doesn't work with chunked prefill because some of the LoRA index logic doesn't cover the case where sampling is not required. This also means LoRA doesn't work with sampling_params do_sample=True.

We need to add test cases for both. WIP: #4994
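
To make the failure mode concrete, here is a minimal sketch of the kind of index bookkeeping involved. It is not vLLM's actual code: `SeqChunk` and `build_lora_mappings` are hypothetical names, and the assumption is that a batch needs one LoRA index per input token plus one LoRA index per position that is actually sampled. A non-final prefill chunk contributes tokens but no sampled position, so logic that always adds one sampler entry per sequence would end up misaligned.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SeqChunk:
    """Hypothetical stand-in for one scheduled chunk of a sequence."""
    lora_id: int      # adapter used by this sequence (0 could mean "no LoRA")
    num_tokens: int   # tokens of this sequence processed in the current step
    do_sample: bool   # False for a prefill chunk that does not end the prompt


def build_lora_mappings(chunks: List[SeqChunk]) -> Tuple[List[int], List[int]]:
    """Build per-token and per-sampled-position LoRA index lists."""
    token_mapping: List[int] = []
    sampler_mapping: List[int] = []
    for chunk in chunks:
        # Every processed token needs a LoRA index for the linear layers.
        token_mapping.extend([chunk.lora_id] * chunk.num_tokens)
        # Only chunks that are sampled get an entry for the sampler/logits side.
        if chunk.do_sample:
            sampler_mapping.append(chunk.lora_id)
    return token_mapping, sampler_mapping


# A partial prefill chunk (no sampling) batched with a decode step (sampling).
batch = [
    SeqChunk(lora_id=1, num_tokens=4, do_sample=False),
    SeqChunk(lora_id=2, num_tokens=1, do_sample=True),
]
token_mapping, sampler_mapping = build_lora_mappings(batch)
```

A test along these lines could check that the sampler-side mapping has exactly one entry per sampled sequence, while the token-side mapping still covers every token in the batch.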

Alternatives

No response

Additional context

No response

Labels

feature request, stale