
Conversation


@Kotomi-Du Kotomi-Du commented Oct 3, 2025

Description

Recognize additional LLM models, specifically Phi-Silica models, so that the stateful-model transformation is triggered for them.

Motivation and Context

Combined with #830 and #831, these changes improve the memory footprint and performance of the Phi-Silica workload. Without them, the workload consumed 16 GB of memory and ran at 1 fps on the OVEP GPU backend; with them, memory usage drops to 3.7 GB and performance reaches 16 fps.

Open question

Should OVEP add a provider option to let users decide whether the model should be made stateful? If so, we would not need to hardcode input names for specific models.

@Kotomi-Du Kotomi-Du changed the base branch from master to ovep-develop October 3, 2025 00:24
@Kotomi-Du Kotomi-Du marked this pull request as draft October 3, 2025 00:26
if (gpu_or_npu) {
  prefill_use_full_chat_history = true;
}
// bool gpu_or_npu = ((device.find("NPU") != std::string::npos) || (device.find("GPU") != std::string::npos));

need to discuss with ORT-GenAI team how to handle this logic

@Kotomi-Du Kotomi-Du force-pushed the make_stateful_phisilica branch from 0fe0302 to 1e132f3 October 11, 2025 00:29
@Kotomi-Du Kotomi-Du marked this pull request as ready for review October 11, 2025 00:30