Simulated OpenVINO Backend for Testing Unmerged PR Features with Memory Profiling #21500
Summary of Changes
Hello @Mohamed-Ashraf273, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly advances the OpenVINO backend for Keras by implementing critical missing operations and introducing direct model export capabilities. The changes aim to improve the performance and memory efficiency of Keras models when using OpenVINO, addressing reported high memory usage issues, and enabling a more seamless workflow for deploying Keras models with OpenVINO.
Highlights
- Expanded OpenVINO Backend Operations: Implemented several previously unsupported Keras operations for the OpenVINO backend, including `slice_update`, `repeat`, `tri`, `tril`, `triu`, and `categorical` functions. This significantly broadens the range of Keras models and operations that can be executed efficiently with OpenVINO.
- Improved OpenVINO Tensor Handling: Enhanced the `OpenVINOKerasTensor` class to support conversion to NumPy arrays via `__array__` and `numpy()` methods, and enabled direct handling of `ov.Output` objects. This improves compatibility, debuggability, and integration within the Keras ecosystem.
- Direct OpenVINO Model Export: Introduced a new `export_openvino` utility and integrated it into the `keras.models.Model.export()` method. Keras models can now be directly exported to the OpenVINO Intermediate Representation (IR) format (`.xml` and `.bin`) for optimized inference on Intel hardware.
- Test Suite Adjustments and Progress: Updated the OpenVINO backend's test exclusion lists, enabling several previously skipped NumPy and core operations tests (e.g., `test_tri`, `test_repeat`, `test_slice_update`). New test files for the OpenVINO export functionality were added, demonstrating successful export and inference for various model types and input structures.
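For readers unfamiliar with one of the newly supported operations, here is a minimal pure-Python sketch of NumPy-style `repeat` semantics on a flat list. The helper name `repeat_1d` is hypothetical, and the code only illustrates the expected behaviour; it is not the OpenVINO graph implementation from this PR.

```python
def repeat_1d(values, repeats):
    """Repeat each element of a flat list, NumPy-style.

    `repeats` is either one int applied to every element or a
    sequence giving a separate count per element.
    """
    if isinstance(repeats, int):
        repeats = [repeats] * len(values)
    if len(repeats) != len(values):
        raise ValueError("repeats must match the number of elements")
    out = []
    for value, count in zip(values, repeats):
        out.extend([value] * count)
    return out


print(repeat_1d([1, 2, 3], 2))          # same count for every element
print(repeat_1d([1, 2, 3], [1, 0, 2]))  # per-element counts
```

A backend implementation additionally has to handle an `axis` argument and tensors of higher rank, which this sketch leaves out.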
Code Review
This pull request enhances the OpenVINO backend by implementing previously unsupported operations and adding a new feature to export Keras models to the OpenVINO IR format. The code is generally of high quality. My review focuses on improving maintainability by suggesting refactoring for complex functions and duplicated code, fixing a minor issue in a test, and removing non-source files from the PR. Additionally, I've highlighted some tests that have been excluded, which may indicate areas needing further attention.
testing files/gemma_test.txt
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-8.4.0, pluggy-1.6.0 -- /home/mohamed-ashraf/Desktop/GSoC2025/env/bin/python
cachedir: .pytest_cache
rootdir: /home/mohamed-ashraf/Desktop/GSoC2025/keras-hub
configfile: pytest.ini
plugins: cov-6.1.1
collecting ... collected 15 items

keras_hub/src/models/gemma/gemma_causal_lm_test.py::TestCase::test_session SKIPPED [ 6%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_all_presets SKIPPED [ 13%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_cache_correctness SKIPPED [ 20%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_causal_lm_basics SKIPPED [ 26%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_early_stopping PASSED [ 33%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_flash_attention_call SKIPPED [ 40%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_generate PASSED [ 46%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_generate_compilation PASSED [ 53%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_generate_with_bfloat16 PASSED [ 60%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_multitoken_stopping PASSED [ 66%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_saved_model SKIPPED [ 73%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_score_layer_intercept_fn_exfiltration PASSED [ 80%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_score_logits PASSED [ 86%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_score_loss SKIPPED [ 93%]
keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_session PASSED [100%]

=============================== warnings summary ===============================
../../../../../usr/lib/python3.12/multiprocessing/popen_fork.py:66
../../../../../usr/lib/python3.12/multiprocessing/popen_fork.py:66
  /usr/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=450338) is multi-threaded, use of fork() may lead to deadlocks in the child.
    self.pid = os.fork()

../env/lib/python3.12/site-packages/openvino/runtime/__init__.py:10
  /home/mohamed-ashraf/Desktop/GSoC2025/env/lib/python3.12/site-packages/openvino/runtime/__init__.py:10: DeprecationWarning: The `openvino.runtime` module is deprecated and will be removed in the 2026.0 release. Please replace `openvino.runtime` with `openvino`.
    warnings.warn(

../env/lib/python3.12/site-packages/_pytest/config/__init__.py:1474
  /home/mohamed-ashraf/Desktop/GSoC2025/env/lib/python3.12/site-packages/_pytest/config/__init__.py:1474: PytestConfigWarning: Unknown config option: env

    self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

../env/lib/python3.12/site-packages/google/protobuf/internal/well_known_types.py:91
  /home/mohamed-ashraf/Desktop/GSoC2025/env/lib/python3.12/site-packages/google/protobuf/internal/well_known_types.py:91: DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.fromtimestamp(timestamp, datetime.UTC).
    _EPOCH_DATETIME_NAIVE = datetime.datetime.utcfromtimestamp(0)

keras_hub/src/models/gemma/gemma_causal_lm_test.py::GemmaCausalLMTest::test_session
  /usr/lib/python3.12/unittest/case.py:690: DeprecationWarning: It is deprecated to return a value that is not None from a test case (<bound method TensorFlowTestCase.test_session of <keras_hub.src.models.gemma.gemma_causal_lm_test.GemmaCausalLMTest testMethod=test_session>>)
    return self.run(*args, **kwds)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================== 8 passed, 7 skipped, 6 warnings in 11.34s ===================
@@ -815,10 +842,152 @@ def prepare_slice_index(val):


def slice_update(inputs, start_indices, updates):
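As a reminder of what this signature is expected to do, the sketch below writes `updates` into a copy of `inputs` starting at `start_indices`, restricted to 2-D nested lists. The helper name `slice_update_2d` is hypothetical, and the sketch assumes the update block fits inside the input; it does not reproduce the OpenVINO implementation under review or its handling of out-of-range indices.

```python
import copy


def slice_update_2d(inputs, start_indices, updates):
    """Return a copy of `inputs` with `updates` written at `start_indices`."""
    row0, col0 = start_indices
    out = copy.deepcopy(inputs)  # leave the caller's data untouched
    for i, row in enumerate(updates):
        for j, value in enumerate(row):
            out[row0 + i][col0 + j] = value
    return out


base = [[0] * 4 for _ in range(3)]
patched = slice_update_2d(base, (1, 2), [[7, 8]])
print(patched)
```

Returning a new value instead of mutating the input mirrors the functional style of tensor backends, where the result is a new node in the graph.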
keras/src/backend/openvino/numpy.py
x = get_ov_output(x)
ov_type = x.get_element_type()
shape = ov_opset.shape_of(x, Type.i32)
zero_const = ov_opset.constant(0, Type.i32)
minus2 = ov_opset.constant([-2], Type.i32)
minus1 = ov_opset.constant([-1], Type.i32)
M = ov_opset.squeeze(ov_opset.gather(shape, minus2, zero_const), zero_const)
N = ov_opset.squeeze(ov_opset.gather(shape, minus1, zero_const), zero_const)
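The snippet above reads the last two dimensions of the input as `M` and `N`. A triangular operation such as `tril` or `triu` then keeps element `(i, j)` depending on how `j` compares with `i + k`. The following pure-Python sketch shows that masking rule for a single 2-D matrix, following NumPy's conventions; the helper names are hypothetical and the code does not reproduce the OpenVINO graph built in this PR.

```python
def tril_2d(matrix, k=0):
    """Keep entries on or below the k-th diagonal, zero the rest."""
    return [
        [value if j <= i + k else 0 for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


def triu_2d(matrix, k=0):
    """Keep entries on or above the k-th diagonal, zero the rest."""
    return [
        [value if j >= i + k else 0 for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(tril_2d(m))
print(triu_2d(m, k=1))
```

For batched inputs the same comparison is applied to the last two axes, which is why only `shape[-2]` and `shape[-1]` are needed.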
keras/src/export/openvino_test.py
class TwoInputsModel(models.Model):
    def call(self, x, y):
        return x + y

    def build(self, y_shape, x_shape):
        self.built = True
@@ -1,6 +1,5 @@
NumPyTestRot90
NumpyArrayCreateOpsCorrectnessTest::test_eye
NumpyArrayCreateOpsCorrectnessTest::test_tri
NumpyDtypeTest::test_absolute_bool
@@ -32,7 +32,7 @@ keras/src/ops/linalg_test.py
keras/src/ops/nn_test.py
keras/src/optimizers
keras/src/quantizers
keras/src/random
keras/src/random/seed_generator_test.py
CoreOpsCallsTests::test_switch_basic_call
CoreOpsCallsTests::test_unstack_basic_functionality
CoreOpsCorrectnessTest::test_associative_scan
CoreOpsCorrectnessTest::test_cond
CoreOpsCorrectnessTest::test_dynamic_slice
CoreOpsCorrectnessTest::test_fori_loop
CoreOpsCorrectnessTest::test_map
CoreOpsCorrectnessTest::test_scan
CoreOpsCorrectnessTest::test_scatter
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
## master #21500 +/- ##
==========================================
- Coverage 82.87% 82.72% -0.15%
==========================================
Files 567 567
Lines 56073 56311 +238
Branches 8756 8800 +44
==========================================
+ Hits 46470 46585 +115
- Misses 7459 7566 +107
- Partials 2144 2160 +16
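The figures in this table can be cross-checked with a few lines of arithmetic. The sketch below assumes that project coverage is hits divided by total lines and that percentages are truncated (not rounded) to two decimals, which is what reproduces the reported values; the function name is hypothetical and this is not Codecov's own code.

```python
def coverage_bp(hits, lines):
    """Coverage in basis points (hundredths of a percent), truncated."""
    return hits * 10000 // lines


base = {"lines": 56073, "hits": 46470, "misses": 7459, "partials": 2144}
head = {"lines": 56311, "hits": 46585, "misses": 7566, "partials": 2160}

for report in (base, head):
    # Every line is counted exactly once as a hit, a miss, or a partial.
    total = report["hits"] + report["misses"] + report["partials"]
    assert total == report["lines"]

base_bp = coverage_bp(base["hits"], base["lines"])
head_bp = coverage_bp(head["hits"], head["lines"])
delta_bp = head_bp - base_bp

print(f"base  {base_bp / 100:.2f}%")
print(f"head  {head_bp / 100:.2f}%")
print(f"delta {delta_bp / 100:+.2f}%")
```

Integer arithmetic in basis points avoids floating-point surprises when comparing two percentages that differ by only a fraction of a point.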
4327cc8 to 6855d8b
9217787 to 930690a
930690a to 53d7143
Operating System: Ubuntu 22.04 (LTS)
Device used for inference: CPU
OpenVINO installation: PyPI
Programming Language: Python
Hardware Architecture: x86 (64 bits)
Model used: GPT-2
Model quantization: No
Performance issue description
During my GSoC project, I ran into the following issue:
Running the generate step with the OpenVINO backend results in very high memory usage, and the cause is not yet clear. This was observed with the changes from these PRs:
Keras: #21491
Keras_hub: keras-team/keras-hub#2310
for OpenVINO the model is being serialized with size:
for
for OpenVINO the model is being serialized with size:
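Independently of the figures in the report, one way to quantify this kind of memory growth is to compare the process's peak resident set size before and after a call. The sketch below uses Python's standard `resource` module, which is available on Unix-like systems such as the Ubuntu setup described here. The names `peak_rss_mib`, `measure_peak_growth`, and `fake_generate` are hypothetical, and `fake_generate` is only a stand-in for a real generate step.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure_peak_growth(fn, *args, **kwargs):
    """Run `fn` and return (result, growth of peak RSS in MiB)."""
    before = peak_rss_mib()
    result = fn(*args, **kwargs)
    return result, peak_rss_mib() - before


def fake_generate(n_bytes):
    # Hypothetical stand-in for a model's generate step.
    buffer = bytearray(n_bytes)
    return len(buffer)


result, growth = measure_peak_growth(fake_generate, 64 * 1024 * 1024)
print(f"returned {result}, peak RSS grew by about {growth:.1f} MiB")
```

Because the peak is a high-water mark for the whole process, it also captures allocations made by native libraries, which a Python-level tracer such as `tracemalloc` would not see. It never decreases, so each measurement is best taken in a fresh process.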
Step-by-step reproduction
using these PRs:
Keras: #21491
Keras_hub: keras-team/keras-hub#2310
run that code:
Issue submission checklist