Major Features
Intel® Extension for OpenXLA* is an Intel-optimized PyPI package that extends the official OpenXLA framework to Intel GPUs. Built on the PJRT plugin mechanism, it enables seamless execution of JAX models on Intel® Data Center GPU Max Series.
This release contains the following major features:
JAX Upgrade:
- Upgraded JAX to v0.5.0, ensuring compatibility between jax and jaxlib.
- Added scale-up support for Single Program, Multiple Data (SPMD) execution using Intel® oneAPI Collective Communications Library, enabling multi-device distributed training and inference across Intel GPUs within a single host.
- For details on JAX and jaxlib versioning, refer to: How are jax and jaxlib versioned?
| intel-extension-for-openxla | jaxlib | jax |
| --- | --- | --- |
| 0.7.0 | 0.5.0 | 0.5.0 |
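The version pairing above can be checked in an environment with a short script. The following is a minimal sketch, not an official tool: the helper names `find_mismatches` and `installed_versions` are hypothetical, and it assumes that the installed packages report plain version strings (a build with a local suffix would show up as a mismatch under this exact comparison).

```python
from importlib import metadata

# Version row from this release: extension 0.7.0 pairs with jaxlib 0.5.0 and jax 0.5.0.
EXPECTED = {
    "intel-extension-for-openxla": "0.7.0",
    "jaxlib": "0.5.0",
    "jax": "0.5.0",
}


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: (installed, expected)} for every package that differs.

    A package missing from `installed` is reported with None as its version.
    """
    return {
        name: (installed.get(name), want)
        for name, want in expected.items()
        if installed.get(name) != want
    }


def installed_versions(names=EXPECTED):
    """Look up installed versions; None when a package is not installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found
```

Calling `find_mismatches(installed_versions())` returns an empty dict when all three packages match this release, and otherwise lists each package with its installed and expected version.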
Toolkit & Driver Support:
- Intel® oneAPI Base Toolkit 2025.2.1 support added.
- Upgraded driver: supports LTS release 2523.31.
Library & Compatibility Enhancements:
- oneDNN v3.7 support added.
- Supports Python versions: 3.10, 3.11, 3.12, 3.13.
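As a small illustration of the supported Python versions, the sketch below checks an interpreter version against the list above and flags the Python 3.13 limitation noted under Known Caveats for models that depend on TensorFlow-Text. The function name `python_support` and the result keys are hypothetical, chosen only for this example.

```python
import sys

# Python versions listed as supported in this release.
SUPPORTED_PYTHONS = {(3, 10), (3, 11), (3, 12), (3, 13)}
# Per Known Caveats: TensorFlow-Text (needed by Flan T5 and Gemma) lacks Python 3.13 support.
NO_TENSORFLOW_TEXT = {(3, 13)}


def python_support(version=None):
    """Report support status for a (major, minor, ...) version tuple.

    Defaults to the running interpreter when no version is given.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    supported = major_minor in SUPPORTED_PYTHONS
    return {
        "supported": supported,
        "tensorflow_text_models": supported and major_minor not in NO_TENSORFLOW_TEXT,
    }
```

For example, `python_support((3, 13))` reports the interpreter as supported but marks TensorFlow-Text dependent models as unavailable.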
Known Caveats
- Flan T5 and Gemma models depend on TensorFlow-Text, which doesn't support Python 3.13.
- The Multi-process API is introduced for the first time in this release. As this is an initial integration, some unit tests and models may fail at higher tile counts. These issues are known and will be addressed in future releases. If you encounter failures in your workflow, please open a GitHub issue.
- Known model failures: GPT-J inference on 4 Tiles (single tile per GPU), Flan-T5 XL-3B inference, Gemma-7B fine-tuning on 8 Tiles (single tile per GPU)
- The following JAX unit tests (UTs) must be skipped when using Intel Extension for OpenXLA:
  - Mock GPU Tests: mock_gpu_test & mock_gpu_topology_test (SYCL device not supported)
  - Pallas Tests: gpu_ops_test, pallas_shape_poly_test, pallas_vmap_test (Pallas calls are not currently supported for the SYCL backend)
  - Profiling Tests: pgle_test (SYCL device not supported in TensorFlow profiling APIs)
  - FFI Tests (JAXPR to MLIR lowering rule is presently missing for the SYCL backend)
  - BCOOTest failure: A UT in the test file sparse_bcoo_bcsr_test.py (test_bcoo_mul_sparse5) fails with rolling driver version 2507.12 due to a known issue.
  - Memories Tests & Layout Tests: host offloading is not supported on Intel GPUs
  - Certain UTs in pjit_test, pmap_test, shard_map_test, shard_alike_test, array_test
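One way to apply the skip list when running the JAX test suite with pytest is to generate the command-line arguments from it. This is a minimal sketch under stated assumptions: the helper `pytest_skip_args` is hypothetical, each listed module is assumed to live in a file named `<module>.py` somewhere under the directory pytest is run from, and only entries with an explicit module or test name are covered (the FFI, Memories, Layout, and "certain UTs" entries are left out because no exact names are given).

```python
# Module names copied from the skip list above (Mock GPU, Pallas, Profiling tests).
SKIPPED_MODULES = [
    "mock_gpu_test",
    "mock_gpu_topology_test",
    "gpu_ops_test",
    "pallas_shape_poly_test",
    "pallas_vmap_test",
    "pgle_test",
]
# Individual test named in the BCOOTest entry.
SKIPPED_TESTS = ["test_bcoo_mul_sparse5"]


def pytest_skip_args(modules=SKIPPED_MODULES, tests=SKIPPED_TESTS):
    """Build pytest arguments that skip whole modules and individual tests.

    Modules are excluded with --ignore-glob (assumed file name: <module>.py);
    individual tests are excluded with a -k expression matched on test names.
    """
    args = [f"--ignore-glob=*/{name}.py" for name in modules]
    if tests:
        args += ["-k", " and ".join(f"not {name}" for name in tests)]
    return args
```

The returned list can be passed to `pytest.main(...)` or appended to a `pytest` command line. The `-k` filter matches on substrings of test names, so parameterized variants of `test_bcoo_mul_sparse5` are excluded as well.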
Deprecations
- JAX v0.4.38 is no longer supported.
- Refer to the JAX change log for migration steps.
- If your application requires JAX v0.4.38, downgrade the Intel Extension for OpenXLA version to v0.6.0.
- Intel® Data Center GPU Flex Series is no longer supported.
- Intel® Arc™ B-Series Graphics is not formally validated. Please file a GitHub issue if support is needed.
Documentation
- Introduction to Intel® Extension for OpenXLA*
- Accelerating JAX models on Intel GPUs via PJRT
- How JAX and OpenXLA Enabled an Argonne Workload and Quality Assurance on Aurora Supercomputer
- JAX and OpenXLA - Part 1: Execution Process & Underlying Logic
- JAX and OpenXLA - Part 2: Execution Process & Underlying Logic
- How are jax and jaxlib versioned?