Repository for exploratory work on building and solving multiple linear systems in batch mode with Julia and GPU acceleration, in the context of optimization pipelines.
We consider a batch of parametric optimization problems, one per parameter vector $\theta_k$, $k = 1,\dots,K$, each with primal iterate $x_k$ and multipliers $y_k$.

At each Newton / interior-point iteration, the step $(\Delta x_k, \Delta y_k)$ is obtained by solving a KKT linear system of the form

$$
\begin{bmatrix} W_k + D_k^{-1} & J_k^\top \\ J_k & 0 \end{bmatrix}
\begin{bmatrix} \Delta x_k \\ \Delta y_k \end{bmatrix}
=
\begin{bmatrix} -\,b_k \\ h_k \end{bmatrix},
$$
where, for each $k$:

- $W_k \;=\; \nabla^2_{xx} f_{\theta_k}(x_k) \;+\; \sum_i y^i_k \, \nabla^2_{xx} c^i_{\theta_k}(x_k)$ (Lagrangian Hessian),
- $J_k \;=\; \nabla_x c_{\theta_k}(x_k)$ (constraint Jacobian),
- $b_k \;=\; \nabla_x f_{\theta_k}(x_k) \;+\; \nabla_x c_{\theta_k}(x_k)^\top y_k$ (Lagrangian gradient),
- $h_k \;=\; -\,c_{\theta_k}(x_k)$,
- $D_k \;=\; \operatorname{diag}(x_k)\,\operatorname{diag}(y_k)^{-1}$.
- $W_k$ and $J_k$ are sparse.
- Across the batch, problems share the same sparsity pattern / expression graph (parameters change, structure doesn't).
- This enables batched GPU solves and reuse of symbolic analysis / factorization patterns.
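As an illustration of this reuse, here is a minimal sketch (not the repository's benchmark code) built on CUDSS.jl's `CudssSolver` / `cudss` / `cudss_set` interface. The names `solve_batch` and `assemble_kkt` are hypothetical; `assemble_kkt` is assumed to return problem $k$'s KKT matrix as a `CuSparseMatrixCSR` with the shared sparsity pattern.

```julia
# Minimal sketch: one symbolic analysis shared by the whole batch, since every
# KKT matrix has the same sparsity pattern; only the numerical values change.
# `assemble_kkt` is a hypothetical helper returning a CuSparseMatrixCSR.
using CUDA, CUDA.CUSPARSE, CUDSS

function solve_batch(θs, rhs)
    K1 = assemble_kkt(θs[1])                 # first problem fixes the (shared) sparsity pattern
    x  = similar(rhs[1])
    solver = CudssSolver(K1, "G", 'F')       # general matrix, full storage
    cudss("analysis", solver, x, rhs[1])     # symbolic analysis, performed once for the batch
    solutions = similar(rhs)
    for k in eachindex(θs)
        cudss_set(solver, assemble_kkt(θs[k]))    # swap in problem k's values, same pattern
        cudss("factorization", solver, x, rhs[k]) # numerical factorization only
        cudss("solve", solver, x, rhs[k])
        solutions[k] = copy(x)
    end
    return solutions
end
```

The benchmarks in `LinearSolverBenchmarks` evaluate variations of this pattern, including different CUDSS batch settings, at larger scale.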
In sensitivity analysis, the left-hand side (KKT matrix) remains the same as above, but the right-hand side changes to encode derivatives with respect to problem parameters.
Formally, sensitivities follow from the Implicit Function Theorem applied to the KKT system. For a single problem (dropping the batch index $k$), write the KKT residual as $F(y, p) = 0$, where $y(p)$ is the solution and $p$ the parameters; differentiating gives

$$
M \,\nabla_p y(p) \;=\; -N,
$$

where:

- $M$ is the KKT Jacobian (same block matrix structure as above),
- $N$ is the Jacobian of the KKT system w.r.t. the parameters $p$,
- $\nabla_p y(p)$ gives the solution sensitivities.
This means:

- the factorization of $M$ can be reused,
- only the RHS $N$ changes when computing sensitivities for different parameters,
- forward mode (directional sensitivities) and reverse mode (adjoint sensitivities) both reduce to solving new linear systems with the same LHS but different RHS vectors.
- In optimization pipelines with parameters (e.g., demand in power systems, risk limits in portfolios, targets in robotics), this reuse dramatically reduces cost.
- Computing solution sensitivities is key to embedding optimization layers inside machine learning and control systems.
- Batched GPU linear algebra makes both the primal Newton steps and sensitivity solves efficient.
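As a concrete (CPU-side) illustration of this reuse, the sketch below factorizes $M$ once with SparseArrays' sparse LU and reuses the factorization for both forward and adjoint sensitivity solves. The function name, the dense `N`, and the seed vector `w` are illustrative assumptions, not the repository's GPU implementation.

```julia
# Sketch of factorization reuse for sensitivities (CPU, SparseArrays/UMFPACK).
# M: KKT Jacobian, N: Jacobian of the KKT residual w.r.t. the parameters p,
# w: adjoint seed on the solution (e.g., the gradient of a downstream loss).
using SparseArrays, LinearAlgebra

function sensitivities(M::SparseMatrixCSC, N::Matrix, w::Vector)
    F = lu(M)                   # factorize the KKT Jacobian once
    dy_dp  = F \ (-N)           # forward mode: ∇ₚy(p) = -M⁻¹N, one solve per column of N
    grad_p = -N' * (F' \ w)     # reverse mode: ∇ₚ(wᵀy(p)), one transposed solve, same factorization
    return dy_dp, grad_p
end
```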
This repository contains code developed during the PSC/CMU/Pitt Open Hackathon 2025. The goal was to explore efficient methods for solving multiple linear systems simultaneously using GPU acceleration.
The repository is organized as follows:

- `LinearSolverDatasets` folder: contains the scripts to generate datasets for testing the batch linear solves.
- `LinearSolverBenchmarks` folder: contains the benchmarking scripts to evaluate the performance of different batch linear solver settings in CUDSS.
- `benchmark_i` folder: contains the results of the i-th benchmark.
All the instructions below assume you are on a UNIX system and have cloned this repository. Unless specified otherwise, all commands are executed from the root of the repository.
- Download and install Julia on your machine.

  💡 If you are on a PACE cluster, simply execute `module load julia`.

- Install the dependencies: `julia --project=. -e 'using Pkg; Pkg.instantiate()'`

  ⚠️ This command builds Julia's CUDA wrapper, which should be done on a CUDA-compatible system. If you are on a PACE cluster, be sure to execute it on a GPU node. ⚠️
- Make sure you have `uv` installed.

  💡 If you are on a PACE cluster, simply execute `module load uv`.
- Create a `data` folder at the root of this repository: `mkdir data`

  ⚠️ The benchmark datasets take up a few GBs of disk space; this can be an issue if you have a limited storage quota. ⚠️

  💡 If you are on a PACE cluster, you can execute the following commands instead: `mkdir ~/scratch/LinearSystemsDataset` followed by `ln -s ~/scratch/LinearSystemsDataset data`
- Download the linear systems dataset: `uv run download.py`

  This will download the datasets and store them in the `data` folder. If you'd like to download only one of the datasets, pass either `--madnlp` or `--diffopt` to pick which one.
- Make sure you have the NSight profiler installed (check with `which nsys`).

  💡 `nsys` comes pre-installed on PACE's GPU nodes.

- Profile a benchmark run: `nsys profile julia --project=. LinearSolverBenchmarks/benchmark_0/run_linear_solve_benchmark.jl --dataset LinearSolverDatasets/DiffOpt/ac_dataset_pglib_opf_case300_ieee.jld2 --precision double --nvtx`

- Download the `.nsys-rep` files and open them with NSight Systems.
To benchmark the performance of various linear solvers, run

`julia --project=. LinearSolverBenchmarks/benchmark_0/run_linear_solve_benchmark.jl --dataset LinearSolverDatasets/DiffOpt/ac_dataset_pglib_opf_case300_ieee.jld2 --precision double`
DiffOpt.jl is a Julia package for differentiable optimization, allowing users to integrate optimization problems into machine learning workflows. It provides tools for calculating gradients through optimization problems, making it easier to incorporate them into end-to-end learning systems.
The newest version of `DiffOpt.jl` is not yet registered in the Julia package registry. To install it, you can use the following commands in Julia (after you have already activated and instantiated the environment):

`using Pkg`
`Pkg.add(url="https://github.com/jump-dev/DiffOpt.jl", rev="master")`
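Before extraction, the model must have been solved through the DiffOpt wrapper and forward-differentiated. Below is a minimal, hedged sketch of that workflow: the toy model and the choice of Ipopt are illustrative assumptions, while the `DiffOpt.set_forward_parameter` / `DiffOpt.forward_differentiate!` calls are the ones the helper below requires.

```julia
# Illustrative only: a tiny parametric NLP solved through the DiffOpt wrapper,
# then forward-differentiated with respect to its parameter.
using JuMP, DiffOpt, Ipopt

model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p in Parameter(1.0))        # problem parameter
@variable(model, x)
@constraint(model, exp(x) >= p)              # nonlinear constraint involving the parameter
@objective(model, Min, x)
optimize!(model)                             # x* = log(p)

DiffOpt.set_forward_parameter(model, p, 1.0) # seed the direction Δp = 1
DiffOpt.forward_differentiate!(model)        # solves the sensitivity system M Δy = -N Δp
dx = MOI.get(model, DiffOpt.ForwardVariablePrimal(), x)  # ≈ ∂x/∂p = 1/p
```

Once this has run, the model carries the data needed to assemble the sensitivity matrices M and N.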
To extract the sensitivity system matrices using `DiffOpt.jl`, you can use the auxiliary function `extract_sensitivity_system` (in `LinearSolverDatasets/DiffOpt/diffopt_utils.jl`):
"""
extract_sensitivity_system(model::JuMP.Model)
Extracts the sensitivity system matrices M and N from a JuMP model that uses DiffOpt.
These matrices can be used to analyze how changes in parameters affect the solution of the optimization problem.
Requires that the model is feasible and nonlinear, that it has been solved with the DiffOpt wrapper, and that the user has called:
- `DiffOpt.set_forward_parameter`
- `DiffOpt.forward_differentiate!(model)`
# Steps
1. Access the underlying DiffOpt model and its cache: `model.moi_backend.diff.model`.
2. Structure the solution and bounds using DiffOpt's internal function: `_compute_solution_and_bounds`.
3. Build the sensitivity matrices M and N using DiffOpt's internal function: `_build_sensitivity_matrices`.
"""
function extract_sensitivity_system(model::JuMP.Model)