AURA is a sound and precise abstract interpretation framework for certifying bounds on posterior distributions in probabilistic programs—even under data uncertainty.
See our paper for details:
Huang, Zixin; Laurel, Jacob; Misailovic, Sasa; Dutta, Saikat. Precise Abstract Interpretation of Probabilistic Programs with Interval Data Uncertainty. SAS 2025.
- Abstract Interpretation Framework
- Gradient-Based Abstract Transformer Engine
- Provably Sound and Precise Interval Bound Computation (see paper)
- Parallel and GPU-Accelerated Execution
- Benchmarking Scripts
The AURA source code consists of two main components:
- Stan-to-PyTorch Compiler (Java): converts Stan programs into differentiable PyTorch modules. This part builds on existing work: Storm-IR (FSE 19) and AQUA (ATVA 21).
- Abstract Interpretation with Interval Data Uncertainty (Python): performs gradient-based interval analysis and posterior certification using PyTorch.
```
AURA/
├── src/                          # [Component 1] Java-based Stan-to-PyTorch compiler
│   └── main/
│       ├── java/aqua/            # Java source code for AST parsing and model translation
│       └── resources/
├── antlr-4.7.1-complete.jar      # ANTLR for parser generation
├── lib/                          # Dependency: Storm-IR (FSE 19)
├── pom.xml                       # Maven build file
├── python_models/                # [Component 2] PyTorch-based analysis framework
│   ├── param_optimizer.py        # Gradient-based interval analysis module
│   ├── utils.py
│   ├── driver.py                 # Main entry point for running analysis
│   ├── ...                       # Other generated Python files, one per benchmark model
│   └── run_all.sh                # Script for batch-running benchmarks
├── benchmarks/                   # Benchmark programs and evaluation scripts
│   ├── benchmarks.txt            # Benchmark list
│   └── stan_bench/               # Stan programs used in evaluation
└── README.md                     # Project documentation
```
Prerequisites:

- Java: check the installation with `java -version`. If it is not installed, on Ubuntu you may try `sudo apt -y update; sudo apt install openjdk-8-jdk`.
- Maven: check the installation with `mvn --version`. If it is not installed, on Ubuntu you may try `sudo apt -y update; sudo apt install maven`.
- PyTorch and SciPy: `pip install torch scipy`
Install dependencies and build AURA:

- In the root directory of this repo, run:

```
mvn package -DskipTests=true
```

At the end it should print `BUILD SUCCESS`.

- Then run the compiler:

```
java -cp "target/aqua-1.0.jar:lib/storm-1.0.jar" aqua.PytorchCompilerMain
```
This will convert all the benchmarks under `AURA/benchmarks/`.
To convert new benchmarks, follow the format of the existing benchmarks and modify `src/main/java/aqua/PytorchCompilerMain.java` to add a new entry.
After running this step, you should see the converted Python files for all the benchmarks under `AURA/python_models/`.
Note that the generated Python files under `python_models/` only contain the logic to compute the log density of the unnormalized posterior. The module for finding the sound bounds is defined in `python_models/driver.py` and `python_models/param_optimizer.py`.
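To make this division of labor concrete: a generated module exposes the unnormalized log density, and the driver/optimizer then bound it over intervals of each latent parameter. The toy sketch below is purely illustrative (a naive grid enumeration over a 1-D Gaussian log density; AURA's actual analysis uses sound gradient-based abstract transformers in PyTorch, not this loop):

```python
def log_density(mu, x=0.5, sigma=1.0):
    # Unnormalized log N(x; mu, sigma) -- stands in for a generated model module
    return -0.5 * ((x - mu) / sigma) ** 2

def naive_interval_bound(lo, hi, n=1000):
    # Evaluate on a dense grid over [lo, hi]; AURA instead derives provably
    # sound bounds, so this enumeration is illustration only
    vals = [log_density(lo + (hi - lo) * k / n) for k in range(n + 1)]
    return min(vals), max(vals)

# Bound the log density over the interval mu in [0, 1]
lb, ub = naive_interval_bound(0.0, 1.0)
# The density peaks at mu = x = 0.5, so ub is 0.0 and lb is -0.125
```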
We provide a script, `AURA/python_models/run_all.sh`, to find the posterior bounds for all the benchmark models:

```
cd python_models
./run_all.sh accu
```
This will compute bounds for each latent parameter in the models and print them to the console.
Too much terminal output? Don't worry: all results are automatically saved under `./results_cr`.
For example, if the benchmark model `lightspeed` has a latent parameter `beta[0]`, the following output file will be generated: `lightspeed_param_beta[0]_cuda:0_200_iter_100.csv`.
This CSV file contains three rows:

- X-axis splits: the interval boundaries x_0, x_1, ..., x_{n+1}. (Note: AURA uses uneven splits to improve analysis accuracy.)
- Lower bounds: one value per interval, l_0, l_1, ..., l_n, where l_i is the lower bound of the posterior over the interval [x_i, x_{i+1}].
- Upper bounds: one value per interval, u_0, u_1, ..., u_n, giving the upper bound over the same intervals.

In other words, the posterior bound over each interval [x_i, x_{i+1}] is [l_i, u_i].
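For downstream processing, the three-row layout can be parsed back into per-interval bounds. A minimal sketch (the helper name and the toy data below are ours, not part of AURA):

```python
import csv
import io

def parse_bounds_csv(text):
    """Parse the 3-row CSV: splits x_0..x_{n+1}, lowers l_0..l_n, uppers u_0..u_n."""
    rows = list(csv.reader(io.StringIO(text)))
    splits = [float(v) for v in rows[0]]
    lowers = [float(v) for v in rows[1]]
    uppers = [float(v) for v in rows[2]]
    # n+2 boundaries delimit n+1 intervals, one bound pair per interval
    assert len(splits) == len(lowers) + 1 == len(uppers) + 1
    # Pair each interval [x_i, x_{i+1}] with its bound [l_i, u_i]
    return [((splits[i], splits[i + 1]), (lowers[i], uppers[i]))
            for i in range(len(lowers))]

# Toy example with 3 uneven intervals (not real AURA output):
demo = "0.0,0.5,1.5,4.0\n0.1,0.3,0.05\n0.2,0.6,0.1\n"
intervals = parse_bounds_csv(demo)
# intervals[0] is ((0.0, 0.5), (0.1, 0.2))
```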
To help visualize the bounds, we provide a Mathematica notebook: `example-1.nb`.
Notice there is another file saved: `lightspeed.result_cuda:0_200_iter_100`. This file contains serialized arrays for the `beta0Grid` and `beta0Bounds`.
To visualize:

- Open `example-1.nb`.
- Replace the `beta0Grid` and `beta0Bounds` in the notebook with the values from the result file.
- Run the whole Mathematica notebook (remember to rerun the definitions of `getAURAPlots`, `beta0Grid`, and `beta0Bounds`), and then call the function:

```
getAURAPlots[beta0Grid, beta0Bounds]
```
This will display the posterior bounds as shaded blocks over the latent space.
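If Mathematica is not at hand, a similar shaded-block plot can be approximated in Python. This sketch (with made-up splits and bounds, not real AURA output) builds step-plot coordinates suitable for matplotlib's `fill_between` with `step="post"`:

```python
def step_coords(splits, bounds):
    # Repeat the last bound so that len(ys) == len(xs), as required by
    # matplotlib's step="post" drawing style
    return splits, bounds + [bounds[-1]]

splits = [0.0, 0.5, 1.5, 4.0]   # x-axis splits (illustrative values)
lower = [0.1, 0.3, 0.05]        # lower bound per interval
upper = [0.2, 0.6, 0.1]         # upper bound per interval
x, y_lo = step_coords(splits, lower)
_, y_hi = step_coords(splits, upper)

# To draw the shaded blocks, uncomment (requires matplotlib):
# import matplotlib.pyplot as plt
# plt.fill_between(x, y_lo, y_hi, step="post", alpha=0.4)
# plt.show()
```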
To evaluate bounds under interval data perturbations, run:

```
./run_all.sh perturb
```
This command perturbs each benchmark by selecting 1–5 observed data points. For each point, the value is either increased or decreased by a small amount.
The amount of change is computed as the standard deviation of all observed data points in the model, multiplied by a scaling factor called `perturb_level`. By default, `perturb_level` is set to `0.01` in `/AURA/python_models/driver.py`.
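The perturbation scheme described above can be sketched as follows (the function name and the toy data are illustrative; the real logic lives in `driver.py`):

```python
import random
import statistics

def perturb(data, n_points=3, perturb_level=0.01, seed=0):
    """Shift n_points observations by +/- (population std of data) * perturb_level."""
    rng = random.Random(seed)
    delta = statistics.pstdev(data) * perturb_level
    out = list(data)
    for i in rng.sample(range(len(out)), n_points):
        # Each selected point is either increased or decreased by delta
        out[i] += delta if rng.random() < 0.5 else -delta
    return out, delta

# Toy observed data (not from a real benchmark)
data = [28.0, 26.0, 33.0, 24.0, 34.0, 27.0]
perturbed, delta = perturb(data)
```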
For each parameter, AURA saves the output files in the same format and location (under `/AURA/python_models/result_cr/`) as described before. The filenames will include the suffix `_perturb_0.01` (or the specified perturbation level), for example:

```
lightspeed_param_beta[0]_cuda:0_200_iter_100_perturb_0.01.csv
lightspeed.result_cuda:0_200_iter_100_perturb
```

These files contain the posterior bounds under perturbed data and can be visualized using the Mathematica notebook `example-1.nb`.
To add a new benchmark model (e.g. `<model_name>`), follow these steps:

- Create a directory `benchmarks/stan_bench/<model_name>/`. Add the model file as `<model_name>.stan` and the data file as `<model_name>.data.R`. See `benchmarks/stan_bench/lightspeed/` for an example.
- Edit the benchmark list in `src/main/java/aqua/PytorchCompilerMain.java`: add `"<model_name>"` to the list. Then compile:

```
mvn package -DskipTests=true
java -cp "target/aqua-1.0.jar:lib/storm-1.0.jar" aqua.PytorchCompilerMain
```

This will generate `python_models/<model_name>.py`.

- Run the model with (change `<model_name>`):

```
cd python_models
python3 -c "import <model_name> as original; exec(open('driver.py').read())"
```

To adjust the device, the number of splits, or to enable data perturbation, refer to `python_models/run_all.sh`.