This is the official repository for Digital Twins of Ex Vivo Human Lungs.
Streamlit Server Issues (Updated on September 24, 2025)
Streamlit released a new version on September 24, 2025 at 5:30 PM (UTC-4).
The web app may be temporarily unavailable while the Streamlit server updates to the new version. Please wait a moment and try again shortly.
- Background and Overview
- Getting Started
- DT Workflow
- Inference
- Troubleshooting Errors
- Stay up-to-date & Report Issues
- Ex vivo lung perfusion (EVLP) is a cutting-edge platform that maintains isolated human lungs in a physiologically active state outside the body, enabling comprehensive functional assessment and targeted therapeutic interventions.
- The concept of a "digital twin" (a dynamic, high-fidelity computational replica of a physical system) is rapidly gaining traction in medicine for its ability to simulate complex biological processes in silico.
- We've built a high-fidelity digital twin of ex vivo human lungs, powered by the world's largest annotated ex vivo lung function dataset.
- The DT model has been validated as a robust digital control for enhanced preclinical therapeutic evaluation.
- Complete pipeline: training scripts, inference modules, Docker, Google Colab notebook, and an interactive web-based app.
Video: Ex vivo lung perfusion system (EVLPvideo.mp4)
The web app offers an easy-to-follow, code-free user interface for seamless DT development that can be tailored to lung-specific conditions at your fingertips.
Launch DT Web-app
We also provide a short tutorial demonstrating how to use the web-based app.
YouTube:
Launch in Colab contains a demo with pre-written code cells showing how to build a digital twin using our demo data: no code edits or local environment setup required.
Docker must be installed and running on your machine.
All Docker images are published on Docker Hub.
This repository provides ready-to-use helper scripts that launch both the web application and main.py in Docker containers.
To grant execute permission to the Docker start scripts, run:
chmod +x docker_run_app.sh
OR
chmod +x docker_run_main.sh
Run Docker container to host the web-based Streamlit app for a code-free deployment experience of the DT demo:
./docker_run_app.sh
Run Docker container to execute main.py to run the DT demo locally:
./docker_run_main.sh
Run Docker container to host the web-based Streamlit app for a code-free deployment experience of the DT demo:
Double-click docker_run_app.bat in File Explorer and click the Local URL in the terminal to open the app in your browser.
Run Docker container to execute main.py to run the DT demo locally:
Double-click docker_run_main.bat in File Explorer and view DT results in the work_dir/DT_Lung/Output folder.
- This DT model works with Python 3.12. Please make sure the correct version is installed before getting started; check with:
python --version
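Equivalently, the version check can be done from Python itself. This is a small illustrative sketch (not part of the repository), with the helper name `version_ok` being an assumption:

```python
import sys

def version_ok(version_info, required=(3, 12)):
    """Return True when the interpreter's major.minor equals `required`."""
    return tuple(version_info[:2]) == required

# Warn early instead of failing mid-pipeline on an unsupported interpreter.
if not version_ok(sys.version_info):
    print("Warning: this pipeline is tested on Python 3.12.")
```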
- Clone this repository:
git clone https://github.com/Sage-Lab-ai/DT_Lung.git
- Create a virtual environment (optional but recommended):
python -m venv env
source env/bin/activate  # On Windows use `env\Scripts\activate`
- All dependencies are listed in the requirements.txt file. To set up the environment, run:
pip install -r requirements.txt
- Run the main.py script:
python main.py
Note: Model weights and demo data are fetched automatically; no manual setup required.
- Once completed, your digital twins built from the demo data are ready! DT results are saved to work_dir/DT_Lung/Output; view them in the Output folder.
- Alternatively, you can host the Streamlit web-based app locally on your machine by running:
streamlit run app.py
Did you know? These models are trained on the largest dataset of its kind.
The digital-twin pipeline in this repository is implemented using two core machine learning architectures: gated recurrent units (GRU) and XGBoost (XGB). The GRU/ and XGB/ directories each contain the scripts needed for model training, including data loading and preprocessing scripts, model architecture definitions and calibration, and utility functions that support the training pipeline.
All inference scripts are located in the inference/ folder, which provides code to load a trained model and generate predicted lung function parameters (inference). For detailed instructions on running inference and building your own digital twin, see the Getting Started section above and choose the inference method (command line, Docker, Google Colab, or web-based app) that best suits you.
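For readers unfamiliar with GRUs, the core recurrence can be illustrated in a few lines of NumPy. This is a conceptual sketch only, not the repository's GRU.py; the weight shapes and the idea of rolling the cell over a sequence of "breaths" are assumptions for illustration:

```python
import numpy as np

def gru_cell(x, h, params):
    """One GRU update step: gates decide how much of the hidden state to keep."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(Wz @ x + Uz @ h)              # update gate
    r = sigmoid(Wr @ x + Ur @ h)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_tilde        # blend old state with candidate

# Tiny demo: 3 input features per breath, hidden size 4, random weights.
rng = np.random.default_rng(0)
params = [rng.standard_normal((4, 3)) if i % 2 == 0 else rng.standard_normal((4, 4))
          for i in range(6)]
h = np.zeros(4)
for t in range(5):  # unroll the cell over 5 time steps
    h = gru_cell(rng.standard_normal(3), h, params)
print(h.shape)
```

The gating is what lets the model carry information across a long breath-by-breath sequence without it washing out.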
project-root/
├── main.py                              # Primary entry point for the DT_Lung pipeline
├── GRU                                  # Pipeline for GRU model training and calibration
│   ├── __init__.py
│   ├── scripts
│   │   ├── 20_fold_cv_all.sh            # 20-fold cross-validation for all breath setups and variables
│   │   └── lock_all_models.sh           # Train and save models for all breath setups and variables
│   ├── util
│   │   ├── baseline.py                  # Baseline implementation (historical average and moving average)
│   │   └── static_feats.py              # Defines the list of static features to include in GRU training
│   ├── EVLPMultivariateBreathDataset.py # Custom dataset class for breath-by-breath time-series data with static features
│   ├── forecast_parameters.py           # Model training for single-stage breath setups (A1_A2, A1_A3, A1A2_A3)
│   ├── forecast_parameters_w_pred.py    # Model training for two-stage breath setups (A1PA2_A3)
│   ├── forecasting_pipeline.py          # Reusable functions used in the training pipelines
│   └── GRU.py                           # GRU model class that defines the model architecture
├── XGB                                  # Pipeline for XGBoost model training and calibration
│   ├── __init__.py
│   ├── BaselineModels.py                # Baseline model definitions
│   ├── Dataset.py                       # Dataset class
│   ├── TemporalOrder.py                 # Helper class to parse temporal order in column names
│   ├── TabularForecasting.py            # Classes for training, evaluation, and output organization
│   ├── pipelines.py                     # Pipelines and training example
│   ├── image_gridsearch_static.py       # Image static DT model hyperparameter grid search
│   ├── image_gridsearch_dynamic.py      # Image dynamic DT model hyperparameter grid search
│   ├── image_train_static.py            # Image static DT model training
│   ├── image_train_dynamic.py           # Image dynamic DT model training
│   └── utils.py                         # Utility functions
├── inference
│   ├── __init__.py
│   ├── GRU_inference.py                 # Pipeline for all GRU-based model inference
│   ├── XGB_inference.py                 # Pipeline for all XGB-based model inference
│   ├── reformat.py                      # Improves readability of outputs for general users
│   └── visualization.py                 # Helper functions for DT visualization
├── Dockerfile                           # Defines the container image
├── docker_build.sh                      # Builds the Docker image
├── docker_run_app.sh                    # Runs the Docker container for the web app on macOS/Linux
├── docker_run_app.bat                   # Runs the Docker container for the web app on Windows
├── docker_run_main.sh                   # Runs the Docker container for the main service on macOS/Linux
├── docker_run_main.bat                  # Runs the Docker container for the main service on Windows
├── requirements.txt                     # Python dependencies
├── GLOSSARY.md                          # Units, data range references, acronyms, and domain terms
└── docs                                 # Project page HTML documents
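The baseline module above mentions historical-average and moving-average baselines. As an illustration only (not the repository's implementation), a moving-average forecaster predicts the next value from the mean of the most recent observations:

```python
import numpy as np

def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    series = np.asarray(series, dtype=float)
    return series[-window:].mean()

# Hypothetical compliance readings over four breaths; forecast the fifth.
print(moving_average_forecast([10.0, 12.0, 11.0, 13.0], window=3))  # 12.0
```

Simple baselines like this give the learned GRU and XGB models a floor to beat during evaluation.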
This project follows a custom naming convention for model configurations and variable identifiers. Detailed explanations are provided below to enhance clarity and improve code readability.
A breath setup defines which breaths are included as input and output data to the model. They include:
A1_A2
Setups for Static Digital Lung Forecasting (Forecast 2nd hour lung function using 1st hour baseline data):
A1F50_A2F50
A1F50L50_A2F50
N1L20A1F50L50_A2F50
A1PA2_A3
Setups for Static Digital Lung Forecasting (Forecast 3rd hour lung function using 1st hour baseline and 2nd hour predicted data):
A1F50PA2F50_A3F50
A1F50L50PA2F50_A3F50
N1L20A1F50L50PA2F50_A3F50
A1_A3
Setups for Static Digital Lung Forecasting (Forecast 3rd hour lung function using 1st hour baseline data):
A1F50_A3F50
A1F50L50_A3F50
N1L20A1F50L50_A3F50
A1A2_A3
Setups for Dynamic Digital Lung Forecasting (Forecast 3rd hour lung function using 1st and 2nd hour observed data):
A1F50A2F50_A3F50
A1F50L50A2F50_A3F50
N1L20A1F50L50A2F50_A3F50
Legend: A = assessment period, N = normal breathing period, F = first breaths, L = last breaths, numbers = the number of breaths included.
Note: everything before _ denotes the input variables, and everything after _ denotes the target variable.
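The naming convention above is regular enough to parse mechanically. The helper below is hypothetical (not part of the repository) and exists only to make the convention concrete, splitting a setup name into its input and target segments:

```python
import re

# One token per segment piece: optional "P" (predicted), a period code, a count.
TOKEN = re.compile(r"(P?)([ANFL])(\d+)")

LEGEND = {
    "A": "assessment period",
    "N": "normal breathing period",
    "F": "first breaths",
    "L": "last breaths",
}

def parse_setup(name):
    """Everything before '_' describes the inputs; everything after, the target."""
    inputs, target = name.split("_")
    def expand(segment):
        return [("predicted " if pred else "") + f"{LEGEND[code]} {num}"
                for pred, code, num in TOKEN.findall(segment)]
    return {"inputs": expand(inputs), "target": expand(target)}

print(parse_setup("A1F50PA2F50_A3F50"))
```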
The variable defines which parameter will be forecasted. They include:
- Dynamic Compliance (Dy_Comp)
- Peak Pressure (P_peak)
- Mean Pressure (P_mean)
- Expiratory Volume (Ex_vol)
H1_to_H2: Static digital lung forecasting (Forecast 2nd hour lung function using 1st hour baseline data)
H1_to_H3: Static digital lung forecasting (Forecast 3rd hour lung function using 1st hour baseline data)
H1_predH2_to_H3: Static digital lung forecasting (Forecast 3rd hour lung function using 1st hour baseline data and predicted 2nd hour data)
H1_H2_to_H3: Dynamic digital lung forecasting (Forecast 3rd hour lung function using 1st and 2nd hour observed data)
Due to the large number of parameters, see GLOSSARY.md for all abbreviation definitions.
All trained models developed in this project are published on our HuggingFace Model Repository.
We also provide a Demo Dataset on HuggingFace for users to try out our digital twin models.
We provide 4 distinct methods in the Getting Started section to run DT inference for creating digital twins of human lungs using either our demo data or your own data!
Streamlit Server Issues
The web app may be temporarily unavailable while the Streamlit server updates to a new version. Please wait a moment and try again shortly.
HuggingFace Download Issues
If the model or demo data fails to download from Hugging Face on the first attempt (often due to high request volume), click "Optional: Redownload Models and Data" to try again. Alternatively, refresh the web page and retry.
Please remember to install Docker on your device and grant execute permission to the Docker scripts (as described in Running with Docker).
HuggingFace Download Issues
If the model or demo data fails to download from Hugging Face on the first attempt (often due to high request volume), re-run the cell containing %run main.py, or restart the runtime and re-run the notebook.
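For scripted local setups, transient download failures can also be handled with a generic retry wrapper. This is a sketch under assumptions (the function name and the idea of wrapping a download call are ours, not repository code):

```python
import time

def with_retries(fn, attempts=3, delay=1.0):
    """Call fn(), retrying on any exception; useful around flaky downloads."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the original error
            time.sleep(delay)  # brief back-off before the next try
```

Any download call, such as a Hugging Face fetch, can then be passed in as `fn`.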
Interested in staying up-to-date with our work? Join our mailing list here: Mailing List
If you encounter any bugs or have ideas for improvement, please file an issue here: Open a new issue
This dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. Commercial use is prohibited.