GraphATC: advancing multilevel and multi-label anatomical therapeutic chemical classification via atom-level graph learning
Wengyu Zhang, Qi Tian, Yi Cao, Wenqi Fan, Dongmei Jiang, Yaowei Wang, Qing Li and Xiao-Yong Wei.
Paper PDF | Paper Website | Demo Website
Official implementation of 'GraphATC: advancing multilevel and multi-label anatomical therapeutic chemical classification via atom-level graph learning', published in 'Briefings in Bioinformatics, Volume 26, Issue 2, March 2025' on 26 April 2025, https://doi.org/10.1093/bib/bbaf194
The accurate categorization of compounds within the anatomical therapeutic chemical (ATC) system is fundamental for drug development and fundamental research. Although this area has garnered significant research focus for over a decade, the majority of prior studies have concentrated solely on the Level 1 labels defined by the World Health Organization (WHO), neglecting the labels of the remaining four levels. This narrow focus fails to address the true nature of the task as a multilevel, multi-label classification challenge. Moreover, existing benchmarks like Chen-2012 and ATC-SMILES have become outdated, lacking the incorporation of new drugs or updated properties of existing ones that have emerged in recent years and have been integrated into the WHO ATC system. To tackle these shortcomings, we present a comprehensive approach, GraphATC.
Our contributions:
- We have constructed the most extensive ATC dataset to date.
- We implement the multilevel, multi-label study by extending the task to Level-2 (i.e. L2).
- We build more accurate representations for polymers.
- We optimize the representation learning for macromolecular drugs.
- We build a more effective framework for aggregating component representations of multicomponent drugs.
Table of contents:
- [2025.5.08] The demo website of GraphATC has been released.
- [2025.5.01] The code and dataset of GraphATC has been released.
- [2025.4.26] Our paper has been published in Briefings in Bioinformatics.
- [2025.4.07] Our paper has been accepted by Briefings in Bioinformatics.
- Clone the repository from GitHub.
git clone https://github.com/lookwei/GraphATC.git
cd GraphATC
- Create conda environment.
conda create -n graphatc python=3.10
conda activate graphatc
- Install packages.
conda install pytorch==2.1.1 pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c dglteam/label/cu118 dgl
pip install -r requirements.txt
- Our constructed dataset
ATC-GRAPH
is available in graphatc/dataset directory. - To load the dataset, please refer to the graphatc/dataset/uni_dataset.py file.
Comparison of ATC Benchmark Datasets:
Dataset | Chen-2012 [1] | ATC-SMILES [2] | ATC-GRAPH (Ours) | |
---|---|---|---|---|
Group by | Year | 2012 | 2022 | 2024 |
Polymer | Non-Poly | 3852 | 4545 | 5259 |
Polymer | 23 | 0 | 52 | |
Mass | Small | 3715 | 4353 | 4822 |
Macro | 160 | 192 | 489 | |
#Comp | Single | 2275 | 2685 | 2931 |
Multiple | 1600 | 1860 | 2380 | |
Total | 3883 | 4545 | 5311 | |
Coverage | 67.84% | 79.40% | 92.78% |
- GraphATC on ATC-GRAPH Level 1
bash scripts/train/train_GraphATC_L1.sh
- GraphATC on ATC-GRAPH Level 2
bash scripts/train/train_GraphATC_L2.sh
The training log will be saved in the graphatc/log/
directory.
- GraphATC on ATC-GRAPH Level 1
bash scripts/eval/eval_GraphATC_L1.sh
- GraphATC on ATC-GRAPH Level 2
bash scripts/eval/eval_GraphATC_L2.sh
The evaluation results will be saved in the graphatc/log/
directory.
- Create GraphATC repository;
- Add brief introduction of the GraphATC;
- Release the dataset;
- Release the source code;
- Release the web server;
If you find the repository or the paper useful, please use the following entry for citation.
Wengyu Zhang, Qi Tian, Yi Cao, Wenqi Fan, Dongmei Jiang, Yaowei Wang, Qing Li, Xiao-Yong Wei, GraphATC: advancing multilevel and multi-label anatomical therapeutic chemical classification via atom-level graph learning, Briefings in Bioinformatics, Volume 26, Issue 2, March 2025, bbaf194, https://doi.org/10.1093/bib/bbaf194
@article{zhang2025graphatc,
title={GraphATC: advancing multilevel and multi-label anatomical therapeutic chemical classification via atom-level graph learning},
author={Zhang, Wengyu and Tian, Qi and Cao, Yi and Fan, Wenqi and Jiang, Dongmei and Wang, Yaowei and Li, Qing and Wei, Xiao-Yong},
journal={Briefings in Bioinformatics},
volume={26},
number={2},
pages={bbaf194},
year={2025},
publisher={Oxford University Press}
}
[1] Chen L, Zeng WM, Cai YD, Feng KY, Chou KC. Predicting Anatomical Therapeutic Chemical (ATC) classification of drugs by integrating chemical-chemical interactions and similarities. PLoS One. 2012;7(4):e35254.
[2] Yi Cao, Zhen-Qun Yang, Xu-Lu Zhang, Wenqi Fan, Yaowei Wang, Jiajun Shen, Dong-Qing Wei, Qing Li, and Xiao-Yong Wei. Identifying The Kind Behind SMILES – Anatomical Therapeutic Chemical Classification using Structure-Only Representations, Briefings in Bioinformatics, 2022, DOI:10.1093/bib/bbac346.