pfloos/QUESTDB

🚀 QUESTDB: A Database of Highly-Accurate Excitation Energies




✨ Key Features

  • 🔬 High Accuracy:
    Data obtained using state-of-the-art methods (FCI, CC3, CCSDT, CCSDTQ, CC4, CASPT2/3, NEVPT2, etc.)

  • 🌍 Wide Chemical Coverage:
    Includes small molecules, radicals, charged species, and transition metal complexes.

  • 🎯 Challenging Excitations:
    Focus on double excitations and intramolecular charge-transfer (CT) states.

  • 🛠️ Continuously Updated:
    Regularly improved with new high-level calculations and critical assessments.

  • 📂 Easy-to-Use Format:
    Organized .xlsx spreadsheets and .json files for simple extraction and analysis.


🧪 Why Use QUESTDB?

QUESTDB helps researchers:

  • Benchmark TD-DFT, wavefunction-based, and emerging excited-state methods.
  • Guide the development of new computational models.
  • Facilitate interpretation of experimental spectra and photochemistry.

Note: Our vision is to establish QUESTDB as a cornerstone resource for benchmarking and training the next generation of AI-driven models in excited-state science.


⚙️ Scripts for Subset Generation and Analysis

This repository includes Python scripts that generate representative "diet" subsets of QUEST excitation energies (see the data/diet directory). A diet subset is a small set of transitions, for instance 50, 100, or 200, that reproduces the statistical properties of the full database (e.g., MAE, MSE, and RMSE) across different computational methods and excitation categories.

These tools are especially useful for benchmarking new methods quickly or for training machine learning models when computational cost is a limiting factor.

Main functionalities include:

  • ✅ Generation of optimized subsets matching the full dataset’s distribution across:
    • Spin states
    • Valence vs Rydberg states
    • Excitation types (e.g., nπ*, ππ*, etc.)
    • Molecule sizes or other custom filters
  • ✅ Support for flexible user-defined filters (e.g., only valence, only singlets, exclude genuine doubles)
  • ✅ Preservation of full metadata in output JSON files
  • ✅ Optional optimization of subset selection using a genetic algorithm with Bayesian hyperparameter tuning (via optuna)
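The idea behind a "diet" subset can be illustrated with a minimal sketch. It is not the repository's implementation: it uses plain random search rather than the genetic algorithm or optuna tuning mentioned above, and the function names (`error_stats`, `subset_cost`, `pick_diet_subset`) and the synthetic error data are hypothetical. MSE is taken here to be the mean signed error, which is an assumption about the convention in use.

```python
import random


def error_stats(errors):
    """Return (MSE, MAE, RMSE) for signed errors (method minus reference).

    MSE is assumed here to be the mean *signed* error.
    """
    n = len(errors)
    mse = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = (sum(e * e for e in errors) / n) ** 0.5
    return mse, mae, rmse


def subset_cost(errors_by_method, indices, full_stats):
    """Sum of squared deviations between subset and full-set statistics."""
    cost = 0.0
    for method, errors in errors_by_method.items():
        sub_stats = error_stats([errors[i] for i in indices])
        cost += sum((s - f) ** 2 for s, f in zip(sub_stats, full_stats[method]))
    return cost


def pick_diet_subset(errors_by_method, size, n_trials=2000, seed=0):
    """Random search for the subset whose statistics best match the full set."""
    rng = random.Random(seed)
    n_total = len(next(iter(errors_by_method.values())))
    full_stats = {m: error_stats(e) for m, e in errors_by_method.items()}
    best_indices, best_cost = None, float("inf")
    for _ in range(n_trials):
        candidate = sorted(rng.sample(range(n_total), size))
        cost = subset_cost(errors_by_method, candidate, full_stats)
        if cost < best_cost:
            best_indices, best_cost = candidate, cost
    return best_indices, best_cost


if __name__ == "__main__":
    # Synthetic signed errors (in eV) for two hypothetical methods.
    data_rng = random.Random(1)
    synthetic = {
        "method_A": [data_rng.gauss(0.05, 0.20) for _ in range(500)],
        "method_B": [data_rng.gauss(-0.10, 0.35) for _ in range(500)],
    }
    indices, cost = pick_diet_subset(synthetic, size=50)
    print(f"selected {len(indices)} transitions, cost = {cost:.3e}")
```

A category-aware variant would apply the same cost within each group (spin state, valence vs Rydberg, excitation type) so that the subset also preserves the composition of the full set.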

📂 Repository Contents

This repository provides:

  • Molecular Structures
  • Vertical Excitation Energies
  • Oscillator Strengths
  • Many Other Properties

Data is structured in .xlsx and .json files for ease of use (see the data directory).

📌 See the accompanying paper:
The QUEST database of highly-accurate excitation energies
P.-F. Loos, M. Boggio-Pasqua, A. Blondel, F. Lipparini, and D. Jacquemin,
J. Chem. Theory Comput. (in press) DOI:10.1021/acs.jctc.5c00975

© Béatrice Lejeune (@bea_quarelle)

👥 Contributors

The QUESTDB project is maintained by a collaboration between:


📚 Main References

Review articles on the QUEST database:

Key QUESTDB publications:


📖 Other References


🔋 Extension to Charged Excitations

The QUEST database also contains charged excitations, currently mainly ionization potentials (IPs). A short description of the charged excited states included in QUEST is available in the charged directory.


🗂️ Data Structure

  • Molecular Structures:
    .xyz or .TeX formats

  • Excitation Energies, Oscillator Strengths and Other Properties:
    .xlsx spreadsheets and .json files

  • Scripts to Convert and Analyze Data:
    .py scripts to convert data from one format to another and analyze it.

  • Additional Metadata:
    (Planned for future releases)
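For readers who want to load the .xyz structures without extra dependencies, here is a minimal sketch of a parser for the generic XYZ layout (atom count, comment line, then one `symbol x y z` line per atom). The function name `parse_xyz` and the sample geometry are hypothetical, and the sketch assumes the repository's files follow this standard layout; coordinates are returned in whatever units the file uses.

```python
def parse_xyz(text):
    """Parse a single-geometry XYZ string into (comment, atoms).

    atoms is a list of (symbol, x, y, z) tuples with float coordinates.
    """
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])          # first line: number of atoms
    comment = lines[1].strip() if len(lines) > 1 else ""  # second line: free text
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError(f"expected {n_atoms} atoms, found {len(atoms)}")
    return comment, atoms


if __name__ == "__main__":
    # Illustrative geometry only, not taken from the database.
    sample = (
        "3\n"
        "water (illustrative)\n"
        "O 0.0 0.0 0.1\n"
        "H 0.0 0.75 -0.47\n"
        "H 0.0 -0.75 -0.47\n"
    )
    comment, atoms = parse_xyz(sample)
    print(comment, atoms)
```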


💰 Funding

This database is supported by the PTEROSOR project, funded by the European Research Council (ERC) under the EU Horizon 2020 research and innovation program (Grant Agreement No. 863481).


🧮 HPC resources

This work was performed using HPC resources from CALMIP (Toulouse, France) under allocations 2018-18005 through 2025-18005, as well as resources provided by GLiCID (Nantes, France).