QM40-dataset-for-ML

QM40 is a QMx-type dataset of 150K molecules optimized at the B3LYP/6-31G(2df,p) level of theory in Gaussian16. For each molecule it provides QM parameters, optimized coordinates, Mulliken charges, and local vibrational mode parameters, which serve as a quantitative measure of bond strength.


Keywords
QM40_dataset_for_ML
License
MIT
Install
pip install QM40-dataset-for-ML==0.0.1

Documentation

QM40_dataset_for_ML

Features

  • Categorize SMILES strings by heavy-atom count.
  • Screen SMILES strings for specific atoms.
  • Convert SMILES strings to PDB and XYZ files.
  • Run semi-empirical QM calculations (xTB).
  • Generate Gaussian16 input files automatically.
  • Generate sbatch files for HPC job submission automatically.
  • Run local vibrational mode (LmodA) calculations.
  • Extract QM parameters, geometries, Mulliken charges, and LmodA data from Gaussian output files.
  • Export the extracted data to CSV files.
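
As an illustration of the input-file and sbatch generation steps listed above, here is a minimal sketch using only the Python standard library. It is not the package's own API: the function names gaussian_input and sbatch_script, the "opt freq" route keywords, the resource defaults, and the file naming are all assumptions made for this example. Only the B3LYP/6-31G(2df,p) level of theory comes from the dataset description.

```python
def gaussian_input(name, atoms, charge=0, multiplicity=1,
                   route="# opt freq B3LYP/6-31G(2df,p)",
                   nproc=8, mem_gb=16):
    """Build the text of a Gaussian input file (hypothetical helper).

    atoms is a list of (element, x, y, z) tuples, coordinates in angstroms.
    The route line is an assumption; only the method/basis set is taken
    from the dataset description.
    """
    lines = [
        f"%nprocshared={nproc}",   # Link 0: number of shared-memory cores
        f"%mem={mem_gb}GB",        # Link 0: memory available to Gaussian
        f"%chk={name}.chk",        # Link 0: checkpoint file
        route,
        "",                        # blank line ends the route section
        name,                      # title line
        "",                        # blank line ends the title section
        f"{charge} {multiplicity}",
    ]
    for element, x, y, z in atoms:
        lines.append(f"{element:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")               # blank line ends the geometry
    return "\n".join(lines) + "\n"


def sbatch_script(name, nproc=8, mem_gb=16, walltime="24:00:00"):
    """Build a SLURM batch script that runs Gaussian16 on name.com.

    Resource values are placeholders; how Gaussian is made available
    (module, PATH, scratch directory) is site-specific and not set here.
    """
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={name}",
        "#SBATCH --nodes=1",
        f"#SBATCH --cpus-per-task={nproc}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={walltime}",
        f"#SBATCH --output={name}.slurm.out",
        "",
        "# Load or configure Gaussian16 here as required by your cluster.",
        f"g16 < {name}.com > {name}.log",
        "",
    ])


if __name__ == "__main__":
    # Approximate water geometry, used purely as example input.
    water = [
        ("O", 0.000, 0.000, 0.117),
        ("H", 0.000, 0.757, -0.469),
        ("H", 0.000, -0.757, -0.469),
    ]
    with open("water.com", "w") as fh:
        fh.write(gaussian_input("water", water))
    with open("water.sbatch", "w") as fh:
        fh.write(sbatch_script("water"))
```

Writing both files from the same name keeps the input, log, and checkpoint files paired, which makes later extraction of data from the Gaussian output files easier to automate.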

Installation

pip install QM40-dataset-for-ML

Dependencies

QM40_dataset_for_ML's Python dependencies are listed in its requirements.txt file.

Linux