sadic

A reimplementation of the Simple Atom Depth Index Calculator (SADIC) as a Python package


Keywords
protein, atom, depth
License
MIT
Install
pip install sadic==2.0.0

Documentation

SADIC v2: A Modern Implementation of the Simple Atom Depth Index Calculator

This repository contains the source code for SADIC v2, a modern implementation of the Simple Atom Depth Index Calculator. It computes the SADIC depth index, a measure of atom depth in protein molecules.

The package is designed to be easy to use and to provide a fast and efficient implementation of the algorithm.

It can be used as a command line tool or as a Python library. The package exposes a function for each individual step of the algorithm, giving the user full control over the computation, as well as a high-level function that computes the SADIC indices in a single line of Python code.

Authors

  • Giacomo Nunziati
  • Alessia Lucia Prete
  • Sara Marziali

Install

Install from PyPI

  pip install sadic

Install from source

  git clone https://github.com/nunziati/sadic.git
  cd sadic
  pip install .

Requirements

SADIC v2 requires the packages numpy, scipy, biopandas and biopython.

The requirements are installed automatically together with this package.

To install the requirements separately, run the following command:

  pip install -U -r requirements.txt

Usage

The algorithm processes a protein structure and computes the depth index for each atom in the structure.
The protein structure can be provided as a PDB code or as a path to a PDB file. The package is integrated with BioPython and BioPandas, so the input can also be provided as a BioPython Structure object or a BioPandas PDB Entity object.

Command Line Interface (CLI)

The CLI is a simplified interface to the package.
It accepts only a PDB code or a path to a PDB file as input, and returns the output as a PDB file.

  sadic <input> --output <output> [--config <config_file>]

Input can be:

  • a PDB code of a protein structure
  • a path to a PDB file (.pdb or .tar.gz)

Output must be a path to a PDB file (.pdb or .tar.gz).

The config file is optional and, if specified, must be a path to a Python file (.py) containing two dictionaries:

  • sadic_config: a dictionary containing the configuration parameters for the SADIC algorithm
  • output_config: a dictionary containing the configuration parameters for the output file
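As an illustration of the config file format described above, the sketch below writes a small Python file defining the two dictionaries and reads them back with the standard library. The keys `example_option` and `example_flag` and the helper `load_config` are hypothetical placeholders: the actual configuration parameters are not listed in this section, and this is not necessarily how SADIC itself loads the file.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical config file contents. The keys are illustrative placeholders,
# not documented SADIC parameters.
CONFIG_SOURCE = '''
sadic_config = {"example_option": 1.0}
output_config = {"example_flag": True}
'''


def load_config(path):
    """Execute a Python config file and return its two dictionaries."""
    namespace = runpy.run_path(str(path))
    sadic_config = namespace.get("sadic_config", {})
    output_config = namespace.get("output_config", {})
    if not isinstance(sadic_config, dict) or not isinstance(output_config, dict):
        raise TypeError("sadic_config and output_config must be dictionaries")
    return sadic_config, output_config


# Write the example file to a temporary directory and load it back.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.py"
    config_path.write_text(CONFIG_SOURCE)
    sadic_config, output_config = load_config(config_path)

print(sadic_config)
print(output_config)
```

Keeping the configuration in a plain Python file means the dictionaries can be built with ordinary expressions, at the cost of executing the file when it is read.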

Python interface

Simple usage

  import sadic

  # Input protein
  pdb_code = "1GWD" 

  # Run the pipeline
  result = sadic.sadic(pdb_code)

  # (optional) Useful to retrieve the depth indices from the result object
  output = result.get_depth_index()

  # Save the output to a file
  result.save_pdb("1gwd_sadic.pdb")

Filters and aggregations

Note: filters, atom aggregations and model aggregations are optional and independent of each other.
They can be used in any combination.

  import sadic

  # Input protein
  pdb_code = "1GWD" 

  # Define the filter options
  # Only return the SADIC indices for the atoms composing the alanine and glycine residues
  filter_arg = {"residue_name": ["ALA", "GLY"]}

  # Define the atom aggregation options
  # Compute the depth index for each residue by averaging the depth indices of the atoms composing it
  group_by = "residue_number"
  aggregation_function = "mean"
  atom_aggregation_arg = (group_by, aggregation_function)

  # Define the model aggregation options
  # If the pdb file contains multiple models, they can be aggregated
  # In this case, the depth indices of corresponding atoms in different models are averaged
  model_aggregation_arg = "mean"

  # Run the pipeline
  # Filter by residue name
  result = sadic.sadic(pdb_code, filter_by = filter_arg)

  # (optional) Useful to retrieve the depth indices from the result object
  # Aggregate the depth indices of the atoms of the same residue
  output = result.get_depth_index(atom_aggregation = atom_aggregation_arg)

  # Save the output to a file
  # Aggregate the depth indices of the different models
  result.save_pdb("1gwd_sadic.pdb", model_aggregation = model_aggregation_arg)
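To make the three options concrete, here is a small standalone sketch of what filtering by residue name, averaging per residue and averaging across models mean, using only the standard library. The atom records and depth values are made up for illustration, and the helper functions are hypothetical; they do not reproduce the package's internal implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-atom records: (residue_name, residue_number, depth_index)
atoms = [
    ("ALA", 1, 0.25),
    ("ALA", 1, 0.75),
    ("GLY", 2, 1.0),
    ("GLY", 2, 1.5),
    ("SER", 3, 0.5),
]


def filter_atoms(records, residue_names):
    """Keep only the atoms whose residue name is in the given set."""
    return [rec for rec in records if rec[0] in residue_names]


def aggregate_by_residue(records):
    """Average the depth indices of the atoms sharing a residue number."""
    groups = defaultdict(list)
    for _, number, depth in records:
        groups[number].append(depth)
    return {number: mean(values) for number, values in groups.items()}


def aggregate_models(models):
    """Average the depth index of corresponding atoms across models."""
    return [mean(values) for values in zip(*models)]


selected = filter_atoms(atoms, {"ALA", "GLY"})
per_residue = aggregate_by_residue(selected)
print(per_residue)  # {1: 0.5, 2: 1.25}

# Two hypothetical models of the same two atoms
across_models = aggregate_models([[0.5, 1.0], [1.0, 1.5]])
print(across_models)  # [0.75, 1.25]
```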

Software

Our approach models each protein as a solid object composed of spheres centered on the individual atoms.
SADIC simulates probing of the protein by computing the largest sphere inscribed in its molecular structure.
Let $r$ be the radius of this sphere and $V_{r_{max}}$ its volume.
During the simulation, the reference sphere is centered on each atom $i$ in turn, and the exposed volume $V_{r,i}$ is calculated.
The depth index $D_{i,r}$ of the $i$-th atom is then given by:

$$ D_{i,r} = \frac{2V_{r,i}}{V_{r_{max}}} $$

The exposed volume $V_{r,i}$ is the volume of the portion of the reference sphere centered on the $i$-th atom that does not intersect the solid representation of the protein.
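A short numeric sketch of the formula may help. It assumes that $V_{r_{max}}$ is the full volume of the reference sphere, as stated above, and uses a hypothetical radius of 10. The function names are illustrative and are not part of the package API.

```python
import math


def sphere_volume(radius):
    """Volume of a sphere of the given radius."""
    return 4.0 / 3.0 * math.pi * radius ** 3


def depth_index(exposed_volume, reference_volume):
    """Depth index D = 2 * V_exposed / V_reference, as in the formula above."""
    if reference_volume <= 0:
        raise ValueError("reference_volume must be positive")
    return 2.0 * exposed_volume / reference_volume


r = 10.0  # hypothetical reference radius
v_max = sphere_volume(r)

# Reference sphere entirely inside the protein solid: nothing is exposed.
buried = depth_index(0.0, v_max)

# Half of the reference sphere lies outside the solid, e.g. an atom on a flat surface.
half_exposed = depth_index(0.5 * v_max, v_max)

print(buried)        # 0.0
print(half_exposed)  # 1.0
```

Under this formula, a smaller index corresponds to a more deeply buried atom.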

Main algorithm

The SADIC v2 algorithm proceeds in several stages:

  • Loading of protein data;
  • Creation of the structured PDB entity;
  • For each model found in the PDB file:
    • Creation of the continuous-space model of the protein under analysis;
    • Voxelization and definition of the discrete-space model approximating the protein solid;
    • Filling of the internal cavities of the protein;
    • Computation of the reference radius, which is used for the depth index calculation;
    • Computation of the depth indices for the atoms selected by the user.
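The per-model stages listed above can be sketched on a toy example with NumPy and SciPy. This is a minimal illustration under stated assumptions, not the package's implementation: the "protein" is a hypothetical 3x3x3 lattice of unit spheres, the functions `voxelize` and `exposed_fraction` are invented for this sketch, cavities are filled with `scipy.ndimage.binary_fill_holes`, and the reference radius is approximated as the maximum of a Euclidean distance transform of the filled solid.

```python
import numpy as np
from scipy import ndimage


def voxelize(centers, radii, resolution):
    """Boolean grid that is True where a grid point lies inside an atom sphere."""
    lower = (centers - radii[:, None]).min(axis=0) - resolution
    upper = (centers + radii[:, None]).max(axis=0) + resolution
    shape = tuple(int(n) for n in np.ceil((upper - lower) / resolution) + 1)
    axes = [lower[d] + resolution * np.arange(shape[d]) for d in range(3)]
    x, y, z = np.meshgrid(*axes, indexing="ij")
    grid = np.zeros(shape, dtype=bool)
    for (cx, cy, cz), r in zip(centers, radii):
        grid |= (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r ** 2
    return grid, lower


def exposed_fraction(solid, origin, resolution, center, radius):
    """Fraction of the sampled reference sphere that lies outside the solid."""
    n = int(np.ceil(radius / resolution))
    offsets = resolution * np.arange(-n, n + 1)
    dx, dy, dz = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    in_sphere = dx ** 2 + dy ** 2 + dz ** 2 <= radius ** 2
    points = np.stack([dx, dy, dz], axis=-1)[in_sphere] + center
    idx = np.round((points - origin) / resolution).astype(int)
    # Sample points outside the grid are outside the protein, hence exposed.
    in_bounds = np.all((idx >= 0) & (idx < np.array(solid.shape)), axis=1)
    occupied = np.zeros(len(idx), dtype=bool)
    occupied[in_bounds] = solid[tuple(idx[in_bounds].T)]
    return 1.0 - occupied.mean()


# Continuous-space model: hypothetical lattice of 27 overlapping unit spheres.
resolution = 0.5
coords = np.arange(3, dtype=float)
centers = np.array([[x, y, z] for x in coords for y in coords for z in coords])
radii = np.ones(len(centers))

# Discrete-space model, then filling of internal cavities.
grid, origin = voxelize(centers, radii, resolution)
solid = ndimage.binary_fill_holes(grid)

# Reference radius: largest distance from a solid voxel to the background
# (an assumption standing in for the largest inscribed sphere).
reference_radius = ndimage.distance_transform_edt(solid, sampling=resolution).max()

# Depth index D = 2 * exposed fraction for two selected atoms.
selected = {"center": (1.0, 1.0, 1.0), "corner": (0.0, 0.0, 0.0)}
depth = {
    name: 2.0 * exposed_fraction(solid, origin, resolution, np.array(c), reference_radius)
    for name, c in selected.items()
}
print(reference_radius, depth)
```

In this toy solid the central atom gets a lower index than the corner atom, since less of its reference sphere falls outside the solid. The grid resolution trades accuracy of the volume estimate against memory and run time.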

Architecture

The software architecture of SADIC v2 is organized into distinct sub-packages:

  • pdb for organizing the data of the input protein and managing the result of the execution of the algorithms;
  • solid for modeling and manipulating the continuous-space and discrete-space solids representing the molecule;
  • algorithm, where the core algorithms are defined.

The main sadic package exposes an API with a single function for executing the depth index computation pipeline.


Functionalities

Different types of input are supported:

  • PDB code
  • PDB file (raw .pdb or compressed .tar.gz)
  • BioPython Structure object
  • BioPandas PDB Entity object

The user can specify different options:

  • Reference sphere radius
  • van der Waals radii for the atoms
  • Grid resolution for the discretization of the protein
  • Protein models to consider (in case of multiple models)
  • Atom filters, to select only a subset of atoms
  • Atom aggregations, to compute the depth index for groups of atoms
  • Model aggregations, to obtain a single depth index for each atom (in case of multiple models)

The output can be obtained in different forms:

  • Python list
  • NumPy array
  • .txt file
  • .npy file (NumPy)
  • PDB file (raw .pdb or compressed .tar.gz)

License

This project is MIT licensed.