A native python binding for blat suite


Keywords
bioinformatics, sequence, aligner, blat, alignment, c, cpp, pybind11, python, sequence-analysis
License
MIT-feh
Install
pip install pxblat==0.3.2

Documentation

logo PxBLAT social

An Efficient and Ergonomic Python Binding Library for BLAT

python c++ c pypi conda Linux macOS pyversion tests Codecov docs download condadownload precommit ruff release open-issue close-issue activity lastcommit opull all contributors

Why PxBLAT?

When conducting extensive queries, using the blat of BLAT suite can prove to be quite inefficient, especially if these operations aren't grouped. The tasks are allocated sporadically, often interspersed among other tasks. In general, the choice narrows down to either utilizing blat or combining gfServer with gfClient. Indeed, blat is a program that launches gfServer, conducts the sequence query via gfClient, and then proceeds to terminate the server.

This approach is far from ideal when performing numerous queries that aren't grouped since blat repeatedly initializes and shuts down gfServer for each query, resulting in substantial overhead. This overhead consists of the time required for the server to index the reference, contingent on the reference's size. To index the human genome (hg38), for example, would take approximately five minutes.

A more efficient solution would involve initializing gfServer once and invoking gfClient multiple times for the queries. However, gfServer and gfClient are only accessible via the command line. This necessitates managing system calls (for instance, subprocess or os.system), intermediate temporary files, and format conversion, further diminishing performance.

That is why PxBLAT holds its position. It resolves the issues mentioned above while introducing handy features like port retry, use current running server, etc.

📚 Table of Contents

🔮 Features

  • Zero System Calls: Avoids system calls, leading to a smoother, quicker operation.
  • Ergonomics: With an ergonomic design, PxBLAT aims for a seamless user experience.
  • No External Dependencies: PxBLAT operates independently without any external dependencies.
  • Self-Monitoring: No need to trawl through log files; PxBLAT monitors its status internally.
  • Robust Validation: Extensively tested to ensure reliable performance and superior stability as BLAT.
  • Format-Agnostic: PxBLAT doesn't require you to worry about file formats.
  • In-Memory Processing: PxBLAT discards the need for intermediate files by doing all its operations in memory, ensuring speed and efficiency.

📎 Citation

PxBLAT is scientific software, with a published paper in the BioRxiv. Check the published to read the paper.

@article {Li2023pxblat,
	author = {Yangyang Li and Rendong Yang},
	title = {PxBLAT: An Ergonomic and Efficient Python Binding Library for BLAT},
	elocation-id = {2023.08.02.551686},
	year = {2023},
	doi = {10.1101/2023.08.02.551686},
	publisher = {Cold Spring Harbor Laboratory},
	url = {https://www.biorxiv.org/content/10.1101/2023.08.02.551686v2},
	journal = {bioRxiv}
}

🚀 Getting Started

Welcome to PxBLAT! To kickstart your journey and get the most out of this tool, we have prepared a comprehensive documentation. Inside, you’ll find detailed guides, examples, and all the necessary information to help you navigate and utilize PxBLAT effectively.

Need Help or Found an Issue?

If you encounter any issues or if something is not clear in the documentation, do not hesitate to open an issue. We are here to help and appreciate your feedback for improving PxBLAT.

Show Your Support

If PxBLAT has been beneficial to your projects or you appreciate the work put into it, consider leaving a ⭐️ Star on our GitHub repository. Your support means the world to us and motivates us to continue enhancing PxBLAT.

Let’s embark on this journey together and make the most out of PxBLAT! 🎉 Please see the document for details and more examples.

🤝 Contributing

Contributions are always welcome! Please follow these steps:

  1. Fork the project repository. This creates a copy of the project on your account that you can modify without affecting the original project.
  2. Clone the forked repository to your local machine using a Git client like Git or GitHub Desktop.
  3. Create a new branch with a descriptive name (e.g., new-feature-branch or bugfix-issue-123).
git checkout -b new-feature-branch
  1. Take changes to the project's codebase.
  2. Install the latest package
poetry install
  1. Test your changes
pytest -vlsx tests
  1. Commit your changes to your local branch with a clear commit message that explains the changes you've made.
git commit -m 'Implemented new feature.'
  1. Push your changes to your forked repository on GitHub using the following command
git push origin new-feature-branch

Create a pull request to the original repository. Open a new pull request to the original project repository. In the pull request, describe the changes you've made and why they're necessary. The project maintainers will review your changes and provide feedback or merge them into the main branch.

🪪 License

PxBLAT is modified from blat, the license is the same as blat. The source code and executables are freely available for academic, nonprofit, and personal use. Commercial licensing information is available on the Kent Informatics website (https://kentinformatics.com/).

Contributors

yangliz5
yangliz5

🚧
Joshua Zhuang
Joshua Zhuang

🚇

🙏 Acknowledgments


Star History Chart