An Efficient and Ergonomic Python Binding Library for BLAT
When running a large number of queries, the blat program of the BLAT suite can be quite inefficient, especially when the queries are not batched: they arrive sporadically, often interspersed among other tasks.
In general, the choice narrows down to either using blat or combining gfServer with gfClient.
Indeed, blat is a program that launches gfServer, conducts the sequence query via gfClient, and then terminates the server.
This approach is far from ideal when performing numerous ungrouped queries, since blat repeatedly initializes and shuts down gfServer for each query, resulting in substantial overhead.
The overhead is the time the server needs to index the reference, which depends on the reference's size.
Indexing the human genome (hg38), for example, takes approximately five minutes.
A more efficient solution is to initialize gfServer once and invoke gfClient multiple times for the queries.
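To make the difference concrete, the short sketch below compares the two strategies with a simple cost model. The five-minute indexing time comes from the hg38 figure above; the one-second per-query cost and the count of 100 queries are assumptions chosen purely for illustration, and `total_time` is a hypothetical helper, not part of BLAT or PxBLAT.

```python
def total_time(n_queries: int, index_seconds: float, query_seconds: float,
               reuse_server: bool) -> float:
    """Estimate wall-clock seconds for n_queries ungrouped queries.

    With reuse_server=False the reference is re-indexed for every query
    (the blat pattern); with reuse_server=True it is indexed only once
    (one long-lived gfServer, many gfClient calls).
    """
    if n_queries < 0:
        raise ValueError("n_queries must be non-negative")
    if reuse_server:
        index_runs = 1 if n_queries else 0
    else:
        index_runs = n_queries
    return index_runs * index_seconds + n_queries * query_seconds


INDEX_SECONDS = 5 * 60  # roughly five minutes for hg38, as stated in the text
QUERY_SECONDS = 1.0     # assumed per-query cost, illustrative only

restart_each_time = total_time(100, INDEX_SECONDS, QUERY_SECONDS, reuse_server=False)
shared_server = total_time(100, INDEX_SECONDS, QUERY_SECONDS, reuse_server=True)
print(restart_each_time, shared_server)  # 30100.0 400.0
```

Under these assumptions, indexing dominates the total as soon as more than a handful of queries are issued separately.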
However, gfServer and gfClient are only accessible via the command line.
Using them from Python therefore means managing system calls (for instance, subprocess or os.system), intermediate temporary files, and format conversion, all of which further diminish performance.
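The command-line workflow described above might look roughly like the following sketch. It assumes a gfServer instance is already running on the given host and port and that the gfClient binary is on the PATH; `gfclient_command` and `query_via_cli` are hypothetical helper names, and the argument order follows the usual `gfClient host port seqDir in.fa out.psl` usage.

```python
import subprocess
import tempfile
from pathlib import Path


def gfclient_command(host, port, seq_dir, query_fa, out_psl):
    """Build the argument list for one gfClient call."""
    return ["gfClient", host, str(port), str(seq_dir), str(query_fa), str(out_psl)]


def query_via_cli(sequence, host="localhost", port=65000, seq_dir="."):
    """Run a single query through the gfClient binary and return raw PSL text.

    Assumes gfServer is already listening on host:port (an assumption of
    this sketch). Every call pays for a temporary FASTA file, a child
    process, and a temporary PSL file that still has to be parsed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        query_fa = Path(tmp) / "query.fa"
        out_psl = Path(tmp) / "result.psl"
        # Intermediate file 1: the query has to be written to disk as FASTA.
        query_fa.write_text(f">query\n{sequence}\n")
        # System call: spawn gfClient as a separate process.
        subprocess.run(
            gfclient_command(host, port, seq_dir, query_fa, out_psl),
            check=True,
        )
        # Intermediate file 2: the result comes back as a PSL file on disk.
        return out_psl.read_text()
```

Each of the three commented steps is a cost that an in-process binding can avoid.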
This is the gap PxBLAT fills.
It resolves the issues mentioned above and adds handy features such as port retry and reuse of an already running server.
- Zero System Calls: Avoids system calls, leading to smoother, quicker operation.
- Ergonomics: With an ergonomic design, PxBLAT aims for a seamless user experience.
- No External Dependencies: PxBLAT operates independently without any external dependencies.
- Self-Monitoring: No need to trawl through log files; PxBLAT monitors its status internally.
- Robust Validation: Extensively tested to ensure reliable performance and stability on par with BLAT.
- Format-Agnostic: PxBLAT doesn't require you to worry about file formats.
- In-Memory Processing: PxBLAT removes the need for intermediate files by performing all its operations in memory, ensuring speed and efficiency.
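As a rough picture of what a port retry feature involves, the sketch below probes successive ports until one is free. This is a generic illustration using only the Python standard library, not PxBLAT's actual implementation; the names `port_is_free` and `pick_port` and the default of five retries are assumptions made for this example.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_port(host: str, preferred: int, retries: int = 5, is_free=port_is_free) -> int:
    """Try preferred, preferred + 1, ... and return the first free port.

    Raises RuntimeError when the preferred port and all retries are taken.
    """
    for port in range(preferred, preferred + retries + 1):
        if is_free(host, port):
            return port
    raise RuntimeError(
        f"no free port in range {preferred}-{preferred + retries} on {host}"
    )
```

A check like this is inherently racy, because another process can claim the port between the probe and the server start, so a real implementation would also need to handle a failed start.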
PxBLAT is scientific software, described in a preprint published on bioRxiv. See the preprint for details; its citation entry is:
@article {Li2023pxblat,
author = {Yangyang Li and Rendong Yang},
title = {PxBLAT: An Ergonomic and Efficient Python Binding Library for BLAT},
elocation-id = {2023.08.02.551686},
year = {2023},
doi = {10.1101/2023.08.02.551686},
publisher = {Cold Spring Harbor Laboratory},
url = {https://www.biorxiv.org/content/10.1101/2023.08.02.551686v2},
journal = {bioRxiv}
}
Welcome to PxBLAT! To kickstart your journey and get the most out of this tool, we have prepared comprehensive documentation. Inside, you’ll find detailed guides, examples, and all the necessary information to help you navigate and use PxBLAT effectively.
If you encounter any issues or if something is not clear in the documentation, do not hesitate to open an issue. We are here to help and appreciate your feedback for improving PxBLAT.
If PxBLAT has been beneficial to your projects or you appreciate the work put into it, consider leaving a ⭐️ Star on our GitHub repository. Your support means the world to us and motivates us to continue enhancing PxBLAT.
Let’s embark on this journey together and make the most out of PxBLAT! 🎉 Please see the documentation for details and more examples.
Contributions are always welcome! Please follow these steps:
- Fork the project repository. This creates a copy of the project on your account that you can modify without affecting the original project.
- Clone the forked repository to your local machine using a Git client like Git or GitHub Desktop.
- Create a new branch with a descriptive name (e.g., new-feature-branch or bugfix-issue-123).
git checkout -b new-feature-branch
- Make changes to the project's codebase.
- Install the package and its dependencies
poetry install
- Test your changes
pytest -vlsx tests
- Commit your changes to your local branch with a clear commit message that explains the changes you've made.
git commit -m 'Implemented new feature.'
- Push your changes to your forked repository on GitHub using the following command:
git push origin new-feature-branch
- Open a pull request against the original project repository. In the pull request, describe the changes you've made and why they're necessary. The project maintainers will review your changes and provide feedback or merge them into the main branch.
PxBLAT is modified from blat and is distributed under the same license as blat. The source code and executables are freely available for academic, nonprofit, and personal use. Commercial licensing information is available on the Kent Informatics website (https://kentinformatics.com/).
Contributors: yangliz5 🚧, Joshua Zhuang 🚇