Data Frames are widely used, useful structures for data wrangling. The querier exposes a query language for Python pandas Data Frames, inspired by the querying logic of SQL relational databases.
Contents
Installation | Package description | Contributing | Tests | API Documentation | Dependencies | License
Installation
- From PyPI:
pip install querier
- From GitHub, for the development version:
pip install git+https://github.com/thierrymoudiki/querier.git
Package description
There are currently 9 types of operations available in the querier, with no plan to extend that list much further (to maintain a relatively simple mental model). These verbs will look familiar to dplyr users, but the implementation (I used numpy, pandas and SQLite3) and functions' signatures are different:
- concat: concatenates 2 Data Frames, either horizontally or vertically
- delete: deletes rows from a Data Frame based on given criteria
- drop: drops columns from a Data Frame
- filtr: filters rows of the Data Frame based on given criteria
- join: joins 2 Data Frames based on given criteria (available for completeness of the interface; this operation is already straightforward in pandas)
- select: selects columns from the Data Frame
- summarize: obtains summaries of data based on grouping columns
- update: updates a column, using an operation given by the user
- request: for operations more complex than the previous 8, makes it possible to use a SQL query on the Data Frame
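The request verb runs a SQL query on a Data Frame, and the package lists pandas and SQLite3 among its building blocks. The sketch below shows one way such a step could work with those two libraries alone. It is an illustration under stated assumptions, not the querier's actual implementation: the helper name run_sql_on_frame, its parameters, and the sample data are all hypothetical.

```python
import sqlite3

import pandas as pd


def run_sql_on_frame(df, query, table_name="df"):
    """Hypothetical helper (not part of the querier API).

    Copies the Data Frame into an in-memory SQLite database under
    `table_name`, runs `query` there, and returns the result as a
    new Data Frame.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Write the frame as a SQL table, without the pandas index
        df.to_sql(table_name, conn, index=False)
        # Execute the query and read the rows back into pandas
        return pd.read_sql_query(query, conn)
    finally:
        conn.close()


# Made-up sample data, for illustration only
tips = pd.DataFrame(
    {
        "day": ["Sat", "Sat", "Sun", "Sun"],
        "total_bill": [20.0, 30.0, 10.0, 50.0],
        "tip": [3.0, 5.0, 1.0, 9.0],
    }
)

result = run_sql_on_frame(
    tips,
    "SELECT day, AVG(tip) AS avg_tip FROM df GROUP BY day ORDER BY day",
)
print(result)
```

Using an in-memory database keeps the original Data Frame untouched and leaves nothing on disk; the trade-off is that the data is copied once per query.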
The following notebooks present examples of use of the querier:
- concat example
- delete example
- drop example
- filtr example
- join example
- select example
- summarize example
- update example
- request example
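For readers who want a quick reference point before opening the notebooks, the sketch below shows roughly what each of the first eight verbs corresponds to in plain pandas. It deliberately does not call the querier itself, whose function signatures are documented in the API documentation; the sample data and variable names are made up for illustration.

```python
import pandas as pd

# Made-up sample data, for illustration only
df = pd.DataFrame(
    {
        "name": ["a", "b", "c", "d"],
        "group": ["x", "x", "y", "y"],
        "value": [1, 2, 3, 4],
    }
)

# select: keep only some columns
selected = df[["name", "value"]]

# drop: remove columns
dropped = df.drop(columns=["name"])

# filtr: keep the rows matching a criterion
filtered = df[df["value"] > 2]

# delete: remove the rows matching a criterion
remaining = df[~(df["value"] > 2)]

# update: recompute a column with a user-supplied operation
updated = df.assign(value=df["value"] * 10)

# summarize: aggregate by grouping columns
summary = df.groupby("group", as_index=False)["value"].sum()

# concat: stack vertically (axis=0); axis=1 would place frames side by side
stacked = pd.concat([df, df], axis=0, ignore_index=True)

# join: merge two frames on a shared key
labels = pd.DataFrame({"group": ["x", "y"], "label": ["first", "second"]})
joined = df.merge(labels, on="group", how="inner")
```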
Contributing
Your contributions are welcome and valuable. Please make sure to read the Code of Conduct first.
If you're not comfortable with Git/Version Control yet, please use this form.
In Pull Requests, let's strive to use black for formatting:
pip install black
black --line-length=80 file_submitted_for_pr.py
Tests
TBD
API documentation
https://querier.readthedocs.io/en/latest/
Dependencies
- NumPy
- pandas
- SQLite3
License
BSD 3-Clause © Thierry Moudiki, 2019.