csv2sqllike

Python functions for data analysis using Python's native containers. Load data from csv files and work with it in an SQL-like way.


License

BSD-3-Clause

Install

pip install csv2sqllike==1.6.3

Documentation

csv2sqlLike

csv2sqlLike is a package for simple data analysis on small data sets (<30MB). It provides filtering methods similar to SQL's filtering functions. We hope it is helpful to people who analyze data in the social sciences.

csv2sqlLike consists of two main classes:

  1. PseudoSQLFromCSV
  2. Transfer2SQLDB

PseudoSQLFromCSV is in charge of handling data:

  • load the data and the heads from a csv file, as a nested list and a list respectively
  • filter data by a specific condition
  • group data by a specific head
  • write the data held in the object to a csv file

Transfer2SQLDB is in charge of transferring data between PseudoSQLFromCSV and a DB:

  • create a table in the DB from the data inside PseudoSQLFromCSV
  • fetch data from a table in the DB as a nested list
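
To make the round trip described above concrete, here is a minimal sketch that does not use Transfer2SQLDB at all. It uses the standard-library sqlite3 module only because it needs no server; which database Transfer2SQLDB targets is not stated here. The table name `people` and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical header list and nested list, shaped like the loaded csv data
header = ["region", "name", "age"]
rows = [["east-asia", "kim", 23], ["europe", "marie", 31]]

conn = sqlite3.connect(":memory:")

# Create a table whose columns are the heads
columns = ", ".join(f'"{name}"' for name in header)
conn.execute(f"CREATE TABLE people ({columns})")

# Insert every row of the nested list
placeholders = ", ".join("?" for _ in header)
conn.executemany(f"INSERT INTO people VALUES ({placeholders})", rows)

# Bring the table back as a nested list
fetched = [list(row) for row in conn.execute("SELECT * FROM people ORDER BY rowid")]
conn.close()

print(fetched)
```

The fetched rows come back as tuples, so they are converted to lists to match the nested-list shape used elsewhere in this package.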

Installation

PIP:

pip3 install csv2sqllike

Usage

load data from csv file

import csv2sqllike

data = csv2sqllike.get_data_from_csv("[path_to_file]")
# example
data = csv2sqllike.get_data_from_csv("./data.csv")
# example with an explicit type for each head
data = csv2sqllike.get_data_from_csv(
    "./test.csv",
    type_dict={"region": "str", "country": "str", "name": "str",
               "sex": "str", "university": "str", "age": "int"},
)
# check loaded data type : nested list
print(type(data.data)) 
# check data head
print(data.head)
# check first row data
print(data.data[0])
# check data
print(data.data)
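
For readers who want to picture the loaded structure, the following is a rough sketch of a head list plus a nested list built with the standard csv module. It is not the package's implementation: `load_rows`, `CASTS` and the sample text are hypothetical, and the mapping from type names such as 'int' to Python types is an assumption.

```python
import csv
import io

# Hypothetical sample standing in for ./test.csv
SAMPLE_CSV = """region,country,name,sex,university,age
east-asia,korea,kim,f,snu,23
europe,france,marie,f,sorbonne,31
"""

# Assumed mapping from type names to Python constructors
CASTS = {"str": str, "int": int, "float": float}

def load_rows(text, type_dict):
    """Return (heads, nested list), casting each column by its declared type."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # Heads missing from type_dict stay as strings
    casts = [CASTS[type_dict.get(name, "str")] for name in header]
    rows = [[cast(cell) for cast, cell in zip(casts, row)] for row in reader]
    return header, rows

header, rows = load_rows(SAMPLE_CSV, {"age": "int"})
print(header)
print(rows[0])
```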

filter data using a condition

data.where("[head] [operator] [specific value]")
# example
data.where("region == east-asia")
# check conditions used
print(data.condition_where) 
# check filtered data
print(data.cache_data)
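
As an illustration of what a "[head] [operator] [specific value]" condition means on a nested list, here is a plain-Python sketch. The function `where` below is a stand-in written for this example, not the package's method, and the set of supported operators is an assumption.

```python
import operator

# Hypothetical data in the same shape as the loaded heads and rows
header = ["region", "name", "age"]
rows = [["east-asia", "kim", 23], ["europe", "marie", 31], ["east-asia", "sato", 40]]

# Assumed operator set; the package may support a different one
OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
       "<": operator.lt, ">=": operator.ge, "<=": operator.le}

def where(header, rows, condition):
    """Keep the rows whose column `head` satisfies `head op value`."""
    if not rows:
        return []
    head, op, value = condition.split(maxsplit=2)
    col = header.index(head)
    # Cast the value to the column's type, taken from the first row
    typed_value = type(rows[0][col])(value)
    return [row for row in rows if OPS[op](row[col], typed_value)]

east_asia = where(header, rows, "region == east-asia")
over_30 = where(header, rows, "age > 30")
print(east_asia)
print(over_30)
```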

group data using a specific head

data.group_by("[head]")
# example
data.group_by("region")
# check heads used for grouping data
print(data.condition_group_by)
# check grouping data stored in dictionary
print(data.cache_dict)
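
Grouping into a dictionary can be pictured with the short sketch below. It assumes the result maps each distinct value of the chosen head to the rows that carry it; the exact layout of cache_dict may differ, and `group_by` here is a hypothetical helper, not the package's method.

```python
from collections import defaultdict

# Hypothetical data in the same shape as the loaded heads and rows
header = ["region", "name", "age"]
rows = [["east-asia", "kim", 23], ["europe", "marie", 31], ["east-asia", "sato", 40]]

def group_by(header, rows, head):
    """Map each distinct value of `head` to the list of rows holding it."""
    col = header.index(head)
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    return dict(groups)

groups = group_by(header, rows, "region")
print(groups)
```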

clear cache storage (storage for filtering and grouping results)

# check cache storage before clearing caches
print(data.condition_where)
print(data.cache_data)
print(data.condition_group_by)
print(data.cache_dict)
# clear cache storage
data.clear_cache_data()
# check cache storage after clearing caches
print(data.condition_where)
print(data.cache_data)
print(data.condition_group_by)
print(data.cache_dict)

add head and delete head

print(data.header)
# add new head
data.add_head("new_head")
# check added head
print(data.header)
# delete head
data.delete_head("new_head")
# check heads after deleting specific head
print(data.header)

For more examples and usage, please refer to the Jupyter notebook.

Release History

  • 1.0.0
    • First release
  • 1.0.1
    • Add encoding option(default encode is utf-8)
  • 1.0.2
    • Add auto installing for required package
  • 1.0.3
    • Improve precision on data shape check function