Table of contents generated with markdown-toc
py_vor
This library is intended to ease the use basic operations for data retrieval or storage for BI. This library is still in development.
Main objects
QueryRunner
Object to run queries from SQL files and return them in a specified format.
args
-
engine
: Name of the DB engine to run queries on. Supported engines are:mysql
sqlserver
redshift
postgresql
-
host
: Host name of the DB to run queries on. -
port
: Port number of the DB to run queries on. -
username
: Username to connect to the DB with. -
password
: Password for the user to connect to the DB with. -
schema
: Default schema to run queries on. -
sql_file
: Location of the file that has the queries to run.
kwargs
-
autocommit=False
: Whether the cursor will have autocommit on or off. Can only beTrue
orFalse
. -
returns_rows=False
: Whether the query returns rows or not. UseFalse
for basically everything other thanSELECT
statements. -
text_to_replace=None
: List of strings to replace the strings in thereplace_text_with
kwarg. Must be the same length ofreplace_text_with
. -
replace_text_with=None
: List of strings to replace the strings in thetext_to_replace
kwarg. Must be the same length oftext_to_replace
. -
include_headers=True
: Ifreturn_as
isnested_list
,csv_file
ors3_upload
, this will determine if the results include the column names or not. -
return_as='nested_list'
: Determines the format for the results to be returned. Can be:-
nested_list
: List of lists, where each inner list is a row of the results. -
csv_file
: CSV file where each row is a row of the results. -
s3_upload
: Same as CSV but uploads the results file to S3 and the deletes the local file. -
dict
: Each row is a dictionary in a list inside of{'data': [output]}
. -
pd_dataframe
: Pandas standard data frame.
-
-
results_file=None
: Name and location of the file to put the results on. Only works whenreturn_as
iscsv_file
ors3_upload
. -
aws_access_key_id=None
: AWS Access Key ID. Can remain asNone
if AWS CLI has the keys configured or IAM role is set up in host. -
aws_secret_access_key=None
: AWS Secret Access Key. Can remain asNone
if AWS CLI has the keys configured or IAM role is set up in host. -
aws_region_name='us-east-1'
: Default AWS region name. -
aws_bucket_name=None
: Bucket to put the results on. Only works ifresturn_as
iss3_upload
. -
aws_file_path=''
: S3 path after bucket to put the file on. Does not include the actual file name, for that use theresults_file
kwarg.
Examples
Nested list without headers
>>> from py_vor.tools.get_secret import get_secret
>>> from py_vor.QueryRunner import QueryRunner
>>> secret = get_secret('aws_secret_name')
>>> runner = QueryRunner(secret['engine'], secret['host'], secret['port'], secret['username'], secret['password'],secret['dbname'], 'select_user.sql', returns_rows=True, include_headers=False)
>>> results = runner.execute_all()
>>> results
[['10001', '1953-09-02', 'Georgi', 'Facello', 'M', '1986-06-26']]
>>>
Nested list with headers
>>> from py_vor.tools.get_secret import get_secret
>>> from py_vor.QueryRunner import QueryRunner
>>> secret = get_secret('aws_secret_name')
>>> runner = QueryRunner(secret['engine'], secret['host'], secret['port'], secret['username'], secret['password'],secret['dbname'], 'select_user.sql', returns_rows=True)
>>> results = runner.execute_all()
>>> results
[['emp_no', 'birth_date', 'first_name', 'last_name', 'gender', 'hire_date'], ['10001', '1953-09-02', 'Georgi', 'Facello', 'M', '1986-06-26']]
>>>
Dict
>>> from py_vor.tools.get_secret import get_secret
>>> from py_vor.QueryRunner import QueryRunner
>>> secret = get_secret('aws_secret_name')
>>> runner = QueryRunner(secret['engine'], secret['host'], secret['port'], secret['username'], secret['password'],secret['dbname'], 'select_user.sql', returns_rows=True, return_as='dict')
>>> results = runner.execute_all()
>>> results
{'data': [{'emp_no': '10001', 'birth_date': '1953-09-02', 'first_name': 'Georgi', 'last_name': 'Facello', 'gender': 'M', 'hire_date': '1986-06-26'}]}
>>>
Pandas data frame
>>> from py_vor.tools.get_secret import get_secret
>>> from py_vor.QueryRunner import QueryRunner
>>> secret = get_secret('aws_secret_name')
>>> runner = QueryRunner(secret['engine'], secret['host'], secret['port'], secret['username'], secret['password'],secret['dbname'], 'select_user.sql', returns_rows=True, return_as='pd_dataframe')
>>> results = runner.execute_all()
>>> results
emp_no birth_date first_name last_name gender hire_date
0 10001 1953-09-02 Georgi Facello M 1986-06-26
>>>
Tools
get_secret
Gets secret from AWS Secrets Manager.
args
-
secret_name
: Name of the secret in AWS Secrets Manager.
kwargs
-
aws_access_key_id=None
: AWS Access Key ID. Can remain asNone
if AWS CLI has the keys configured or IAM role is set up in host. -
aws_secret_access_key=None
: AWS Secret Access Key. Can remain asNone
if AWS CLI has the keys configured or IAM role is set up in host. -
aws_region_name='us-east-1'
: Default AWS region name.
Examples
>>> from py_vor.tools.get_secret import get_secret
>>> secret = get_secret('aws_secret_name')
>>> secret
{'username': 'dbusername', 'password': 'pswd', 'engine': 'mysql', 'host': 'localhost', 'port': 3306, 'dbname': 'default_schema', 'dbInstanceIdentifier': 'instanceID'}
>>>
download_from_s3
Download a file from S3.
args
-
file_name
: Name of the file to download from S3. -
aws_bucket_name
: Name of the the bucket the file is located on. -
aws_file_path
: Path where the file is located
kwargs
-
aws_access_key_id=None
: AWS Access Key ID. Can remain asNone
if AWS CLI has the keys configured or IAM role is set up in host. -
aws_secret_access_key=None
: AWS Secret Access Key. Can remain asNone
if AWS CLI has the keys configured or IAM role is set up in host. -
aws_region_name='us-east-1'
: Default AWS region name.
Examples
Straight from bucket
>>> from py_vor.tools.download_from_s3 import download_from_s3
>>> bucket_name = 'bucket-name'
>>> download_from_s3('my_file.txt', bucket_name, '')
>>>
With additional path
>>> from py_vor.tools.download_from_s3 import download_from_s3
>>> bucket_name = 'bucket-name'
>>> download_from_s3('my_file.txt', bucket_name, 'subfolder1/subfolder2')
>>>
upload_to_s3
Upload a file to S3.
args
-
file_name
: Name of the file to download from S3. -
aws_bucket_name
: Name of the the bucket the file is located on. -
aws_file_path
: Path where the file is located
kwargs
-
aws_access_key_id=None
: AWS Access Key ID. Can remain asNone
if AWS CLI has the keys configured or IAM role is set up in host. -
aws_secret_access_key=None
: AWS Secret Access Key. Can remain asNone
if AWS CLI has the keys configured or IAM role is set up in host. -
aws_region_name='us-east-1'
: Default AWS region name.
Examples
Straight to bucket
>>> from py_vor.tools.upload_to_s3 import upload_to_s3
>>> bucket_name = 'bucket-name'
>>> upload_to_s3('my_file.txt', bucket_name, '')
>>>
With additional path
>>> from py_vor.tools.upload_to_s3 import upload_to_s3
>>> bucket_name = 'bucket-name'
>>> upload_to_s3('my_file.txt', bucket_name, 'subfolder1/subfolder2')
>>>
Uploaded to PyPI by using
python setup.py sdist bdist_wheel
twine upload dist/*