kedroio

Extension for `kedro` datasets


Keywords
data, pipelines, kedro
License
MIT
Install
pip install kedroio==0.1.3

Documentation

kedroio

A module that extends the datasets shipped with kedro.

Code style: black, with pre-commit hooks.

Example usage

-- example.sql
select *
from "database"."table_name"
limit 5;

# conf/base/catalog.yml
my_athena_dataset:
  type: kedroio.datasets.aws.athena.AthenaQueryDataSet
  filepath: data/01_raw/example.csv
  sql_filepath: example.sql
  bucket: example-bucket
  workgroup: primary
  subfolder: data
  region_name: eu-west-2
  read_result: true # read into pandas DataFrame
  overwrite: false # skip download if filepath exists
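
The `overwrite: false` comment above describes a skip-if-exists behaviour. The sketch below illustrates that idea in plain Python; it is not kedroio's implementation. `fetch_if_missing` and its `download` callback are hypothetical names introduced here for illustration only.

```python
from pathlib import Path
from typing import Callable


def fetch_if_missing(
    filepath: str,
    download: Callable[[Path], None],
    overwrite: bool = False,
) -> bool:
    """Call `download` unless `filepath` already exists and `overwrite` is false.

    Hypothetical helper: returns True if a download happened,
    False if the existing local file was reused.
    """
    target = Path(filepath)
    if target.exists() and not overwrite:
        # A local copy is already present, so the download is skipped.
        return False
    # Make sure the destination folder (e.g. data/01_raw) exists.
    target.parent.mkdir(parents=True, exist_ok=True)
    download(target)
    return True
```

Under this assumption, a second load with `overwrite: false` would reuse `data/01_raw/example.csv` rather than fetch the query result again, while `overwrite: true` would fetch it every time.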

Testing

Start moto server for mocked AWS resources

moto_server
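
Tests may fail to connect if they start before the server is listening. Here is a minimal readiness check, assuming `moto_server` is running on its default address `127.0.0.1:5000`. `wait_for_port` is a hypothetical helper, not part of kedroio or moto.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 5000, timeout: float = 10.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds pass.

    Hypothetical helper; the default port assumes moto_server's default of 5000.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet: wait briefly and retry.
            time.sleep(0.2)
    return False
```

Calling `wait_for_port()` before the test run, for example from a session-scoped fixture, is one way to avoid racing the server startup.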

Run tests

pytest tests/