Kedro starters by GetInData
At GetInData, we deploy Kedro-based projects to different environments (on-premises and cloud). This repository hosts starters with a few deployment recipes, including ones that use our plugins.
Available starters
- pyspark-iris-running-on-gke uses Google Kubernetes Engine to run a Spark-powered Kedro pipeline in a distributed manner.
- pyspark-iris-running-on-gcp-dataproc-serverless uses Google Cloud Dataproc Batches to run a Spark-powered Kedro pipeline in a distributed manner on Serverless Spark.
Starters development
- Clone the repository and switch to `develop`
- Run `poetry install`
- Run `source $(poetry env info --path)/bin/activate`
  Note: if you use `conda`, you need the extra step of `conda deactivate` to avoid a conflict between `conda` and `venv`
- Install Kedro: `pip install kedro==0.18.3`
- Run `kedro new -s <name-of-the-starter>`
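
The steps above can be collected into one shell function, sketched here for illustration. The function name `setup_starter` is hypothetical and not part of this repository, and `git checkout` is an assumption about how you switch branches. The sketch assumes you have already cloned the repository and are in its root directory, and it leaves the `conda deactivate` step to you, since it only applies when a conda environment is active.

```shell
#!/usr/bin/env bash
# Sketch only: setup_starter is a hypothetical helper, not shipped with this repository.
# Run it from the root of an existing clone.

setup_starter() {
  # Name of the starter to instantiate, e.g. pyspark-iris-running-on-gke
  local starter="${1:?usage: setup_starter <name-of-the-starter>}"

  # Switch to the development branch.
  git checkout develop || return 1

  # Install the development dependencies into a Poetry-managed virtualenv.
  poetry install || return 1

  # If a conda environment is active, run `conda deactivate` before this step
  # to avoid a conflict between conda and the venv.
  source "$(poetry env info --path)/bin/activate" || return 1

  # Install the Kedro version named in the steps above.
  pip install kedro==0.18.3 || return 1

  # Create a new project from the chosen starter.
  kedro new -s "$starter"
}
```

Source the file into your current shell and call, for example, `setup_starter pyspark-iris-running-on-gke`. Sourcing rather than executing the file keeps the activated virtualenv available in your session afterwards.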