glueetl

A command-line tool for deploying AWS Glue jobs with ease :)


License

MIT

Install

pip install glueetl==0.0.5

Documentation

Install

$ pip install glueetl

How to develop a Glue job

You can develop a Glue job by following the steps below.

1. Set up AWS Credentials and Region

Before you can deploy a job to AWS Glue, you must configure your AWS credentials and region.

$ vim ~/.aws/credentials

[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
region=<REGION>
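
Before deploying, it can help to confirm that the profile actually contains all three settings. The sketch below is not part of glueetl; `missing_keys` and `REQUIRED_KEYS` are hypothetical names chosen for illustration, and the check only looks for non-empty values, so it cannot tell whether the credentials are valid.

```python
import configparser

# Settings the credentials file above is expected to provide.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key", "region")


def missing_keys(credentials_text, profile="default"):
    """Return the required keys that are absent or empty in the given profile."""
    # Interpolation is disabled so a '%' inside a secret is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]
```

To check your own file, pass it the contents of `~/.aws/credentials`, for example `missing_keys(pathlib.Path.home().joinpath(".aws", "credentials").read_text())`. An empty list means all three settings are present.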

2. Initialize a Glue job

$ mkdir sample
$ cd sample
$ glueetl init
.
├── README.md
├── config.yaml
└── script.py

config.yaml holds the job properties. The following properties are currently supported:

job: 
  name: sample-glue-job
  role_name: AWSGlueServiceRole
  script_location: s3://glue-job-scripts/sample-glue-job/script.py
  max_concurrent_runs: 10
  command_name: glueetl
  max_retries: 0
  timeout: 28800
  max_capacity: 10
  connections:
    - first_connection
    - second_connection
  default_arguments:
    argument1: value1
    argument2: value2
  non_overridable_arguments:
    argument1: value1
    argument2: value2
  trigger:
    name: trigger-sample-glue-job
    schedule: cron(5 * * * ? *)
  tags:
    key1: value1
    key2: value2

Change the default values in config.yaml to suit your job, and write your job logic in script.py.
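
The property names in config.yaml line up closely with the parameters of the AWS Glue `CreateJob` API. This README does not describe how glueetl performs the translation, so the following is only a sketch of one plausible mapping onto the keyword arguments of boto3's `glue.create_job`. The function name `to_create_job_kwargs` is hypothetical, and the `trigger` section is deliberately left out because triggers are created through a separate API call.

```python
def to_create_job_kwargs(job):
    """Translate the `job` mapping from config.yaml into create_job keyword arguments.

    Assumption: this mirrors the boto3 Glue API, not glueetl's actual internals.
    """
    kwargs = {
        "Name": job["name"],
        "Role": job["role_name"],
        "Command": {
            "Name": job["command_name"],
            "ScriptLocation": job["script_location"],
        },
    }
    # These two properties are nested one level deeper in the API.
    if "max_concurrent_runs" in job:
        kwargs["ExecutionProperty"] = {"MaxConcurrentRuns": job["max_concurrent_runs"]}
    if job.get("connections"):
        kwargs["Connections"] = {"Connections": list(job["connections"])}
    # The remaining properties pass through unchanged under their API names.
    passthrough = {
        "max_retries": "MaxRetries",
        "timeout": "Timeout",  # the Glue API expresses this in minutes
        "max_capacity": "MaxCapacity",
        "default_arguments": "DefaultArguments",
        "non_overridable_arguments": "NonOverridableArguments",
        "tags": "Tags",
    }
    for key, api_name in passthrough.items():
        if key in job:
            kwargs[api_name] = job[key]
    return kwargs
```

Given the sample configuration above, the result would contain entries such as `"Name": "sample-glue-job"` and `"ExecutionProperty": {"MaxConcurrentRuns": 10}`, which makes it easier to see which Glue setting each property controls.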

3. Deploy a Glue job

$ cd sample
$ glueetl deploy

Your job will be deployed to AWS Glue.

4. Run a Glue job

To run your Glue job manually:

$ cd sample
$ glueetl run --arg1=value1 --arg2=value2
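
Glue job runs accept arguments as a mapping whose keys keep the leading `--` (for example `{"--arg1": "value1"}`), which is the shape boto3's `glue.start_job_run` expects for its `Arguments` parameter. Whether glueetl forwards the command-line flags in exactly this way is an assumption; the sketch below only illustrates the conversion, and `parse_job_arguments` is a hypothetical helper name.

```python
def parse_job_arguments(argv):
    """Turn ['--arg1=value1', ...] into {'--arg1': 'value1', ...}."""
    arguments = {}
    for token in argv:
        if not token.startswith("--") or "=" not in token:
            raise ValueError(f"expected --name=value, got {token!r}")
        # Split on the first '=' only, so values may themselves contain '='.
        name, value = token.split("=", 1)
        arguments[name] = value
    return arguments
```

Inside script.py, arguments passed this way are typically read with `getResolvedOptions(sys.argv, ["arg1", "arg2"])` from `awsglue.utils`, which returns them keyed without the leading dashes.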