similarweb-airflow

Airflow is a system to programmatically author, schedule and monitor data pipelines.


Keywords
orchestration, airflow
License
BSD-2-Clause-FreeBSD
Install
puppet module install similarweb-airflow --version 0.1.2

Documentation

Airflow

Table of Contents

  1. Overview
  2. Module Description - What the module does and why it is useful
  3. Setup - The basics of getting started with airflow
  4. Usage - Configuration options and additional functionality
  5. Reference - An under-the-hood peek at what the module is doing and how
  6. Development - Guide for contributing to the module

Overview

This module manages Airflow, the workflow scheduler originally developed at Airbnb.

Module Description

The airflow module sets up and configures airflow.

This module has been tested against airflow versions 1.5.2 and 1.6.2.

Setup

Limitations

This module does not initialize the airflow database schema. You can do so by executing:

airflow initdb

See the Airflow documentation for more information.
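
If you prefer to let Puppet run the schema initialization once, a minimal sketch along the following lines may work. It is not part of this module. It assumes the home folder /usr/local/airflow used in the Usage examples, an airflow executable on the listed path, and Airflow's default SQLite backend, which writes airflow.db into the home folder. The resource title airflow-initdb is hypothetical.

```puppet
# Hypothetical one-time schema initialization; not provided by this module.
# Assumes the default SQLite backend, so the presence of airflow.db is used
# as the "already initialized" guard. Another backend needs a different guard.
exec { 'airflow-initdb':
  command     => 'airflow initdb',
  path        => ['/usr/local/bin', '/usr/bin', '/bin'],
  environment => ['AIRFLOW_HOME=/usr/local/airflow'],
  creates     => '/usr/local/airflow/airflow.db',
  require     => Class['airflow'],
}
```

As written, the command runs as the user Puppet runs as; add a user attribute if the airflow files should belong to a dedicated account.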

The module has been tested on CentOS 7.

The module manages the following:

  • Airflow package.
  • Airflow configuration file.
  • Airflow services.
  • Airflow templates.

Important Note

Please refer to the Airflow installation documentation before using this module.

Setup Requirements

The airflow module depends on the following Puppet modules:

  • puppetlabs-stdlib >= 1.0.0
  • stankevich-python >= 1.9.8
  • camptocamp-systemd >= 0.2.2

Beginning with airflow

Install this module via any of these approaches:

Usage

Main class

Install airflow 1.6.2 to /usr/local/airflow:

class { 'airflow':
  version     => '1.6.2',
  home_folder => '/usr/local/airflow',
}

Install airflow, the scheduler service, and the Celery-based worker:

# The main class is applied first, then each service in turn.
class { 'airflow': } ->
class { 'airflow::service::scheduler': } ->
class { 'airflow::service::worker': }

Hiera Support

  • Example: Defining ldap authentication and mesos settings in hiera.
airflow::ldap_settings:
  ldap_url: ldap://<your.ldap.server>:<port>
  user_filter: objectClass=*
  user_name_attr: uid
  bind_user: cn=Manager,dc=example,dc=com
  bind_password: insecure
  basedn: dc=example,dc=com


airflow::mesos_settings:
  master: localhost:5050
  framework_name: Airflow
  task_cpu: 1
  task_memory: 256
  checkpoint: false
  failover_timeout: 604800
  authenticate: false
  default_principal: admin
  default_secret: admin
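
With data like this in Hiera, Puppet's automatic class parameter lookup should supply the values when the class is declared with a plain include airflow. For comparison, the sketch below passes the same mesos settings directly in a manifest. It assumes the airflow class exposes a mesos_settings hash parameter, which the Hiera key implies but this README does not confirm.

```puppet
# Assumption: the airflow class accepts a mesos_settings hash parameter,
# as implied by the airflow::mesos_settings Hiera key.
class { 'airflow':
  mesos_settings => {
    'master'            => 'localhost:5050',
    'framework_name'    => 'Airflow',
    'task_cpu'          => 1,
    'task_memory'       => 256,
    'checkpoint'        => false,
    'failover_timeout'  => 604800,
    'authenticate'      => false,
    'default_principal' => 'admin',
    'default_secret'    => 'admin',
  },
}
```

Keeping these values in Hiera rather than in the manifest is usually preferable for secrets such as default_secret and bind_password, since the data can then be encrypted or scoped per environment.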

Reference

Classes

Public classes

  • airflow - Installs and configures airflow.
  • airflow::service::worker - Handles airflow's worker service.
  • airflow::service::scheduler - Handles airflow's scheduler service.
  • airflow::service::webserver - Handles airflow's webserver service.
  • airflow::service::flower - Handles airflow's flower service.
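
As a rough sketch, the public classes listed above could be combined on a single node as follows. Each service class is declared without parameters, since this README does not document any. The flower and worker services are normally only useful with a Celery-based setup.

```puppet
# Core installation and configuration.
class { 'airflow': }

# Services, each applied only after the main class.
class { 'airflow::service::scheduler': require => Class['airflow'] }
class { 'airflow::service::webserver': require => Class['airflow'] }
class { 'airflow::service::worker':    require => Class['airflow'] }
class { 'airflow::service::flower':    require => Class['airflow'] }
```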

Private classes

  • airflow::install - Installs airflow python package.
  • airflow::config - Configures airflow.

Contributing

  1. Fork the repository on GitHub.
  2. Create a named feature branch (like add_component_x).
  3. Commit your changes.
  4. Submit a pull request using GitHub.