grai_source_cube


Keywords
data, data-lineage, data-science, dataengineering, datalineage, dbt, django, fivetran, hacktoberfest, mssql, mysql, open-source, parquet, postgresql, python, redshift, snowflake
License
MulanPSL-2.0
Install
pip install grai_source_cube==0.0.1

Documentation


Introduction

Data lineage made simple. Grai makes it easy to understand and test how your data relates across databases, warehouses, APIs, and dashboards.

  • Pre-built connectors. Automatically synchronize lineage from across the stack so your metadata is never out of date.
  • Centralized data tests. Write data validation tests that run whenever upstream data sources change.
  • Integrated with GitHub. Run data validation tasks as part of your CI/CD process to test changes everywhere your data is used.
  • Your data, your cloud. Grai is fully open source and self-hosted. You maintain full control over your data and hosting environment.

How it works

  • Automatically build column-level lineage spanning your warehouse and production services with connectors for dbt, Snowflake, Fivetran, and more (see below).
  • Get alerts in your CI/CD workflows, through GitHub Actions, whenever changes to a production system will impact your warehouse or dbt projects.
  • Self-host the project or run it in the Grai Cloud for free.
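The impact analysis described above can be pictured as a graph traversal over column-level lineage. The sketch below is an illustration only and does not use Grai's own data model or client: the `LINEAGE` mapping, the column names, and the `downstream` helper are all assumptions made for the example.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns listed as its value.
# The column names are illustrative only and are not taken from Grai.
LINEAGE = {
    "prod_db.users.id": ["warehouse.stg_users.user_id"],
    "prod_db.users.email": ["warehouse.stg_users.email"],
    "warehouse.stg_users.user_id": ["dbt.dim_users.user_id", "dbt.fct_orders.user_id"],
    "warehouse.stg_users.email": ["dbt.dim_users.email"],
}


def downstream(column, lineage):
    """Return every column reachable from `column`, in breadth-first order."""
    seen = []
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


if __name__ == "__main__":
    # Which warehouse and dbt columns would a change to users.id touch?
    for col in downstream("prod_db.users.id", LINEAGE):
        print(col)
```

A CI check could run a traversal like this for every column touched by a pull request and report the affected downstream columns.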

Connectors

Integration             Install
Snowflake               pip install grai-source-snowflake
BigQuery                pip install grai-source-bigquery
Redshift                pip install grai-source-redshift
Postgres                pip install grai-source-postgres
MySQL                   pip install grai-source-mysql
SQL Server              pip install grai-source-mssql
dbt                     pip install grai-source-dbt
Fivetran                pip install grai-source-fivetran
csv, parquet, feather   pip install grai-source-flat-file
Metabase                pip install grai-source-metabase
Looker (alpha)          pip install grai-source-looker
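If you are unsure which connectors are already present in an environment, a short script along these lines can report them. It relies only on the standard library's `importlib.metadata`; the package names are copied from the list above, while the `installed_connectors` helper is a hypothetical name introduced for this sketch.

```python
from importlib import metadata

# Package names copied from the connector list above.
CONNECTOR_PACKAGES = {
    "Snowflake": "grai-source-snowflake",
    "BigQuery": "grai-source-bigquery",
    "Redshift": "grai-source-redshift",
    "Postgres": "grai-source-postgres",
    "MySQL": "grai-source-mysql",
    "SQL Server": "grai-source-mssql",
    "dbt": "grai-source-dbt",
    "Fivetran": "grai-source-fivetran",
    "csv, parquet, feather": "grai-source-flat-file",
    "Metabase": "grai-source-metabase",
    "Looker (alpha)": "grai-source-looker",
}


def installed_connectors(packages=CONNECTOR_PACKAGES):
    """Map each integration name to its installed version, or None if absent."""
    found = {}
    for name, package in packages.items():
        try:
            found[name] = metadata.version(package)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_connectors().items():
        print(f"{name}: {version or 'not installed'}")
```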

Quickstart

You can find a full quickstart guide in the documentation, which covers deploying your own instance of Grai and getting set up with your first connector in Python. The fastest way to get started is through the Grai CLI, but you can also run the project locally with docker compose.

Default login credentials:

username: null@grai.io
password: super_secret

CLI

pip install grai-cli
grai demo start

Running Locally

Pre-built images are always available for the backend server at ghcr.io/grai-io/grai-core/grai-server:latest and for the frontend at ghcr.io/grai-io/grai-core/grai-frontend:latest. If you prefer to build from source, you can do so with docker compose.

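git clone https://github.com/grai-io/grai-core
cd grai-core
# Destination "." (the repository root) is an assumption; the original command omitted it.
cp examples/deployment/docker-compose/docker-compose.yml .
docker compose up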

The backend server will be available at http://localhost:8000/ and the frontend at http://localhost:3000/.
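Once the containers are starting up, a quick reachability check can confirm that both services answer on the ports mentioned above. This is a minimal sketch using only the standard library; the `is_up` helper is a hypothetical name, and it treats any HTTP response (including an error status) as proof that the service is running.

```python
import urllib.error
import urllib.request

# URLs taken from the text above.
SERVICES = {
    "backend": "http://localhost:8000/",
    "frontend": "http://localhost:3000/",
}


def is_up(url, timeout=2.0):
    """Return True if anything answers HTTP at `url`, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example with 403 or 404), so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: nothing is listening yet.
        return False


if __name__ == "__main__":
    for name, url in SERVICES.items():
        state = "up" if is_up(url) else "not reachable"
        print(f"{name} ({url}): {state}")
```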

After logging in and connecting a data source, you'll be greeted with a lineage graph.

[Screenshot: Frontend]

For more information about using the web application, check out the getting started guide.

Other Deployment Mechanisms

You can find example configurations for docker compose and Kubernetes in the examples folder.

Helm

We also publish a set of Helm charts if you prefer to deploy with Helm.

helm repo add grai https://charts.grai.io
helm install grai grai/grai

Component Services

  • grai-server: The backend metadata service, built on Django with Postgres as the metadata persistence layer.
  • grai-frontend: The frontend web application built on React.
  • grai-cli: Python client library for interacting with the Grai server.
  • grai-schemas: The Python library implementing Grai's metadata schema. It provides a standardized view of all Grai objects used to ensure compatibility between the server, integrations, and the client.
  • grai-graph: A Python utility library for working with the Grai metadata graph.
  • grai-actions: A library of GitHub Actions implementations to integrate Grai tests into your CI/CD pipelines.
  • integrations: A collection of integration libraries to extract metadata and persist their results to Grai.

Community Roadmap

Community feedback drives our roadmap. Please let us know what you'd like to see next by asking questions and upvoting feature requests!

Community

Email us: founders@grai.io

Join us on Slack

Check us out at www.grai.io

Sign up for our newsletter, the Grai Matters email list.