gitlake

Git-Managed Distributed Data Lake Framework


License: MIT
Install: pip install gitlake==0.0.7

Documentation

GitLake

GitLake is a distributed data lake management framework built on Git. It defines a file system optimized for ETLM tasks in a data lake environment. It also provides a CLI tool, gitlake, which gives users a Git-like experience for managing and sharing raw data files and for running massively parallel compute tasks.