sourced-engine

Engine to use Spark on top of source code repositories.


Keywords
charts, dashboards, data-analysis, data-mining, data-platform, data-visualization, git, github, metrics, sql
License
Apache-2.0
Install
pip install sourced-engine==1.0.0

Documentation

source{d} Community Edition (CE)

source{d} Community Edition (CE) is the data platform for your software development life cycle.

Introduction

source{d} Community Edition (CE) helps you manage all your code and engineering data in one place:

  • Code Retrieval: Retrieve and store the git history of your organization's code as a dataset.
  • Analysis in/for any Language: Automatically identify languages, parse source code, and extract the pieces that matter in a language-agnostic way.
  • History Analysis: Extract information from the evolution, commits, and metadata of your codebase and from GitHub, generating detailed reports and insights.
  • Familiar APIs: Analyze your code through powerful SQL queries. Use tools you're familiar with to create reports and dashboards.

This repository contains the code of source{d} Community Edition (CE) and its project documentation, which you can also see properly rendered at docs.sourced.tech/community-edition.

Quick Start

source{d} CE supports Linux, macOS, and Windows.

To run it, you only need to:

  1. Have Docker installed on your machine.
  2. Download the sourced binary (for your OS) from our releases.
  3. Run it:
    $ sourced init orgs --token=<github_token> <github_org_name>
    Then log in at http://127.0.0.1:8088 with login: admin and password: admin.
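
If you script the setup, the steps above can be sketched in Python. This is only an illustrative helper, not part of source{d} CE: the function names `build_init_command` and `missing_prerequisites` are hypothetical, and the sketch assumes that the `docker` and `sourced` executables are expected on your PATH. The command itself is the one shown in step 3.

```python
import shlex
import shutil
from typing import List


def build_init_command(token: str, org: str) -> List[str]:
    """Build the argv for the `sourced init orgs` command shown in step 3.

    Hypothetical helper: it only assembles the arguments, it runs nothing.
    """
    if not token or not org:
        raise ValueError("a GitHub token and an organization name are required")
    return ["sourced", "init", "orgs", f"--token={token}", org]


def missing_prerequisites() -> List[str]:
    """Return the required executables that are not found on PATH.

    Assumption: both binaries are invoked by the names `docker` and `sourced`.
    """
    return [tool for tool in ("docker", "sourced") if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        # Placeholders only; substitute your own token and organization.
        command = build_init_command("<github_token>", "<github_org_name>")
        print("Would run:", shlex.join(command))
```

The helper returns an argument list rather than a single string, so it can be passed to `subprocess.run` without shell quoting issues once real values are supplied.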

For more details on each step, see the Quick Start Guide, which covers everything needed to get started with source{d} CE, from installing its dependencies to running SQL queries to inspect git repositories.

To learn more about source{d} CE, see the next steps section, which lists useful resources to guide you in using this tool.

If you have any problem running source{d} CE, take a look at our Frequently Asked Questions or Troubleshooting sections. You can also ask for help in our source{d} Forum. If you spot a bug or have a feature request, please open an issue to let us know about it.

Architecture

For more details on the architecture of this project, read docs/learn-more/architecture.md.

source{d} CE is deployed as Docker containers, using Docker Compose.

The sourced binary is a wrapper for Docker Compose that makes it easy to manage the compose files and their containers. Moreover, sourced does not require a local installation of Docker Compose: if it is not found, it will be deployed inside a container.
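
The fallback described above can be illustrated with a small Python sketch. It is not how sourced implements the check; it only shows the decision. The function name `choose_compose_strategy` and the executable name `docker-compose` are assumptions made for the example.

```python
import shutil
from typing import Callable, Optional


def choose_compose_strategy(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Decide where Docker Compose should run.

    Returns "local" when a `docker-compose` executable is found on PATH,
    otherwise "container", mirroring the fallback described in the text.
    The `which` parameter exists so the lookup can be replaced in tests.
    """
    return "local" if which("docker-compose") is not None else "container"


if __name__ == "__main__":
    print("Docker Compose strategy:", choose_compose_strategy())
```

Passing the lookup function as a parameter keeps the decision testable without depending on what is installed on the current machine.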

The main entry point of source{d} CE is sourced-ui, the web interface where you can access your data, create dashboards, run queries, and more.

The data exposed by the web interface is prepared and processed by the following services:

  • babelfish: universal code parser.
  • gitcollector: fetches the git repositories owned by your organization.
  • ghsync: fetches metadata from GitHub (users, pull requests, issues...).
  • gitbase: SQL database interface to Git repositories.

Contributing

Contributions are welcome and very much appreciated 🙌 Please refer to our Contribution Guide for more details.

Community

source{d} has an amazing community of developers and contributors who are interested in Code As Data and/or Machine Learning on Code. Please join us! 👋

Code of Conduct

All activities under source{d} projects are governed by the source{d} code of conduct.

License

GPL v3.0, see LICENSE.