Data Alfred is a tool that streamlines the setup of new data projects by creating the essential folder and file structure in seconds. Aimed at data teams and data analysis projects, it establishes a standardized foundation so that data scientists and analysts can focus on what truly matters: extracting valuable insights from data. Inspired by https://drivendata.github.io/cookiecutter-data-science/
- Creates directories for raw, preprocessed, and mischaracterized data.
- Initializes directories for documentation, machine learning models, Jupyter notebooks, frontend for interactive visualizations, and references.
- Prepares a source code directory structure for visualization, data manipulation, feature engineering, models, and testing.
- Generates initial files, including `.env` for environment variables, `.gitignore`, `README.md`, `requirements.txt` for dependencies, `setup.py` for package installation, `test_environment.py` for testing, and a `Dockerfile` for containerization.
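To make the feature list concrete, here is a minimal sketch of how a scaffold like this could be built with the standard library. It is not Data Alfred's actual implementation: the function name `scaffold_project`, the `DIRECTORIES` and `FILES` constants, and the choice to place the generated files at the project root are all assumptions made for illustration. The directory and file names are transcribed from this README.

```python
from pathlib import Path

# Hypothetical layout transcribed from this README, not read from the package.
DIRECTORIES = [
    "data/raw", "data/preprocessed", "data/mischaracterized",
    "docs", "models", "notebooks", "frontend", "references",
    "src/visualization", "src/data", "src/features", "src/models", "src/tests",
]
# Assumption: the initial files live at the project root.
FILES = [
    ".env", ".gitignore", "README.md", "requirements.txt",
    "setup.py", "test_environment.py", "Dockerfile",
]


def scaffold_project(root):
    """Create the directories and empty files under `root`; return the paths."""
    root = Path(root)
    created = []
    for rel in DIRECTORIES:
        path = root / rel
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    for rel in FILES:
        path = root / rel
        # touch() creates an empty file and leaves an existing one untouched.
        path.touch(exist_ok=True)
        created.append(path)
    return created
```

Because every call uses `exist_ok=True`, running the sketch twice on the same folder does not raise and does not overwrite existing content.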
To use Data Alfred, you will need to have Python installed on your system. The tool has been developed and tested in environments supporting Python 3.6 or newer.
pip install data-alfred
import data_alfred
data_alfred.create_project_structure()
- Data Alfred will take care of the rest, creating the necessary directory and file structure for your data project.
- `data/`: Subdivided into `preprocessed`, `raw`, and `mischaracterized` for different stages of data handling.
- `docs/`: Contains `mkdocs.md` and `config.yml` for project documentation.
- `models/`: Intended to store trained machine learning models.
- `notebooks/`: For Jupyter notebooks of data analysis and exploration.
- `frontend/`: Includes `__init__.py` and `streamlit_app.py` for development of data visualization applications.
- `references/`: To store project references and resources.
- `reports/`: Intended for data analysis reports and visualizations.
- `src/`: Contains subdirectories for `visualization`, `data`, `features`, `models`, and `tests`, along with an `__init__.py` to treat the contents as a Python package.
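After generating a project, it can be useful to confirm that the layout described above is actually in place. The following is a small sketch of such a check. The helper `missing_paths` and the `EXPECTED` list are hypothetical names that are not part of the `data_alfred` package, and the list is simply transcribed from the structure described above.

```python
from pathlib import Path

# Hypothetical checklist transcribed from the structure described above.
EXPECTED = [
    "data/preprocessed", "data/raw", "data/mischaracterized",
    "docs/mkdocs.md", "docs/config.yml",
    "models", "notebooks",
    "frontend/__init__.py", "frontend/streamlit_app.py",
    "references", "reports",
    "src/__init__.py", "src/visualization", "src/data",
    "src/features", "src/models", "src/tests",
]


def missing_paths(root="."):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing from the project layout:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("Project layout matches the expected structure.")
```

Run from the project root, the script lists anything absent, which helps spot a partially created or manually altered layout.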
Contributions to Data Alfred are welcome! If you have a suggestion to improve this tool, feel free to open an issue or pull request on the project repository. Let's work together to make starting data projects a quick and effortless task!
Distributed under the MIT License. See `LICENSE` for more information.
Alestan Alves - https://github.com/alestanalves
Project Link: https://github.com/TOTVS-Privacidade-de-Dados/data-alfred