AutoDoc
This project was automatically generated using the LINCC-Frameworks python-project-template. For more information about the project template see the documentation.
Dev Guide - Getting Started
Before installing any dependencies or writing code, it's a great idea to create a virtual environment. We recommend using conda to manage virtual environments. If you have conda installed locally, you can run the following to create and activate a new environment.
>> conda create -n <env_name> python=3.8
>> conda activate <env_name>
Once you have created a new environment, you can install this project for local development using the following commands:
>> pip install -e .'[dev,pipelines]'
>> pre-commit install
>> conda install pandoc
Notes:
- The single quotes around '[dev,pipelines]' may not be required for your operating system.
- Look at pyproject.toml for other optional dependencies, e.g. you can do pip install -e ."[dev,pipelines,cuda]" if you want to use CUDA.
- pre-commit install will initialize pre-commit for this local repository, so that a set of tests will be run prior to completing a local commit. For more information, see the Python Project Template documentation on pre-commit.
- Installing pandoc allows you to verify that automatic rendering of Jupyter notebooks into documentation for ReadTheDocs works as expected. For more information, see the Python Project Template documentation on Sphinx and Python Notebooks.
Running AzureML pipelines
This repo contains the evaluation and training pipelines for AutoDoc.
Prerequisites
Add the ML extension:
az extension add --name ml
Configure the CLI:
az login
az account set --subscription "<your subscription name>"
az configure --defaults workspace=<aml workspace> group=<resource group> location=<location, e.g. westus3>
Running jobs
Prediction
az ml job create -f azureml/eval.yml --set display_name="Test prediction job" --set environment_variables.HF_TOKEN=<your huggingface token> --web
Notes:
- --name sets the MLflow run ID.
- display_name (passed with --set in the command above) becomes the name in the experiment dashboard.
- The --web argument opens a browser window for tracking the job.
- The HF_TOKEN is required for gated repos, which need authentication.
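If you submit jobs from a script rather than by hand, the command above can be assembled as an argument list. The following is a minimal sketch, not part of this repo: build_job_command is a hypothetical helper, and it uses only the flags already shown above (-f, --set, --web).

```python
import shlex


def build_job_command(display_name, hf_token, job_file="azureml/eval.yml", open_web=True):
    """Return the `az ml job create` invocation as an argument list.

    Hypothetical helper: it only mirrors the flags shown in this README.
    """
    cmd = [
        "az", "ml", "job", "create",
        "-f", job_file,
        # display_name becomes the name shown in the experiment dashboard
        "--set", f"display_name={display_name}",
        # HF_TOKEN is needed for gated Hugging Face repos
        "--set", f"environment_variables.HF_TOKEN={hf_token}",
    ]
    if open_web:
        # open the job tracking page in a browser
        cmd.append("--web")
    return cmd


if __name__ == "__main__":
    cmd = build_job_command("Test prediction job", "<your huggingface token>")
    # Print the command for inspection; nothing is submitted here.
    print(shlex.join(cmd))
```

Passing the returned list to subprocess.run(cmd, check=True) would submit the job. Building a list instead of a single string avoids quoting problems with display names that contain spaces.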
Uploading data
Example:
az storage blob upload --account-name <account> --container-name <container> --file data/data.jsonl -n data/sweetpea/data.jsonl
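Before uploading, it can help to confirm that the file parses cleanly. The sketch below assumes data/data.jsonl is in JSON Lines format (one JSON value per line); this README does not specify the schema, and validate_jsonl is a hypothetical helper, not part of this repo.

```python
import json
from pathlib import Path


def validate_jsonl(path):
    """Return the number of records; raise ValueError on the first invalid line."""
    count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines (an assumption, adjust if needed)
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON ({exc.msg})") from exc
            count += 1
    return count
```

For example, validate_jsonl("data/data.jsonl") returns the record count if every non-blank line is valid JSON, and otherwise reports the first offending line number.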