submitTaRGET

Bulk upload of data to the TaRGET DCC


License
MIT
Install
pip install submitTaRGET==1.0.1

Documentation

TargetBulkUpload

Bulk upload of metadata to TaRGET II DCC

Install python3

Prepare library (if necessary)

pip3 install xlrd

Description

This script can be used to upload metadata to the TaRGET II DCC metadata database from a specific Excel template. You can use it to:

  1. Upload new metadata to the database;
  2. Update existing records in the database;
  3. Establish relationships between metadata records.

How to use it

Download the script along with the latest Excel template

git clone https://github.com/xzhuo/TargetBulkUpload.git

Or you can click the "clone or download" button.

Obtain your personal API key

If you want to upload new data to the metadata database

  1. Fill in the Excel template accordingly. You must use the template in the repo. Don't rename the template; if you have to rename it, keep the version number intact in the name.
  • Leave the "system accession" column blank for the records you want to upload. If a system accession is found in a row, the data in the row will not be uploaded.
  • To avoid duplicate records in the metadata database, we now require users to provide a unique "user accession" for each record. You can download all of your submitted "user accessions" from our website. If a "user accession" in the Excel file already exists in the metadata database, a warning will be printed and that row will be skipped during upload. A "user accession" can also be used to establish relationships with other records in the same Excel file. Please fill in "user accession" according to our accession rules (see the "Instructions" tab and the individual tab headers).
  • Every date in the Excel file can be either an Excel date value or a string in the format "YYYY-MM-DD". Don't worry if Excel changes the date format automatically (it means Excel recognizes the value as a date).
  • The relationship columns are on the right side of each sheet and are labeled with a different color. Each value should be either a "user accession" in the same Excel file or an existing "system accession" in the database.
    • For example, if you have a mouse record in the Excel file with a system accession "TRGTMSE0001" and a user accession "USRMSE0001", the mouse record itself will not be uploaded. However, if you want to submit a biosample extracted from that mouse, you can use either "TRGTMSE0001" or "USRMSE0001" to link the biosample record to the mouse.
  2. Run the script with the following command to verify the Excel file (test run):
python3 submission.py -k <API key> -x <excel file>
  3. If the test run finishes without errors, you can upload the same Excel file to the production database with the following command. If the test run produced any unexpected warning or error, do not run this command; contact us instead:
python3 submission.py -k <API key> -x <excel file> --notest
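
Before the test run, it can help to check locally which rows the rules above would upload and which they would skip. The following is a minimal sketch of those rules, not the code in submission.py. The function names `normalize_date` and `classify_rows`, the representation of a row as a dict keyed by column name, and the set of already-submitted user accessions are all assumptions made for illustration. How the real script treats a row with no user accession is not stated here, so skipping it is also an assumption.

```python
from datetime import date, datetime, timedelta

# Day zero of Excel's 1900 date system (correct for dates after February 1900).
EXCEL_EPOCH = datetime(1899, 12, 30)


def normalize_date(value):
    """Return a date cell as a 'YYYY-MM-DD' string (hypothetical helper)."""
    if isinstance(value, datetime):          # check datetime before date
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, (int, float)):      # raw Excel serial number
        return (EXCEL_EPOCH + timedelta(days=value)).date().isoformat()
    # Anything else must already be a 'YYYY-MM-DD' string; ValueError otherwise.
    return datetime.strptime(str(value).strip(), "%Y-%m-%d").date().isoformat()


def classify_rows(rows, existing_user_accessions):
    """Split rows into (to_upload, skipped) following the upload rules above."""
    to_upload, skipped = [], []
    for row in rows:
        system_acc = (row.get("system accession") or "").strip()
        user_acc = (row.get("user accession") or "").strip()
        if system_acc:
            # Rows that already carry a system accession are not uploaded.
            skipped.append((user_acc, "row already has a system accession"))
        elif not user_acc:
            # Assumption: a row without the required user accession is skipped.
            skipped.append((user_acc, "missing user accession"))
        elif user_acc in existing_user_accessions:
            skipped.append((user_acc, "user accession already in the database"))
        else:
            to_upload.append(row)
    return to_upload, skipped
```

For example, calling `classify_rows(rows, {"USRMSE0003"})` on three mouse rows, one with a system accession, one new, and one whose user accession is "USRMSE0003", would return only the new row in `to_upload`.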

If you want to update existing records in the metadata database

  1. Fill in the Excel template accordingly. You must use the template in the repo. Don't rename the template; if you have to rename it, keep the version number intact in the name.
  • To update data in the database, fill in all the applicable columns. You can use either the "system accession" generated by our database or your "user accession" to indicate which record you want to update. You can also use both accessions in a row as long as they point to the same record in the metadata database.
  • Every date in the Excel file can be either an Excel date value or a string in the format "YYYY-MM-DD". Don't worry if Excel changes the date format automatically (it means Excel recognizes the value as a date).
  • The relationship columns are labeled with a different color on the right side in each sheet. Both "system accession" and "user accession" can be used to establish relationships.
  2. Run the script with the following command to update existing records in the metadata database (note that no verification is performed during an update):
python3 submission.py -k <API key> -x <excel file> --update
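
Because updates run without verification, you may want to confirm beforehand that each row identifies exactly one record. The sketch below illustrates the rule that a row may carry a system accession, a user accession, or both, provided both point to the same record. It is an illustration under stated assumptions, not the logic of submission.py: `resolve_update_target` is a hypothetical name, and `system_by_user` is an assumed mapping from your submitted user accessions to their system accessions (for example, built from the list you download from the website).

```python
def resolve_update_target(row, system_by_user):
    """Return the system accession a row refers to, or raise ValueError.

    row            -- dict keyed by column name (assumed representation)
    system_by_user -- dict mapping user accession -> system accession (assumed)
    """
    system_acc = (row.get("system accession") or "").strip()
    user_acc = (row.get("user accession") or "").strip()

    if not system_acc and not user_acc:
        raise ValueError("row has neither a system nor a user accession")

    if user_acc:
        mapped = system_by_user.get(user_acc)
        if mapped is None:
            raise ValueError(f"unknown user accession: {user_acc}")
        # If both accessions are given, they must point to the same record.
        if system_acc and mapped != system_acc:
            raise ValueError(
                f"{user_acc} maps to {mapped}, but the row says {system_acc}"
            )
        return mapped

    # Only a system accession was given; use it directly.
    return system_acc
```

Running this over every row before using --update surfaces mismatched accession pairs as errors instead of letting them reach the database.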