BigQuery-DatasetManager

BigQuery-DatasetManager is a simple file-based CLI management tool for BigQuery Datasets.

Requirements

  • Python
    • CPython 2.7, 3.4, 3.5, 3.6

Installation

$ pip install BigQuery-DatasetManager

Resource representation

The resource representations of datasets and tables are described in YAML format.

Dataset

dataset_id: dataset1
friendly_name: null
description: null
default_table_expiration_ms: null
location: US
access_entries:
-   role: OWNER
    entity_type: specialGroup
    entity_id: projectOwners
-   role: WRITER
    entity_type: specialGroup
    entity_id: projectWriters
-   role: READER
    entity_type: specialGroup
    entity_id: projectReaders
-   role: OWNER
    entity_type: userByEmail
    entity_id: aaa@bbb.gserviceaccount.com
-   role: null
    entity_type: view
    entity_id:
        datasetId: view1
        projectId: project1
        tableId: table1
labels:
    foo: bar
Key name                     Value    Description
dataset_id                   str      ID of the dataset.
friendly_name                str      Title of the dataset.
description                  str      Description of the dataset.
default_table_expiration_ms  int      Default expiration time for tables in the dataset.
location                     str      Location in which the dataset is hosted.
access_entries               seq      Represents the grant of an access role to an entity.
access_entries.role          str      Role granted to the entity. One of OWNER, WRITER or
                                      READER. May be null if the entity_type is view.
access_entries.entity_type   str      Type of entity being granted the role. One of
                                      userByEmail, groupByEmail, domain, specialGroup or
                                      view.
access_entries.entity_id     str/map  If the entity_type is not view, the str ID of the
                                      entity being granted the role. If the entity_type is
                                      view, a map representing the view in another dataset
                                      to grant access to.
entity_id.datasetId          str      ID of the dataset containing the view. (Specify when
                                      entity_type is view.)
entity_id.projectId          str      ID of the project containing the view. (Specify when
                                      entity_type is view.)
entity_id.tableId            str      ID of the view table. (Specify when entity_type is
                                      view.)
labels                       map      Labels for the dataset.

NOTE: See the official documentation of BigQuery Datasets for details of key names.

Table

table_id: table1
friendly_name: null
description: null
expires: null
partitioning_type: null
view_use_legacy_sql: null
view_query: null
schema:
-   name: column1
    field_type: STRING
    mode: REQUIRED
    description: null
    fields: null
-   name: column2
    field_type: RECORD
    mode: NULLABLE
    description: null
    fields:
    -   name: column2_1
        field_type: STRING
        mode: NULLABLE
        description: null
        fields: null
    -   name: column2_2
        field_type: INTEGER
        mode: NULLABLE
        description: null
        fields: null
    -   name: column2_3
        field_type: RECORD
        mode: REPEATED
        description: null
        fields:
        -   name: column2_3_1
            field_type: BOOLEAN
            mode: NULLABLE
            description: null
            fields: null
labels:
    foo: bar

The following example describes a view:

table_id: view1
friendly_name: null
description: null
expires: null
partitioning_type: null
view_use_legacy_sql: false
view_query: |
    select
    *
    from
    `project1.dataset1.table1`
schema: null
labels: null
Key name             Value  Description
table_id             str    ID of the table.
friendly_name        str    Title of the table.
description          str    Description of the table.
expires              str    Datetime at which the table will be deleted, in ISO 8601
                            format (%Y-%m-%dT%H:%M:%S.%f%z).
partitioning_type    str    Time partitioning of the table if it is partitioned. The
                            only partitioning type currently supported is DAY.
view_use_legacy_sql  bool   Specifies whether to use BigQuery's legacy SQL for this
                            view.
view_query           str    SQL query defining the table as a view.
schema               seq    The schema of the table.
schema.name          str    The name of the field.
schema.field_type    str    The type of the field. One of STRING, BYTES, INTEGER,
                            INT64 (same as INTEGER), FLOAT, FLOAT64 (same as FLOAT),
                            BOOLEAN, BOOL (same as BOOLEAN), TIMESTAMP, DATE, TIME,
                            DATETIME, RECORD (indicates that the field contains a
                            nested schema) or STRUCT (same as RECORD).
schema.mode          str    The mode of the field. One of NULLABLE, REQUIRED or
                            REPEATED.
schema.description   str    Description for the field.
schema.fields        seq    Describes the nested schema fields if the field_type is
                            set to RECORD.
labels               map    Labels for the table.

NOTE: See the official documentation of BigQuery Tables for details of key names.

Directory structure

.
├── dataset1        # Directory storing the table configuration files of dataset1.
│   ├── table1.yml  # Configuration file of table1 in dataset1.
│   └── table2.yml  # Configuration file of table2 in dataset1.
├── dataset1.yml    # Configuration file of dataset1.
├── dataset2        # Directory storing the table configuration files of dataset2.
│   └── .gitkeep    # Keeps the empty dataset2 directory under version control.
├── dataset2.yml    # Configuration file of dataset2.
└── dataset3.yml    # Configuration file of dataset3. Tables in this dataset are not managed.

NOTE: If you do not want to manage the tables of a dataset, delete the directory that has the same name as the dataset.
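
A typical way to bootstrap this layout is to export the existing datasets and manage the generated files from there (a minimal sketch; the `./conf` directory and the migration mode are illustrative):

$ bqdm export ./conf                  # generate dataset1.yml, dataset1/table1.yml, ...
$ vi ./conf/dataset1.yml              # edit the configuration
$ bqdm plan ./conf                    # preview the changes
$ bqdm apply -m select_insert ./conf  # apply them (see Usage below)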

Usage

Usage: bqdm [OPTIONS] COMMAND [ARGS]...

Options:
  -c, --credential-file PATH  Location of the credential file for service accounts.
  -p, --project TEXT          ID of the project you would like to manage.
  --color / --no-color        Enable or disable colored output.
  --parallelism INTEGER       Limit the number of concurrent operations.
  --debug                     Enable debug output.
  -h, --help                  Show this message and exit.

Commands:
  apply    Builds or changes datasets.
  destroy  Specify subcommand `plan` or `apply`
  export   Export existing datasets into file in YAML format.
  plan     Generate and show an execution plan.
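
For example, global options precede the subcommand (project ID and directory name are illustrative):

$ bqdm -p project1 --no-color plan ./conf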

Export

Usage: bqdm export [OPTIONS] [OUTPUT_DIR]

  Export existing datasets into file in YAML format.

Options:
  -d, --dataset TEXT          Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from management.
  -h, --help                  Show this message and exit.
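
For example, to export only dataset1 into the ./conf directory (names are illustrative):

$ bqdm export -d dataset1 ./conf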

Plan

Usage: bqdm plan [OPTIONS] [CONF_DIR]

  Generate and show an execution plan.

Options:
  --detailed-exitcode         Return a detailed exit code when the command exits.
                              When provided, this argument changes the exit codes
                              and their meanings to provide more granular
                              information about what the resulting plan contains:
                              0 = Succeeded with empty diff
                              1 = Error
                              2 = Succeeded with non-empty diff
  -d, --dataset TEXT          Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from management.
  -h, --help                  Show this message and exit.
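
For example, the detailed exit code can drive a CI check (directory name is illustrative):

$ bqdm plan --detailed-exitcode ./conf
$ echo $?  # 2 = the plan contains changes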

Apply

Usage: bqdm apply [OPTIONS] [CONF_DIR]

  Builds or changes datasets.

Options:
  -d, --dataset TEXT              Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT      Specify the ID of the dataset to exclude from management.
  -m, --mode [select_insert|select_insert_backup|replace|replace_backup|drop_create|drop_create_backup]
                                  Specify the migration mode when changing the schema.
                                  Choose from `select_insert`, `select_insert_backup`,
                                  `replace`, `replace_backup`, `drop_create`,
                                  `drop_create_backup`.  [required]
  -b, --backup-dataset TEXT       Specify the ID of the dataset in which to store backups
                                  created during migration.
  -h, --help                      Show this message and exit.

NOTE: See the Migration mode section below for details of each mode.
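
For example, to apply changes while backing up migrated tables (dataset and directory names are illustrative):

$ bqdm apply -m select_insert_backup -b backup_dataset ./conf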

Destroy

Usage: bqdm destroy [OPTIONS] COMMAND [ARGS]...

  Specify subcommand `plan` or `apply`

Options:
  -h, --help  Show this message and exit.

Commands:
  apply  Destroy managed datasets.
  plan   Generate and show an execution plan for...

Destroy plan

Usage: bqdm destroy plan [OPTIONS] [CONF_DIR]

  Generate and show an execution plan for datasets destruction.

Options:
  --detailed-exitcode         Return a detailed exit code when the command exits.
                              When provided, this argument changes the exit codes
                              and their meanings to provide more granular
                              information about what the resulting plan contains:
                              0 = Succeeded with empty diff
                              1 = Error
                              2 = Succeeded with non-empty diff
  -d, --dataset TEXT          Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from management.
  -h, --help                  Show this message and exit.

Destroy apply

Usage: bqdm destroy apply [OPTIONS] [CONF_DIR]

  Destroy managed datasets.

Options:
  -d, --dataset TEXT          Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from management.
  -h, --help                  Show this message and exit.
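
For example, preview the destruction first, then execute it (directory name is illustrative):

$ bqdm destroy plan ./conf
$ bqdm destroy apply ./conf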

Migration mode

select_insert

  1. TODO

LIMITATIONS: TODO

select_insert_backup

  1. TODO

LIMITATIONS: TODO

replace

  1. TODO

LIMITATIONS: TODO

replace_backup

  1. TODO

LIMITATIONS: TODO

drop_create

  1. TODO

drop_create_backup

  1. TODO

Authentication

See the authentication section in the official google-cloud-python documentation.

If you're running in Compute Engine or App Engine, authentication should "just work".

If you're developing locally, the easiest way to authenticate is using the Google Cloud SDK:

$ gcloud auth application-default login

Note that this command generates credentials for client libraries. To authenticate the CLI itself, use:

$ gcloud auth login

Previously, gcloud auth login was used for both use cases. If your gcloud installation does not support the new command, please update it:

$ gcloud components update

If you're running your application elsewhere, you should download a service account JSON keyfile and point to it using an environment variable:

$ export GOOGLE_APPLICATION_CREDENTIALS="/path/to/keyfile.json"
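
The keyfile can also be passed directly to the CLI via its top-level options (paths and IDs are illustrative):

$ bqdm -c /path/to/keyfile.json -p YOUR_PROJECT_ID plan ./conf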

Testing

Tests depend on the following environment variables:

$ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
$ export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID

Run tests

$ pip install pipenv
$ pipenv install --dev
$ pipenv run pytest

Run tests against multiple Python versions

$ pip install pipenv
$ pipenv install --dev
$ pyenv local 3.6.5 3.5.5 3.4.8 2.7.14
$ pipenv run tox

TODO

  1. Support encryption configuration for tables
  2. Support external data configuration for tables
  3. Schema replication
  4. Integration tests