blackline-core
Blackline is an open-source project that makes GDPR compliance easier for developers. Connect databases, define retention terms, and execute compliance according to your org’s requirements. With Blackline you can also run audit reports and keep up with data privacy regulations. Check out our docs to get started!
Features
- Supports common databases out of the box and is flexible enough to easily add support for additional data stores.
- Is datastore-agnostic, ensuring compatibility across a wide range of data storage solutions.
- Eliminates the need to write custom queries, saving development time and effort.
- Uses configuration that is easy to define and easy to read.
- Provides a consolidated, single point of collaboration for managing data privacy compliance.
- Has been thoroughly tested for specific edge cases.
To come:
- Provides a highly flexible solution, allowing for the injection of custom SQL.
- Handles the orchestration between complex data dependencies.
Docs
Please check the blackline documentation!
Installation
Install the latest version of blackline-core:
pip install blackline-core
blackline-core only includes an adapter for SQLite databases. To include additional data stores, pip install blackline-<adapter name>. For example, the Postgres adapter is installed using:
pip install blackline-postgres
Multiple adapters can be installed. Please see the supported platforms section for more information regarding adapters.
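Because each adapter ships as its own package, it can help to check which blackline packages are present in the current environment. The sketch below uses only the standard library's `importlib.metadata`; the helper names `blackline_packages` and `installed_blackline_packages` are made up for this example and are not part of Blackline.

```python
from importlib.metadata import distributions


def blackline_packages(names):
    """Return sorted, de-duplicated names that look like blackline packages."""
    # Normalize case and underscores so "Blackline_Postgres" and
    # "blackline-postgres" are treated as the same package name.
    normalized = {n.lower().replace("_", "-") for n in names if n}
    return sorted(n for n in normalized if n.startswith("blackline-"))


def installed_blackline_packages():
    """List blackline-* distributions installed in this environment."""
    return blackline_packages(dist.metadata["Name"] for dist in distributions())


print(installed_blackline_packages())
```

After `pip install blackline-core blackline-postgres`, the printed list would be expected to contain both names.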
Quickstart
Requirements
- Python 3.9+
Simple project
- First, set up a blackline project with an example folder structure. You can change the names of these folders, but the folder layout for the adapters and the catalogue must mirror each other, using the same names.
foo@bar:~$ pip install blackline-core
foo@bar:~$ blackline init -p quickstart
Initialized blackline project at: quickstart
# apt-get install tree
foo@bar:~$ tree quickstart
quickstart/
├── adapters
│   └── organization
│       └── system
│           └── resource
│               └── dataset.yaml
├── blackline_project.yml
└── catalogue
    └── organization
        ├── organization.yaml
        └── system
            ├── resource
            │   ├── dataset
            │   │   └── dataset.yaml
            │   └── resource.yaml
            └── system.yaml
9 directories, 6 files
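If you rename folders, a quick script can confirm that the two trees still line up. This is a minimal sketch, assuming "mirror" means that every directory under `adapters` has a same-named directory at the same relative path under `catalogue`; the function names are hypothetical and Blackline itself may validate the layout differently.

```python
from pathlib import Path


def relative_dirs(root: Path) -> set:
    """All directories below root, as paths relative to root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_dir()}


def unmirrored_adapter_dirs(project: Path) -> list:
    """Adapter directories with no same-named directory in the catalogue."""
    adapters = relative_dirs(project / "adapters")
    catalogue = relative_dirs(project / "catalogue")
    # The catalogue may nest deeper (e.g. a dataset folder), so only
    # directories that exist under adapters are checked.
    return sorted(str(d) for d in adapters - catalogue)
```

Calling `unmirrored_adapter_dirs(Path("quickstart"))` on a freshly initialized project should return an empty list; any entries it does return are adapter folders whose names no longer match the catalogue.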
- Create a sample database for this quickstart
foo@bar:~$ blackline sample -p quickstart --data-only
Created sample data at: quickstart
- Define the adapter config.
profiles:
  dev:
    type: sqlite
    config:
      connection:
        database: "blackline_sample.db"
        uri: true
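One plausible reading of this profile is that the keys under `connection` are forwarded to Python's `sqlite3.connect`, which accepts both a `database` and a `uri` argument. That mapping is an assumption and is not confirmed here; `open_connection` is a hypothetical helper used only to illustrate the idea.

```python
import sqlite3

# The `connection` mapping from the profile, written as a Python dict.
connection = {"database": "blackline_sample.db", "uri": True}


def open_connection(connection: dict) -> sqlite3.Connection:
    """Open a SQLite connection from a mapping of sqlite3.connect arguments.

    Assumption: the adapter passes these keys through unchanged.
    """
    return sqlite3.connect(**connection)
```

With `uri` set to true, SQLite only treats the database string as a URI when it starts with `file:`, so a plain filename such as `blackline_sample.db` is still opened as an ordinary file.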
- Check your connection.
foo@bar:~$ blackline debug -p quickstart --profile dev
Testing connections for profile: dev
dataset: good
- Explore the sample data.
from sqlite3 import connect
conn = connect("blackline_sample.db")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table';"
).fetchall()
user = conn.execute("SELECT * FROM user")
shipment = conn.execute("SELECT * FROM shipment")
print([column[0] for column in user.description])
['id', 'name', 'email', 'ip', 'verified', 'created_at']
print(user.fetchall())
[
('00', 'Bar', 'bar@example.com', '555.444.3.2', 1, '2021-02-01 00:00:00'),
('01', 'Biz', 'biz@example.com', '555.444.3.3', 1, '2022-06-01 00:00:00'),
('02', 'Baz', 'baz@example.com', '555.444.3.4', 0, '2022-02-01 00:00:00'),
('03', 'Cat', 'cat@example.com', '555.444.3.5', 1, '2023-01-01 00:00:00'),
('04', 'Dog', 'dog@example.com', '555.444.3.6', 0, '2023-01-01 00:00:00')
]
print([column[0] for column in shipment.description])
['id', 'user_id', 'order_date', 'street', 'postcode', 'city', 'status']
print(shipment.fetchall())
[
('00', '01', '2022-06-01 00:00:00', 'Ceintuurbaan 282', '1072 GK', 'Amsterdam', 'delivered'),
('01', '02', '2022-03-01 00:00:00', 'Singel 542', '1017 AZ', 'Amsterdam', 'delivered'),
('02', '02', '2022-04-15 00:00:00', 'Singel 542', '1017 AZ', 'Amsterdam', 'delivered'),
('03', '03', '2023-01-05 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'delivered'),
('04', '03', '2023-01-06 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'returned'),
('05', '03', '2023-01-06 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'delivered')
]
- Define the catalogue.
organization:
  - key: organization_demo

system:
  - key: system_demo

resource:
  - key: resource_demo
    resource_type: Service
    privacy_declarations:
      - name: Analyze customer behaviour for improvements.
        data_categories:
          - user.contact
          - user.contact.address
        data_use: improve.system
        data_subjects:
          - customer
        data_qualifier: identified_data

dataset:
  - key: demo_db
    name: Demo Database
    description: Demo database for Blackline
    collections:
      - name: user
        description: User collection
        datetime_field:
          name: created_at
        fields:
          - name: name
            description: Name of user
            deidentifier:
              type: redact
            period: P365D
          - name: email
            deidentifier:
              type: replace
              value: fake@email.com
            period: P365D
          - name: ip
            deidentifier:
              type: mask
              value: "#"
            period: 280 00
      - name: shipment
        datetime_field:
          name: order_date
        fields:
          - name: street
            deidentifier:
              type: redact
            period: P185D
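To reason about which rows a given `period` will touch, it can help to compute the cutoff by hand. The sketch below assumes that a field is de-identified when the row's `datetime_field` value is older than the run's start date minus the period; that rule is inferred from the sample data in this quickstart rather than taken from Blackline's source, and the strict `<` comparison at the boundary is a guess. `parse_days` and `is_due` are hypothetical helpers, and only day-based ISO 8601 durations such as `P365D` are handled.

```python
import re
from datetime import datetime, timedelta


def parse_days(period: str) -> timedelta:
    """Parse a day-only ISO 8601 duration such as 'P365D'."""
    match = re.fullmatch(r"P(\d+)D", period)
    if match is None:
        raise ValueError(f"unsupported period: {period!r}")
    return timedelta(days=int(match.group(1)))


def is_due(value: str, period: str, start_date: datetime) -> bool:
    """True if the row's timestamp is older than start_date minus period."""
    cutoff = start_date - parse_days(period)
    return datetime.fromisoformat(value) < cutoff


start = datetime(2023, 1, 1)
created_at = {"00": "2021-02-01 00:00:00", "01": "2022-06-01 00:00:00"}
for user_id, value in created_at.items():
    print(user_id, is_due(value, "P365D", start))
```

With a start date of 2023-01-01, `P365D` gives a cutoff of 2022-01-01, so of the sample users only user 00 (created 2021-02-01) qualifies for the `name` and `email` rules.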
- Run de-identification from the root of the blackline project.
foo@bar:~$ cd quickstart
foo@bar:~$ blackline run --profile dev --start-date 2023-01-01
Running project: /quickstart
Running profile: dev
Running start date: 2023-01-01 00:00:00
Finished project: /quickstart
- Explore the de-identified data.
from sqlite3 import connect
conn = connect("blackline_sample.db")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table';"
).fetchall()
user = conn.execute("SELECT * FROM user")
shipment = conn.execute("SELECT * FROM shipment")
print([column[0] for column in user.description])
['id', 'name', 'email', 'ip', 'verified', 'created_at']
print(user.fetchall())
[
('00', None, 'fake@email.com', '###.###.#.#', 1, '2021-02-01 00:00:00'),
('01', 'Biz', 'biz@example.com', '555.444.3.3', 1, '2022-06-01 00:00:00'),
('02', 'Baz', 'baz@example.com', '###.###.#.#', 0, '2022-02-01 00:00:00'),
('03', 'Cat', 'cat@example.com', '555.444.3.5', 1, '2023-01-01 00:00:00'),
('04', 'Dog', 'dog@example.com', '555.444.3.6', 0, '2023-01-01 00:00:00')
]
print([column[0] for column in shipment.description])
['id', 'user_id', 'order_date', 'street', 'postcode', 'city', 'status']
print(shipment.fetchall())
[
('00', '01', '2022-06-01 00:00:00', None, '1072 GK', 'Amsterdam', 'delivered'),
('01', '02', '2022-03-01 00:00:00', None, '1017 AZ', 'Amsterdam', 'delivered'),
('02', '02', '2022-04-15 00:00:00', None, '1017 AZ', 'Amsterdam', 'delivered'),
('03', '03', '2023-01-05 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'delivered'),
('04', '03', '2023-01-06 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'returned'),
('05', '03', '2023-01-06 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'delivered')
]
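The three deidentifier types used in this quickstart can be pictured as small pure functions. This is only an illustration that reproduces the values seen in the output above, not Blackline's implementation. In particular, the sample shows `mask` replacing digits while keeping the dots; replacing letters as well is an assumption.

```python
import re


def redact(value):
    """Drop the value entirely; the column ends up NULL (None in Python)."""
    return None


def replace(value, new_value):
    """Swap the value for a fixed replacement."""
    return new_value


def mask(value, mask_char="#"):
    """Replace every letter or digit with mask_char, keeping punctuation."""
    return re.sub(r"[A-Za-z0-9]", mask_char, value)


print(redact("Bar"))
print(replace("bar@example.com", "fake@email.com"))
print(mask("555.444.3.2"))
```

Applied to user 00, these give `None`, `fake@email.com`, and `###.###.#.#`, matching the first row of the de-identified `user` table.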
Contributing
This project is new and could use your help. Please open an issue or make a feature request. If you would like to contribute code, fork blackline-core, commit your changes, and open a pull request.
Code of Conduct
This is a Python project, so we follow the PSF Code of Conduct. In general, be a decent and polite human.