Analyze and manipulate EEG data using PyEEGLab


Keywords
tensorflow, keras, eeg, dataset, preprocessing, eeg-data, mne-python, eeg-analysis, eeg-classification, eeg-signals-processing
Licenses
GPL-3.0/CERN-OHL-P-2.0
Install
pip install PyEEGLab==0.10.0

PyEEGLab

Project Status: WIP. Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

Introduction

PyEEGLab is a Python package for defining EEG preprocessing pipelines for a wide range of machine learning tasks. It supports a set of datasets out of the box and allows you to adapt your preferred one.

How it Works

Here is a simple quickstart:

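from pyeeglab import *

# Pick one of the datasets supported out of the box.
dataset = TUHEEGAbnormalDataset()

# Each step is applied in order to every sample in the dataset.
preprocessing = Pipeline([
    CommonChannelSet(),                # select a common set of electrodes
    LowestFrequency(),                 # downsample to the lowest frequency
    ToDataframe(),
    MinMaxCentralizedNormalization(),  # min-max centralized normalization
    DynamicWindow(8),                  # split each sample into eight windows
    ToNumpy()
])

# Attach the pipeline, run it, and unpack the results.
dataset = dataset.set_pipeline(preprocessing).load()
data, labels = dataset['data'], dataset['labels']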

In this example, for each sample in the dataset, a common set of electrodes is selected, the signals are downsampled to the lowest frequency, and the result is normalized using a min-max centralized approach. Each sample is then split into eight windows, or frames.

This approach is quite useful for tasks like artifact classification or seizure detection.
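The normalization and windowing steps described above can be sketched in plain Python. This is only an illustration, not PyEEGLab's implementation: the helper names `min_max_centralized` and `split_into_windows` are hypothetical, and the normalization formula (subtract the mean, divide by the max-min range) is an assumed reading of "min-max centralized".

```python
from typing import List


def min_max_centralized(signal: List[float]) -> List[float]:
    """Assumed definition: subtract the mean, then divide by (max - min)."""
    mean = sum(signal) / len(signal)
    span = max(signal) - min(signal)
    if span == 0:
        return [0.0 for _ in signal]  # flat signal: nothing to scale
    return [(value - mean) / span for value in signal]


def split_into_windows(signal: List[float], n_windows: int) -> List[List[float]]:
    """Split a signal into n_windows equal-length frames, dropping any remainder."""
    frame_len = len(signal) // n_windows
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_windows)]


# Toy single-channel signal: a ramp of 20 values.
signal = [float(i) for i in range(20)]
normalized = min_max_centralized(signal)
frames = split_into_windows(normalized, 4)
print(len(frames), len(frames[0]))  # 4 5
```

With real recordings the same idea would be applied per channel, and the number of windows (eight in the quickstart) determines the frame length.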

How to Install

PyEEGLab is distributed through PyPI and can be installed with pip:

pip install PyEEGLab

If you need the bleeding-edge version, you can install it directly from GitHub:

pip install git+https://github.com/AlessioZanga/PyEEGLab@develop

Out-Of-The-Box Supported Datasets

The following datasets will work upon downloading:

| Dataset | Size (GB) | Class Distribution | Task | Notes |
| --- | --- | --- | --- | --- |
| TUH Abnormal EEG Dataset | 59.0 | 'normal': 1521, 'abnormal': 1472 | Generic abnormal EEG events vs. normal EEG traces. | This dataset does not contain any annotations. Event extraction follows other papers that used this dataset: for each record a 60s sample is extracted and labelled according to the class of the file. |
| TUH Artifact EEG Dataset | 5.5 | 'null': 1940, 'eyem': 606, 'musc': 254, 'elpp': 178, 'chew': 161, 'shiv': 60 | Multiple artifacts vs. EEG baseline. | At the moment, only the '01_tcp_ar' EEG reference setup can be used (more than ~95% of total records). |
| TUH Seizure EEG Dataset | 54.0 | 'fnsz': 4240, 'gnsz': 1717, 'cpsz': 1496, 'tnsz': 334, 'tcsz': 191, 'mysz': 6, 'absz': 2 | Generic unclassified seizure type vs. specific seizure types. | At the moment, only the '01_tcp_ar' EEG reference setup can be used (more than ~95% of total records). Also, the 'bckg' and 'scpz' classes are ignored: the former is just (a lot of) background noise, the latter has just one instance, which cannot be used with stratified cross-validation. |
| Motor Movement/Imagery EEG Dataset | 3.4 | | Motor movement / imagery events. | The size of this dataset will increase a lot during preprocessing: although its download size is fairly small, the records of this dataset are entirely annotated, meaning that the whole dataset is suitable for feature extraction, not just sparse events as in the other datasets. |
| CHB-MIT Scalp EEG Dataset | 43.0 | 'noseizure': 545, 'seizure': 184 | No seizure events vs. seizure events. | While for 'seizure' events there are (begin, end, label) records, the 'noseizure' class is computed by extracting a 60s sample from records that are flagged as 'noseizure'. |
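The note about dropping a class with a single instance follows from how stratified k-fold cross-validation works: a class needs at least k members to appear in every fold. A minimal sketch of that filtering step, using a hypothetical helper `drop_rare_classes` and toy label counts (not the real dataset sizes), could look like this:

```python
from collections import Counter
from typing import List, Sequence, Tuple


def drop_rare_classes(
    samples: Sequence, labels: Sequence[str], n_splits: int
) -> Tuple[List, List[str]]:
    """Keep only samples whose class has at least n_splits members."""
    counts = Counter(labels)
    kept = [(s, l) for s, l in zip(samples, labels) if counts[l] >= n_splits]
    return [s for s, _ in kept], [l for _, l in kept]


# Toy labels: two usable classes and one class with a single instance.
labels = ['fnsz'] * 6 + ['gnsz'] * 5 + ['scpz']
samples = list(range(len(labels)))

kept_samples, kept_labels = drop_rare_classes(samples, labels, n_splits=5)
print(sorted(set(kept_labels)))  # ['fnsz', 'gnsz']
```

The threshold is tied to the number of folds, so a class such as 'absz' with only two instances would survive 2-fold but not 5-fold stratification.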

Class Meaning - From the TUH Seizure Docs

| Class Code | Event Name | Description |
| --- | --- | --- |
| NULL | No Event | An unclassified event |
| SPSW | Spike/Sharp and Wave | Spike and wave/complexes, sharp and wave/complexes |
| GPED | Generalized Periodic Epileptiform Discharges | Diffused periodic discharges |
| PLED | Periodic Lateralized Epileptiform Discharges | Focal periodic discharges |
| EYBL | Eye Blink | A specific type of sharp, high amplitude eye movement artifact corresponding to blinks |
| ARTF | Artifacts (All) | Any non-brain activity electrical signal, such as those due to equipment or environmental factors |
| BCKG | Background | Baseline/non-interesting events |
| SEIZ | Seizure | Common seizure class which can include all types of seizure |
| FNSZ | Focal Non-Specific Seizure | Focal seizures whose type cannot be specified |
| GNSZ | Generalized Non-Specific Seizure | Generalized seizures which cannot be further classified into one of the groups below |
| SPSZ | Simple Partial Seizure | Partial seizures during consciousness; type specified by clinical signs only |
| CPSZ | Complex Partial Seizure | Partial seizures during unconsciousness; type specified by clinical signs only |
| ABSZ | Absence Seizure | Absence discharges observed on EEG; patient loses consciousness for a few seconds (Petit Mal) |
| TNSZ | Tonic Seizure | Stiffening of body during seizure (EEG effects disappear) |
| CNSZ | Clonic Seizure | Jerking/shivering of body during seizure |
| TCSZ | Tonic Clonic Seizure | At first stiffening and then jerking of body (Grand Mal) |
| ATSZ | Atonic Seizure | Sudden loss of muscle tone |
| MYSZ | Myoclonic Seizure | Myoclonic jerks of limbs |
| NESZ | Non-Epileptic Seizure | Any non-epileptic seizure observed. Contains no electrographic signs. |
| INTR | Interesting Patterns | Any unusual or interesting patterns observed that don't fit into the above classes. |
| SLOW | Slowing | A brief decrease in frequency |
| EYEM | Eye Movement Artifact | A very common frontal/prefrontal artifact seen when the eyes move |
| CHEW | Chewing Artifact | A specific artifact involving multiple channels that corresponds with patient chewing, "bursty" |
| SHIV | Shivering Artifact | A specific, sustained sharp artifact that corresponds with patient shivering. |
| MUSC | Muscle Artifact | A very common, high frequency, sharp artifact that corresponds with agitation/nervousness in a patient. |
| ELPP | Electrode Pop Artifact | A short artifact characterized by channels using the same electrode "spiking" with perfect symmetry. |
| ELST | Electrostatic Artifact | Artifact caused by movement or interference on the electrodes, variety of morphologies. |
| CALB | Calibration Artifact | Artifact caused by calibration of the electrodes. Appears as a flattening of the signal in the beginning of files. |
| HPHS | Hypnagogic Hypersynchrony | A brief period of high amplitude slow waves. |
| TRIP | Triphasic Wave | Large, three-phase waves frequently caused by an underlying metabolic condition. |
| ELEC | Electrode Artifact | Electrode pop, electrostatic artifacts, lead artifacts. |
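As a small illustration of how the class codes relate to the dataset labels, the sketch below maps the lowercase labels of the TUH Artifact EEG Dataset to event names from the table and prints each class's share. It assumes that a lowercase label such as 'eyem' corresponds to the uppercase code 'EYEM'; the `EVENT_NAMES` dictionary and `event_name` helper are hypothetical and not part of PyEEGLab.

```python
# Event names copied from the class table (artifact-related codes only).
EVENT_NAMES = {
    'NULL': 'No Event',
    'EYEM': 'Eye Movement Artifact',
    'MUSC': 'Muscle Artifact',
    'ELPP': 'Electrode Pop Artifact',
    'CHEW': 'Chewing Artifact',
    'SHIV': 'Shivering Artifact',
}


def event_name(label: str) -> str:
    """Map a dataset label such as 'eyem' to its event name (assumed mapping)."""
    return EVENT_NAMES.get(label.upper(), 'Unknown class code')


# Class distribution of the TUH Artifact EEG Dataset, as listed earlier.
artifact_counts = {
    'null': 1940, 'eyem': 606, 'musc': 254,
    'elpp': 178, 'chew': 161, 'shiv': 60,
}

total = sum(artifact_counts.values())
for label, count in artifact_counts.items():
    # Print the event name, the raw count, and the share of all samples.
    print(f'{event_name(label):<24} {count:>5} ({count / total:.1%})')
```

Printing the shares makes the class imbalance visible before choosing a sampling or weighting strategy.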

How to Get a Dataset

WARNING: Retrieving the TUH EEG datasets requires valid credentials; you can request your own at: https://www.isip.piconepress.com/projects/tuh_eeg/html/request_access.php.

Given a dataset instance, trigger the download using the "download" method, then index the newly downloaded files:

from pyeeglab import *
dataset = TUHEEGAbnormalDataset()
# Replace USER and PASSWORD with your TUH EEG credentials.
dataset.download(user='USER', password='PASSWORD')
# Index the newly downloaded files.
dataset.index()
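To avoid hardcoding credentials in a script, one option is to read them from environment variables before calling "download". The sketch below is an assumption-laden example: the variable names TUH_EEG_USER and TUH_EEG_PASSWORD and the `read_credentials` helper are hypothetical, and PyEEGLab itself does not define them.

```python
import os
from typing import Mapping, Tuple


def read_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read credentials from TUH_EEG_USER / TUH_EEG_PASSWORD (hypothetical names)."""
    user = env.get('TUH_EEG_USER')
    password = env.get('TUH_EEG_PASSWORD')
    if not user or not password:
        raise RuntimeError('Set TUH_EEG_USER and TUH_EEG_PASSWORD before downloading.')
    return user, password


# Demonstration with an explicit mapping instead of the real environment.
user, password = read_credentials(
    {'TUH_EEG_USER': 'USER', 'TUH_EEG_PASSWORD': 'PASSWORD'}
)

# With PyEEGLab installed, the values would be passed to the documented call:
# dataset.download(user=user, password=password)
```

Calling `read_credentials()` with no argument reads from the real environment and fails early with a clear message if either variable is missing.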

Note that the download mechanism works on Unix-like systems, provided the following packages are installed:

sudo apt install sshpass rsync wget
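Before starting a large download, it can help to confirm that these tools are actually on the PATH. The following is a minimal sketch using only the standard library; the `missing_tools` helper is hypothetical and not part of PyEEGLab.

```python
import shutil
from typing import Iterable, List

# Command-line tools listed in the install command above.
REQUIRED_TOOLS = ('sshpass', 'rsync', 'wget')


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print('Missing required tools: ' + ', '.join(missing))
else:
    print('All required tools are available.')
```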

Documentation

WIP: The documentation is currently a work in progress. If you need additional information, please contact me directly.

You can find the documentation at https://pyeeglab.readthedocs.io

Credits

If you use this code in your project, please use the citation below:

@misc{Zanga2019PyEEGLab,
    title={PyEEGLab: A simple tool for EEG manipulation},
    author={Alessio Zanga},
    year={2019},
    doi={10.5281/zenodo.3874461},
    url={https://dx.doi.org/10.5281/zenodo.3874461},
    howpublished={\url{https://github.com/AlessioZanga/PyEEGLab}},
}

Related publications