BUILDING A SENSOR ANALYTICS PIPELINE

The increasing availability of sensors and sensor data creates a need for a robust analytics pipeline that can be used across a variety of applications. Using signal processing, machine learning, and data visualization techniques, we've built a pipeline that processes and classifies labeled sensor data.

Viewed as a black box, our pipeline can take in any kind of sensor data and produce classification outputs. This summer, we've focused mainly on electroencephalogram (EEG) data, or recordings of the brain's electrical activity. To collect EEG data, we used the Muse headband, a consumer-grade brain-sensing wearable.

What does our pipeline look like?
1. First, we collect our data using the Muse; each recording is saved to Dropbox as a .csv file.

2. Then, we clean our data using a variety of artifact removal and noise reduction techniques.

3. Our cleaned data undergoes a variety of transformations, extracting features for our machine learning models in the process.

4. We train and test these models to output a cognitive state classification.

5. After getting results, we analyze model performance, going back and changing the pipeline as necessary.

6. Once our models perform well enough, we can use the pipeline to control some sort of output device, such as a light.
Here's some brain data.
We collected data while thinking about happy memories, and then while computing the Fibonacci sequence. Can we use machine learning to distinguish between the two?

A portion of the raw data is shown on the right. The Muse headband consists of four electrodes (labeled TP9, AF7, AF8, and TP10) that line up along the scalp and record voltage fluctuations.
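
If you'd like to follow along in code, here's a minimal loading sketch with pandas. The filename and the assumption that the .csv holds one voltage column per electrode are illustrative, not our exact recording format.

```python
import pandas as pd

# Hypothetical filename; we assume the recording has one voltage
# column per electrode, named after the electrode positions.
ELECTRODES = ["TP9", "AF7", "AF8", "TP10"]

raw = pd.read_csv("muse_recording.csv")
print(raw[ELECTRODES].describe())  # quick sanity check of each channel
```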

First, let's clean our data.
The Muse isn't perfect; it has a low signal-to-noise ratio and picks up artifacts (like blinking) that distort the pure EEG signal. We use a variety of artifact removal and noise reduction techniques to diminish the influence of eye movements, jaw clenches, and environmental noise in the data. After the data is cleaned, the EEG signal looks much more prominent!
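
As one hedged sketch of a common noise-reduction step, here's a zero-phase band-pass filter built with SciPy. We assume the Muse's nominal 256 Hz sampling rate and a 1-40 Hz passband; our actual cleaning combines several techniques beyond this.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 256.0  # assumed sampling rate in Hz

def bandpass(signal, low=1.0, high=40.0, fs=FS, order=4):
    """Suppress slow drift and high-frequency noise outside the EEG band."""
    nyquist = fs / 2.0
    b, a = butter(order, [low / nyquist, high / nyquist], btype="band")
    return filtfilt(b, a, signal)  # forward-backward pass avoids phase distortion

# One cleaned 1-D signal per electrode, continuing from the loading sketch.
cleaned = {ch: bandpass(raw[ch].to_numpy()) for ch in ELECTRODES}
```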
To create discrete samples, we use time-series segmentation.
To create inputs for our machine learning models, we discretize the continuous recording into fixed-length time segments. Each segment becomes a single input to our models. Here, we've divided the data into 1-second segments.
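
Continuing the sketch, segmentation is just a reshape once we fix the number of samples per segment (256 samples for a 1-second segment at our assumed 256 Hz rate):

```python
import numpy as np

SEGMENT_SECONDS = 1.0
SAMPLES_PER_SEGMENT = int(FS * SEGMENT_SECONDS)  # 256 samples per segment

def segment(signal, length=SAMPLES_PER_SEGMENT):
    """Split a 1-D signal into non-overlapping fixed-length segments."""
    n_segments = len(signal) // length
    return np.reshape(signal[: n_segments * length], (n_segments, length))

# One (n_segments, 256) array per electrode.
segments = {ch: segment(cleaned[ch]) for ch in ELECTRODES}
```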
Then, we extract several features from each discrete time segment.
Let's represent each of these time segments as points, shown on the right. From each point, we extract a variety of features, such as the mean and variance for each electrode, as well as inter-electrode correlation. We end up with 105 features in total, which is a lot! Not only will this slow down computation, but it might lead to overfitting and poorer classification results. Let's reduce the number of features to two using principal component analysis (PCA).
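
Here's a simplified sketch of these two steps, computing only the per-electrode means and variances plus the pairwise inter-electrode correlations (a subset of our 105 features) before reducing to two components with scikit-learn:

```python
import numpy as np
from sklearn.decomposition import PCA

def extract_features(segments_by_channel):
    """Build one feature row per segment: mean and variance for each
    electrode, plus the correlation of every pair of electrodes."""
    channels = list(segments_by_channel)
    n_segments = segments_by_channel[channels[0]].shape[0]
    rows = []
    for i in range(n_segments):
        feats = []
        for ch in channels:
            feats += [segments_by_channel[ch][i].mean(),
                      segments_by_channel[ch][i].var()]
        for a in range(len(channels)):
            for b in range(a + 1, len(channels)):
                feats.append(np.corrcoef(segments_by_channel[channels[a]][i],
                                         segments_by_channel[channels[b]][i])[0, 1])
        rows.append(feats)
    return np.array(rows)

X = extract_features(segments)
X_2d = PCA(n_components=2).fit_transform(X)  # project onto two components
```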
We can plot each point by these two features to visualize any differences between the classes.
With PCA, we have a two-dimensional representation of the features in the data. Here, the blue points are "thinking about happy memories" while the red points are "computing the Fibonacci sequence". Our machine learning models will attempt to distinguish between these two. Ideally, we'd like to see the points of each class form clearly separated clusters.
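
A minimal plotting sketch, assuming y is a hypothetical label array with one entry per segment (0 = happy memories, 1 = mental math):

```python
import matplotlib.pyplot as plt

# y is a hypothetical NumPy label array aligned with the rows of X_2d.
plt.scatter(X_2d[y == 0, 0], X_2d[y == 0, 1], c="blue", label="happy memories")
plt.scatter(X_2d[y == 1, 0], X_2d[y == 1, 1], c="red", label="mental math")
plt.xlabel("First principal component")
plt.ylabel("Second principal component")
plt.legend()
plt.show()
```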
Here are all of our discrete time segments.
To make a classification, we use an ensemble of machine learning models. That is, we've constructed a set of binary classifiers (a random forest, a logistic regression, and a neural network) that each predict a data point's class. Each prediction then casts a (weighted) vote toward the final classification: happy memories, or mental math.
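
A sketch of such a weighted-vote ensemble using scikit-learn's VotingClassifier; the hyperparameters and vote weights here are illustrative rather than our tuned values:

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# Soft voting averages each model's class probabilities, weighted per model.
ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier()),
                ("lr", LogisticRegression(max_iter=1000)),
                ("nn", MLPClassifier(max_iter=1000))],
    voting="soft",
    weights=[2, 1, 2],  # illustrative weights
)
ensemble.fit(X_train, y_train)
y_pred = ensemble.predict(X_test)
```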
Finally, let's see how our ensemble classified the testing data!
The highlighted points are the ones our models misclassified, either as false positives or false negatives. Not too bad!
How did we do?
How does each of our models compare? We can plot the receiver operating characteristic (ROC) curve, which compares the false positive rate with the true positive rate at different discrimination thresholds. We can also look at a heat map of the confusion matrix, which shows how the models classified each data point. Ideally, we'd like to see darker colors in the top-left and bottom-right corners of the heat map, meaning the model correctly classified a majority of the data.
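
Both diagnostics take only a few lines with scikit-learn and matplotlib; this sketch continues from the ensemble above:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, roc_curve

# ROC curve: sweep the discrimination threshold over predicted probabilities.
probs = ensemble.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, probs)
plt.plot(fpr, tpr)
plt.plot([0, 1], [0, 1], linestyle="--")  # chance-level diagonal
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.show()

# Confusion matrix heat map: correct classifications fall on the diagonal.
cm = confusion_matrix(y_test, y_pred)
plt.imshow(cm, cmap="Blues")
plt.xlabel("Predicted class")
plt.ylabel("True class")
plt.colorbar()
plt.show()
```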

What next for the project?
- Document the code and interface so that our work can be used after we leave
- Extend the interface to handle more general sensor data
- Optimize the code for large quantities of offline data
- Prepare a live demo
Trying it live!
Let's put on the Muse headband and see if we can use accurate live classifications to turn a light on and off.
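
As a rough sketch of what that live loop could look like, assuming the Muse is streaming over Lab Streaming Layer (e.g., via the muselsl tool) and reusing the pieces above; toggle_light is a hypothetical stand-in for whatever drives the output device, and cleaning is omitted for brevity:

```python
import numpy as np
from pylsl import StreamInlet, resolve_byprop

# Connect to the first available EEG stream on the network.
streams = resolve_byprop("type", "EEG", timeout=10)
inlet = StreamInlet(streams[0])

buffer = []
while True:
    sample, _ = inlet.pull_sample()
    buffer.append(sample[:4])  # keep the four electrode channels (assumption)
    if len(buffer) >= SAMPLES_PER_SEGMENT:
        window = np.array(buffer)
        buffer = []
        seg = {ch: window[:, i][None, :] for i, ch in enumerate(ELECTRODES)}
        label = ensemble.predict(extract_features(seg))[0]
        toggle_light(label)  # hypothetical function controlling the light
```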