How to Make an AI-powered Artificial Nose
2022-08-15 | By ShawnHymel
License: Attribution Arduino
In recent years, artificial intelligence (AI) has come a long way. We can now run many machine learning algorithms (including some fairly complex deep learning models) on microcontrollers! As a result, we can mix together readings from a variety of sensors (known as “sensor fusion”) to train a device that can classify odors and make decisions.
This tutorial will show you how to create your own machine learning model using gas data to detect different types of odors. In essence, we are using machine learning to perform sensor fusion.
This build is based on Benjamin Cabe’s Artificial Nose Project. The source code for his project can be found here. I wanted to start with his design before trying to take it a step further.
All the code in this tutorial can be found in this repository.
Feel free to watch this video if you would like to see me build the project:
Sensor Fusion
Sensor fusion (sometimes called “data fusion”) is the process of mixing together raw readings from different sensors to help a machine have a better understanding of the world around it. For example, we can use an algorithm (likely a Kalman filter) to mix the readings from a 3-axis accelerometer, 3-axis gyroscope, and 3-axis magnetometer (together known as a 9-degrees-of-freedom inertial measurement unit, or 9-DoF IMU) to give us the absolute orientation of the sensor(s) in space.
The Kalman filter can be quite complex and require a lot of tuning to get right. If we don’t know exactly how to combine sensor readings to get an output we want, we can turn to our trusty machine learning to find the rules for us!
In this case, we do not need to know the exact, absolute readings from our sensors in order to learn what combination of data can be used to predict an odor. It’s a simple classification task: given some data, predict the class that the data belongs to. Relative readings are good enough.
However, we run into issues when the readings vary with temperature and humidity. If there is an equation we can use to perform temperature and humidity compensation, then we should use that. Without an easy equation, we can try feeding temperature and humidity data into our neural network and hope that it learns the necessary relationships.
Warning: this can be tricky and require a LOT of training data! I only collected data in areas with 3 different temperature/humidity readings. To make this model more accurate, you would want to either perform actual temperature/humidity compensation (to get the absolute gas measurements) or collect thousands of data points across different temperatures and humidities.
Required Hardware
You will need the following hardware:
- Wio Terminal
- Grove - Multichannel Gas Sensor v2
- Grove - SGP30 VOC and eCO2 Gas Sensor
- Grove - BME680 Temperature, Humidity, and Pressure Sensor
- Grove - I2C Hub
Hardware Connections
Simply connect all of the sensors to the hub, and connect the hub to the Wio Terminal.
Data Collection
Copy the Arduino code found here to a new Arduino project and upload it to your Wio Terminal. If you open a serial terminal, you should see 8 raw readings fly by every few seconds.
Important: the gas sensors have heating elements that must warm up before the readings are useful. The datasheets claim you’ll need about 24 hours of preheating. I found that 10 minutes was good enough for our purposes (remember: we just need relative data).
Make sure to close out of the serial terminal. Copy this Python script to a new file on your computer (note: you will need to have Python 3.x installed). Find the USB serial port associated with your Wio Terminal. With the sensors over/in the odor, run the script:
python serial-data-collect-csv.py -p <PORT> -b 115200 -d dataset -l <LABEL>
Where <PORT> is the port for your Wio Terminal (e.g. COM9) and <LABEL> is the odor you are collecting (e.g. coffee).
Let this run for about 1 min. Switch to a new odor, change the label, and repeat the collection process.
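If you are curious what the collection step boils down to, here is a minimal sketch. It is not the serial-data-collect-csv.py script itself: the function name save_labeled_samples and the label.id.csv file naming are my own assumptions for illustration, and the function takes any iterable of text lines so you can try it without hardware.

```python
import csv
import os
import uuid


def save_labeled_samples(lines, out_dir, label):
    """Write comma-separated numeric lines to <out_dir>/<label>.<id>.csv.

    Hypothetical helper: `lines` is any iterable of text lines, such as
    decoded lines read from a serial port.
    """
    os.makedirs(out_dir, exist_ok=True)
    # Assumed naming scheme: the label comes first so it can be recovered later
    path = os.path.join(out_dir, f"{label}.{uuid.uuid4().hex[:12]}.csv")
    rows_written = 0
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for line in lines:
            fields = [field.strip() for field in line.strip().split(",")]
            try:
                # Keep only rows where every field parses as a number
                [float(field) for field in fields]
            except ValueError:
                continue  # skip blank, partial, or non-numeric lines
            writer.writerow(fields)
            rows_written += 1
    return path, rows_written
```

With pyserial installed, the lines could come from a serial.Serial object opened on your port at 115200 baud (decode each readline() result to text first). I have left that part out so the sketch runs anywhere.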
My raw dataset is located here, if you would like to use it. However, note that it was collected in my environment (a particular temperature and humidity), so it will likely not work for you.
Feature Scaling
Unfortunately, the data scales for the different sensors are all over the place (some are voltages in the range of 0-2 V, some are temperatures in the range of 25-32 deg C, and others are ppm/ppb that can range in the thousands). The different ranges will cause issues for machine learning model training, as larger numbers will have a disproportionate impact on the outcome. So, we should scale all of the features to be within a similar range.
The most common techniques are:
- Standardization: used for normally distributed (Gaussian) data. Shift and scale the distribution so that it has a mean of 0 and a variance of 1.
- Normalization: used for non-Gaussian distributions. Rescale the data so that it falls in the range [0, 1].
You can read more about feature scaling here.
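The two techniques above can be sketched in a few lines of plain Python. This is only an illustration of the formulas, not the code from the notebook; the function names and the temperature values are made up.

```python
import statistics


def standardize(values):
    """Shift and scale values so they have mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    # Note: a constant column (std == 0) would divide by zero here
    return [(v - mean) / std for v in values]


def normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    # Note: a constant column (hi == lo) would divide by zero here
    return [(v - lo) / (hi - lo) for v in values]


temps_c = [25.0, 27.0, 29.0, 32.0]  # made-up temperature readings
print(normalize(temps_c))    # smallest value maps to 0, largest to 1
print(standardize(temps_c))  # values centered on 0
```

Either way, each column is scaled on its own, so a ppm reading in the thousands and a voltage under 2 V end up on comparable scales.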
To perform feature scaling on our data, open this Google Colab notebook (you will need a Google account). Run through all of the cells, following the directions. It should be configured to drop the pressure data and perform normalization on everything except the timestamp (which is not used as training data).
Once done, the newly scaled data should be stored in an archive titled out.zip. Download and unzip it.
Training the Model
Start a new project on Edge Impulse and upload the CSV files. Let the tool automatically split between training and test set, and let it read the label data from the files (as these should have been set when using the serial-data-collect-csv.py script).
Under “Impulse design,” add the “Flatten” DSP block. For the learning blocks, you will want the “Classification (Keras)” block and the “Anomaly Detection (K-means)” block. Make sure that you have all of the input features selected in each of the blocks.
Go to the “Flatten” screen. Deselect “Skewness” and “Kurtosis.” Select “Save parameters” and then “Generate features” on the next screen.
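To get a feel for what the Flatten block hands to the neural network, here is a rough sketch of the statistics that remain once skewness and kurtosis are deselected. I am assuming the remaining features are average, minimum, maximum, RMS, and standard deviation, and that the standard deviation is the population form; Edge Impulse's actual implementation may differ in the details. The function name is hypothetical.

```python
import math


def flatten_features(window):
    """Summarize one sensor channel's window of samples as five statistics."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(v * v for v in window) / n)
    # Population standard deviation (an assumption, see above)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    return {
        "average": mean,
        "minimum": min(window),
        "maximum": max(window),
        "rms": rms,
        "stdev": std,
    }


# Made-up window of already-scaled readings from a single channel
print(flatten_features([0.10, 0.12, 0.11, 0.15]))
```

Each channel in the window is reduced this way, so the classifier sees a handful of summary numbers per sensor reading instead of the raw time series.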
Go to the “NN Classifier” screen. Change the neural network to have 40 neurons in the first layer and 20 neurons in the second layer. Add a dropout layer (with a dropout rate of 0.25) after each Dense layer. Change the training cycles to 300. Click “Start training.”
After training is done, go to the “Anomaly detection” screen. Select the RMS features for each sensor reading (this might be listed as “feature 3”). Click “Start training.”
When that’s done, head to the “Model testing” page. Click “Classify all.” When the process is complete, feel free to look at the confusion matrix to see how well your model performed.
My model is not great. It is capable of classifying broad categories, such as coffee, tea, and spirits, but it struggles to tell the different types of spirits apart. Feel free to try adjusting the model parameters to see if you can create a better model.
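If you have not worked with a confusion matrix before, the sketch below shows how one is built from true and predicted labels. The labels are the broad categories mentioned above, but the predictions are made up for illustration and are not results from my model.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are the true class, columns are the predicted class."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


classes = ["coffee", "tea", "spirits"]
# Made-up test results, purely for illustration
y_true = ["coffee", "coffee", "tea", "tea", "spirits", "spirits"]
y_pred = ["coffee", "coffee", "tea", "spirits", "spirits", "spirits"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(f"{name:>8}", row)

# Correct predictions sit on the diagonal
correct = sum(matrix[i][i] for i in range(len(classes)))
accuracy = correct / len(y_true)
print(f"accuracy: {accuracy:.2f}")
```

Off-diagonal counts show which odors get mistaken for which, and that is usually more informative than the accuracy number alone.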
Deploy the Model
Go to the “Deployment” page. Select “Arduino library” and click “Build” at the bottom of the page.
When the download is complete, open Arduino and select Sketch > Include library > Add .ZIP library… Select the newly downloaded Edge Impulse library.
Copy the code found here to a blank Arduino sketch. Change the following line to match the name of the library you installed from Edge Impulse:
#include "ai-nose_inferencing.h"
Change the following lines to match the values of the minimums and ranges from the “Feature scaling” step:
// Preprocessing constants (drop the timestamp column)
float mins[] = {
27.93, 36.17, 400.0, 0.0, 1.56, 0.81, 1.51, 0.61
};
float ranges[] = {
12.54, 18.74, 56930.0, 60000.0, 1.48, 2.05, 1.6, 2.68
};
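The reason these constants matter is that the model was trained on normalized data, so every raw reading has to go through the same arithmetic before inference: subtract the column minimum, then divide by the column range. Here is that arithmetic sketched in Python (the Arduino sketch does its own version in C++). The mins and ranges are the values shown above; the raw reading and the function name are hypothetical, and I am assuming the values are in the same column order as the training CSV files.

```python
# Constants copied from the Arduino snippet above
MINS = [27.93, 36.17, 400.0, 0.0, 1.56, 0.81, 1.51, 0.61]
RANGES = [12.54, 18.74, 56930.0, 60000.0, 1.48, 2.05, 1.6, 2.68]


def scale_reading(raw, mins=MINS, ranges=RANGES):
    """Apply (x - min) / range to each value of one raw reading."""
    if not (len(raw) == len(mins) == len(ranges)):
        raise ValueError("reading must have one value per scaled column")
    return [(x - lo) / span for x, lo, span in zip(raw, mins, ranges)]


# Hypothetical raw reading, one value per column
raw = [30.0, 45.0, 450.0, 12.0, 2.30, 1.83, 2.31, 1.95]
print(scale_reading(raw))
```

A reading outside the range seen during data collection will scale to a value below 0 or above 1, which is one sign that the environment has drifted from the one the model was trained in.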
Upload the code to the Wio Terminal. Whenever you hold the sensors in one of the identifiable odors, the LCD on the Wio Terminal should indicate the label.
If you open a Serial Terminal, you should see the raw classification results.
Going Further
You can read more about Benjamin’s artificial nose project here. Additionally, here are the datasheets for the sensors, if you would like to learn more about them.
- Seeed Studio Multichannel Gas Sensor v2
- Sensirion VOC and equivalent CO2 gas sensor
- Bosch BME680 sensor
Here are a couple of videos to help you get started understanding how neural networks can be trained and deployed to an Arduino (without Edge Impulse):
- Intro to TinyML Part 1: Training a Neural Network for Arduino in TensorFlow
- Intro to TinyML Part 2: Deploying a TensorFlow Lite Model to Arduino