NVIDIA Jetson Nano - Part 2: Image Classification with Machine Learning
2019-12-23 | By ShawnHymel
License: Attribution
The NVIDIA Jetson Nano is a single-board computer based on the NVIDIA Tegra X1 processor, which combines CPU and GPU capabilities. As a result, it is a great starting platform for doing Edge AI.
If you have not done so already, please follow the steps in the previous tutorial to install Linux and configure your Jetson Nano: Getting Started with the NVIDIA Jetson Nano - Part 1: Setup.
Note that for the following demos, you will want to use a keyboard, mouse, and monitor connected directly to the Jetson Nano. Otherwise, the camera feed will be extremely slow over a network connection.
Live Detection Demo
If you downloaded the COCO models in the previous episode, then you can use them to detect and label objects in real time. To do that, we use the detectnet-camera tool.
First, make sure you have a camera plugged into your Jetson Nano. This can be a CSI camera (the Raspberry Pi Camera Module V2 supposedly works well) or a USB webcam (the Logitech c920 worked for me).
Open a terminal and navigate to the bin directory in aarch64:
cd ~/jetson-inference/build/aarch64/bin/
From there, run the live camera tool. Note that you will need to set the camera parameter to your connected camera: a USB webcam will likely be the device file /dev/video0, or if you’re using a CSI camera, it will be just 0 or 1.
./detectnet-camera.py --network=coco-dog --camera=/dev/video0
This will start a live feed from your camera. It will also attempt to locate any objects in the frame that match the dog model found in the coco-dog network.
If you use an image of a dog (or a real dog), the program should be able to detect it, label it as a dog, and put a blue bounding box around it.
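If you are curious what the script is doing under the hood, here is a minimal sketch of the same loop using the jetson.inference Python bindings built in Part 1. This is a simplified approximation for illustration, not the full detectnet-camera.py script:

import jetson.inference
import jetson.utils

# Load the dog-detection model and open the camera (USB webcam assumed)
net = jetson.inference.detectNet("coco-dog", threshold=0.5)
camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")
display = jetson.utils.glDisplay()

while display.IsOpen():
    img, width, height = camera.CaptureRGBA()    # grab a frame from the camera
    detections = net.Detect(img, width, height)  # find dogs, draw bounding boxes
    display.RenderOnce(img, width, height)       # show the annotated frame
    display.SetTitle("coco-dog | {:.0f} FPS".format(display.GetFPS()))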
Training a Model
Because the Nano is an embedded device, it is not nearly as powerful as a modern desktop or server equipped with a discrete graphics card. As a result, if you plan to train a deep neural network (or other large model) from scratch, we recommend doing so on a laptop, desktop, or server.
NVIDIA offers a training interface called DIGITS that makes training networks much easier, and its documentation walks you through training deep neural networks from scratch on a more powerful machine.
That being said, we can do something called “transfer learning” to retrain an existing network directly on the Nano. Rather than learning every parameter from scratch, we start with a pre-trained network and tweak its parameters to fit our own training data, which takes far less compute and far less data.
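Conceptually, the transfer learning step looks something like the following PyTorch sketch. This is a simplified illustration of what jetson-inference’s train.py does, not the actual script:

import torch
import torchvision

# Start from a ResNet-18 already trained on ImageNet
model = torchvision.models.resnet18(pretrained=True)

# Swap the final layer for one matching our 3 classes: background, fork, spoon
model.fc = torch.nn.Linear(model.fc.in_features, 3)

# Fine-tune with a small learning rate so the pre-trained weights shift only slightly
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)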
To begin, we first need to set up a swap space on our SD card so that the system has extra virtual memory to fall back on during training. Make sure you have at least 4 GB available on your SD card by running the following command:
df -h
Next, create and mount the swap file:
sudo fallocate -l 4G /mnt/4GB.swap
sudo chmod 0600 /mnt/4GB.swap
sudo mkswap /mnt/4GB.swap
sudo swapon /mnt/4GB.swap
If you want the swap file to mount on boot, you will need to modify fstab:
sudo vi /etc/fstab
Scroll to the bottom of this file and press ‘o’ to insert a new line and begin editing. Enter the following line:
/mnt/4GB.swap none swap sw 0 0
You can check to see if the swap space mounted with:
swapon -s
You should see the 4GB.swap file listed.
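You can also verify it with the standard free utility:

free -h

The total in the Swap row should have grown by about 4 GB.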
Next, we need to capture images to create our datasets. I’ll be using 3 different sets of images, as I want my network to identify these categories:
- Background
- Fork
- Spoon
Note that if you are training the network to identify objects, you should take pictures of them against similar backgrounds. With so little data, the network will be sensitive to new backgrounds, new lighting, and so on.
To use the jetson-inference capture tool, first create our datasets directory and labels file:
cd ~
mkdir datasets
cd ~/datasets
mkdir utensils
cd utensils
touch labels.txt
echo "background" >> labels.txt
echo "fork" >> labels.txt
echo "spoon" >> labels.txt
Note that the categories in the labels file need to be on separate lines and in alphabetical order! You can check them with:
cat labels.txt
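You should see the three categories, one per line:

background
fork
spoon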
Next, run the camera-capture tool. If you’re using a USB webcam, you will want to use the /dev/video0 device file. If you’re using a CSI camera, change the camera parameter to 0 or 1 (whichever one works, such as --camera=0). I’m also using a much lower resolution, as it allows for faster training and classification later:
camera-capture --camera=/dev/video0 --width=640 --height=480
In the capture tool, first point the Dataset Path to your ~/datasets/utensils directory. Then, point the Class Labels to the ~/datasets/utensils/labels.txt file. Select your Current Class (e.g. start with “background”). For Current Set, select train. Use the spacebar or button to capture at least 30 images of your intended background.
Next change the Set to val (for validation), and take at least 10 more photos of the same background. Change the Set to test and take yet another 10 photos of the background.
Repeat this process for your fork and spoon images, each time holding up the desired utensil to the camera. You can move the utensil around slightly, but don’t move it too much, or the model will not be able to train on the images properly.
In the end, you should have the following set of images:
- Background
- Train: 30 (or more) images
- Val: 10 (or more) images
- Test: 10 (or more) images
- Fork
- Train: 30 (or more) images
- Val: 10 (or more) images
- Test: 10 (or more) images
- Spoon
- Train: 30 (or more) images
- Val: 10 (or more) images
- Test: 10 (or more) images
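On disk, camera-capture organizes the images by set and class, so your dataset directory should end up looking roughly like this (layout assumed from the tool’s defaults; exact file names will differ):

~/datasets/utensils/
    labels.txt
    train/
        background/  fork/  spoon/
    val/
        background/  fork/  spoon/
    test/
        background/  fork/  spoon/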
Now, it’s time to train! Navigate to the classification directory and run the training program:
cd ~/jetson-inference/python/training/classification/
python train.py --model-dir=utensils ~/datasets/utensils
This can take up to 30 minutes, so be patient (or go get some coffee). When it’s done, we will need to export the model to the Open Neural Network Exchange (ONNX) format:
python onnx_export.py --model-dir=utensils
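To confirm the export worked, list the model directory and look for the .onnx file (this is the same resnet18.onnx we pass to imagenet-camera below):

ls utensils/

You should see resnet18.onnx among the training checkpoints.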
Test It!
With the model trained, we can use it to make classification predictions! Run the following program (changing /dev/video0 for your particular camera and sgmustadio for your username):
imagenet-camera --model=utensils/resnet18.onnx --labels=/home/sgmustadio/datasets/utensils/labels.txt --camera=/dev/video0 --width=640 --height=480 --input_blob=input_0 --output_blob=output_0
It can take around 5 minutes for the engine to start up (TensorRT optimizes the network the first time it loads), so be patient with this one, too. Once you get a live stream of your camera, make sure it is facing the background that you trained it on. Then, hold up a fork or spoon in front of the camera. It should be able to identify the utensil!
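As with the detection demo, there is a rough Python equivalent of this command using the jetson.inference bindings. This is a sketch, not the actual imagenet-camera source: it assumes the bindings accept the same flags through their argv parameter (as the repo’s imagenet-camera.py does) and that the built-in network name is ignored when --model is supplied.

import jetson.inference
import jetson.utils

# Load our custom ONNX model; the flags mirror the imagenet-camera call above
# ("googlenet" is a placeholder name that the --model flag overrides)
net = jetson.inference.imageNet("googlenet", [
    "--model=utensils/resnet18.onnx",
    "--labels=/home/sgmustadio/datasets/utensils/labels.txt",
    "--input_blob=input_0",
    "--output_blob=output_0"])

camera = jetson.utils.gstCamera(640, 480, "/dev/video0")
display = jetson.utils.glDisplay()

while display.IsOpen():
    img, width, height = camera.CaptureRGBA()                 # grab a frame
    class_idx, confidence = net.Classify(img, width, height)  # run inference
    display.RenderOnce(img, width, height)
    display.SetTitle("{:s} ({:.0f}%)".format(net.GetClassDesc(class_idx),
                                             confidence * 100))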
Note that it probably won’t be very accurate, since we used a very small training set!
Going Further
Try training the network on different objects! NVIDIA also has a number of other demos in their Hello AI World documentation that we recommend working through: https://github.com/dusty-nv/jetson-inference#hello-ai-world
Have questions or comments? Continue the conversation on TechForum, DigiKey's online community and technical resource.