
How to Create a Custom Wake Word for Mycroft AI

2022-04-18 | By ShawnHymel

License: Attribution Raspberry Pi

In our last project, we created a custom skill for the Mycroft AI voice assistant. This involved setting up the Raspberry Pi specific instance of Mycroft known as “Picroft” and using the built-in tools to create a skill template. From there, we wrote some Python code to control hardware whenever certain phrases were uttered.

At the time, we were limited to using just a few keywords (the most popular being “hey, Mycroft”). What happens if you want to create a custom wake word for your smart speaker or robot project?

This tutorial will walk you through the process of training a custom wake word for the Mycroft AI voice assistant. We will look at two different methods: editing a few lines of configuration to use the PocketSphinx engine, and training a model for the more complicated (but more accurate) Precise engine.

See here if you would like to see the wake word training in video form:

 

Required Hardware

You will need a Raspberry Pi 3B+ or 4B for this tutorial. You are also welcome to connect the servo and LED hardware as per the previous tutorial, but it is not necessary.

Prerequisites

Make sure you follow the previous guide to set up and configure Mycroft.

Start Mycroft and log in via SSH. You should be presented with the Mycroft CLI environment.

SSH into Picroft

Press ctrl+c to exit out of the environment. Stop the Mycroft services by entering:

Copy Code
mycroft-stop

PocketSphinx

The easiest method of creating a custom wake word is to use the PocketSphinx engine that is built into Mycroft. It performs simple speech-to-text and compares the input sound to a series of phonemes that you define in a text file. While it’s easy to change the text file, note that PocketSphinx is not very accurate.

Head to https://www.nltk.org/_modules/nltk/corpus/reader/cmudict.html to figure out the phonemes for your desired wake word. For example, if I want to make the wake word “Hey, Jor’von,” the phonemes would be “HH EY . JH OW R V AA N .” Note the use of the period after each word that denotes a breakpoint between the sounds.
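To make the phoneme format concrete, here is a minimal Python sketch that joins per-word phoneme lists into the dotted string described above. The helper name build_phoneme_string is hypothetical (it is not part of Mycroft or NLTK), and you still need to look up the phonemes for your own words in the CMU dictionary.

```python
# The 39 ARPAbet phoneme symbols used by the CMU Pronouncing Dictionary
ARPABET = set(
    "AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG "
    "OW OY P R S SH T TH UH UW V W Y Z ZH".split()
)


def build_phoneme_string(words):
    """Join per-word phoneme lists, ending each word with a period."""
    parts = []
    for phonemes in words:
        for symbol in phonemes:
            # Catch typos before they end up in the Mycroft config
            if symbol not in ARPABET:
                raise ValueError(f"Unknown phoneme: {symbol}")
        parts.append(" ".join(phonemes) + " .")
    return " ".join(parts)


# "Hey, Jor'von" split into one phoneme list per word
hey_jorvon = [["HH", "EY"], ["JH", "OW", "R", "V", "AA", "N"]]
print(build_phoneme_string(hey_jorvon))  # HH EY . JH OW R V AA N .
```

The validation step is only there to catch misspelled symbols early; it does not tell you whether the phonemes actually match how you pronounce the phrase.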

All we need to do is edit the user configuration file for Mycroft, set the wake word engine to PocketSphinx, and change the phonemes.

Copy Code
mycroft-config edit user

Change the contents of that file to the following:

Copy Code
{
  "max_allowed_core_version": 19.2,
  "listener": {
    "wake_word": "hey jorvon"
  },
  "hotwords": {
    "hey jorvon": {
      "module": "pocketsphinx",
      "phonemes": "HH EY . JH OW R V AA N .",
      "threshold": 1e-12
    }
  }
}

Note that the “threshold” variable can be changed to adjust the sensitivity of the wake word. A smaller value (e.g. 1e-18) will make the engine less sensitive, which will mean fewer false positives, but it will ignore the wake word more often. A larger value (e.g. 1e-9) will mean the engine is more sensitive, which means it should hear the phrase more often, but it might also trigger on other phrases (false positives). Feel free to try different values.
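If you plan to try several wake words or threshold values, it can help to generate the configuration instead of typing it by hand. The following is a rough Python sketch that builds the same "listener" and "hotwords" entries shown above. The function name pocketsphinx_config is hypothetical, and the sketch leaves out the "max_allowed_core_version" key, so copy that from your existing file.

```python
import json


def pocketsphinx_config(wake_word, phonemes, threshold=1e-12):
    """Build the listener/hotwords entries for one PocketSphinx wake word."""
    return {
        "listener": {"wake_word": wake_word},
        "hotwords": {
            # The hotword key must match the listener's wake_word value
            wake_word: {
                "module": "pocketsphinx",
                "phonemes": phonemes,
                "threshold": threshold,
            }
        },
    }


config = pocketsphinx_config("hey jorvon", "HH EY . JH OW R V AA N .")
print(json.dumps(config, indent=2))
```

Paste the printed JSON into the editor opened by mycroft-config edit user, then adjust the threshold argument and regenerate as you experiment.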

Save and exit (ctrl+x and ‘y’ if you are using nano). Restart the Mycroft services with:

Copy Code
mycroft-start all
mycroft-cli-client

You should be presented with the Mycroft CLI environment again. Once all the skills have loaded, try saying your new wake word (or phrase). In my experience, you have to say the phrase with a very particular tone, pacing, and inflection to have it trigger. 

As you might have guessed, PocketSphinx is not very accurate or robust. By default, Mycroft uses the Precise wake word engine to listen for the key phrase. Precise is much more accurate, but training a new wake word model for it is a lot more difficult.

Set Up Virtual Machine

Precise uses a recurrent neural network (RNN) to listen for wake words. We must train that model and deploy it to our Mycroft instance.

The Precise training program requires very specific versions of Python and various libraries. I’m going to walk you through the process that worked for me. Deviate from it at your own risk of ending up in dependency hell.

Install the latest version of VirtualBox. We are going to use a virtual machine to perform training.

Download the 64-bit Desktop image of Ubuntu 18.04.6. Note the specific version of Ubuntu I used here. Different versions of Ubuntu come with different versions of Python. I found that if I tried a different version of Ubuntu, I ended up fighting Python and library versions for hours. Stick to 18.04.6.

In VirtualBox, create a new virtual machine of type Linux and version Ubuntu (64-bit).

Configure virtual machine

Set the RAM to at least 4096 MB.


Select Create a virtual hard disk now on the next screen and VDI on the screen after that. I used Fixed size for the physical disk and set the disk size to at least 20.00 GB.


Click Create and wait while your virtual machine is created.


Select your new virtual machine and click the Start button. VirtualBox should ask you which .iso image you want to use. Select your newly downloaded Ubuntu 18.04 image.

Install Ubuntu in VirtualBox

Click Start. Once Ubuntu boots up, you should see the installation wizard.

Ubuntu installation wizard

Click Install Ubuntu. Follow the wizard, accepting all the defaults. It will reformat and install on your 20 GB virtual disk.

When you are done, start your virtual machine and log in to Ubuntu. You should be presented with the desktop.

Ubuntu desktop in virtual machine

Configure Ubuntu

You will need to change a few things and install some dependencies before performing training.

If you are running Windows as your host OS, you might see a green turtle icon at the bottom of your VirtualBox window when running Ubuntu.

VirtualBox needs Hyper-V disabled

This means that you have Hyper-V running on your host machine, which will prevent TensorFlow from running inside your virtual machine. Follow the steps here to disable Hyper-V.

You may see some errors whenever you boot into your Windows host as a result of disabling Hyper-V. Once you are done training, you can follow these steps to re-enable Hyper-V.

Next, you will want to install Guest Additions, which will give you access to shared folders. This will make copying files (such as your dataset and model files) between guest and host operating systems much easier.

Open a terminal window and enter the following:

Copy Code
sudo apt update
sudo apt install build-essential dkms linux-headers-$(uname -r)

In your VirtualBox window, select Devices > Insert Guest Additions CD Image.

Install VirtualBox Guest Additions

You should see a pop-up window. Click Run to run the Guest Additions CD. You will be asked to enter your user password, and Guest Additions should be automatically installed.

When it’s done, select Devices > Shared Folders > Shared Folders Settings. Click the Adds new shared folder icon on the right side. Create a folder on your host OS, give the folder a name, select Auto-mount, give it a mount point (such as /media/shared), and select Make Permanent.

VirtualBox add shared folder

Click OK twice to exit and save your shared folder settings. Enter the following into a terminal:

Copy Code
sudo adduser $USER vboxsf
sudo reboot

Now, any files you copy to that shared folder will appear in both your host and guest OS.

By default, Ubuntu 18.04 comes with Python 3.6. We need to update it to 3.7, but we want to make sure that Gnome can still use 3.6. In a terminal, enter:

Copy Code
sudo apt install -y python3.7
sudo nano /usr/bin/gnome-terminal

In that gnome-terminal file, change

Copy Code
#!/usr/bin/python3

to

Copy Code
#!/usr/bin/python3.6

Save and exit. Next, enter:

Copy Code
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1
python3 --version

You should see that python3 is now pointing to version 3.7.

Install Mycroft Precise

Now, we can install dependencies and the Precise training tools. In a terminal, enter:

Copy Code
sudo apt install -y git python3.7-dev
sudo add-apt-repository universe
mkdir ~/Projects
cd ~/Projects
git clone https://github.com/MycroftAI/mycroft-precise.git
cd mycroft-precise
nano setup.py

In that setup.py file, change:

Copy Code
h5py

to

Copy Code
h5py<3.0.0

Save and exit. Finally, run:

Copy Code
./setup.sh

Wait while everything installs. It will take some time.

Create Dataset

You will need lots of data:

  • At least 1000 samples of your desired wake word
  • At least 1000 samples of people saying things that are not your wake word
  • At least 1000 samples of random background noises (not talking)

Follow the video to see how you can use Audacity to cut apart a longer recording into several shorter sound clips.

You are welcome to use this dataset curation tool to help you make the set. Your .wav files should be 16 kHz sampling rate, 16-bit PCM, and at least 1 second long. The curation tool will automatically convert your sound files to the correct format.
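If you prepare clips by hand rather than with the curation tool, a quick format check can save a failed training run. Here is a minimal sketch using Python's standard wave module to verify the three requirements listed above (16 kHz, 16-bit, at least 1 second). The function name check_wav is hypothetical.

```python
import wave


def check_wav(path, rate=16000, sample_width=2, min_seconds=1.0):
    """Return a list of problems found in a .wav file (empty if none)."""
    problems = []
    with wave.open(str(path), "rb") as wav:
        actual_rate = wav.getframerate()
        actual_width = wav.getsampwidth()  # bytes per sample, so 2 means 16-bit
        duration = wav.getnframes() / actual_rate
    if actual_rate != rate:
        problems.append(f"sample rate is {actual_rate} Hz, expected {rate} Hz")
    if actual_width != sample_width:
        problems.append(
            f"sample width is {8 * actual_width} bits, expected {8 * sample_width} bits"
        )
    if duration < min_seconds:
        problems.append(f"clip is {duration:.2f} s long, expected at least {min_seconds} s")
    return problems
```

Calling check_wav on each file in your dataset and printing any non-empty result gives you a list of clips to re-export from Audacity.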

You should rearrange the files to be in the following folder structure:

Copy Code
hey-jorvon/
|-- wake-word/
|   |-- hey_jorvon.0001.wav
|   |-- hey_jorvon.0003.wav
|   |-- hey_jorvon.0004.wav
|   |-- ...
|-- not-wake-word/
|   |-- _noise.0000.wav
|   |-- _noise.0001.wav
|   |-- ...
|   |-- _unknown.0001.wav
|   |-- _unknown.0003.wav
|   |-- ...
|-- test/
    |-- wake-word/
    |   |-- hey_jorvon.0000.wav
    |   |-- hey_jorvon.0002.wav
    |   |-- hey_jorvon.0007.wav
    |   |-- ...
    |-- not-wake-word/
        |-- _noise.0011.wav
        |-- _noise.0015.wav
        |-- ...
        |-- _unknown.0000.wav
        |-- _unknown.0002.wav
        |-- ...
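Sorting a few thousand clips into that layout by hand gets tedious. The sketch below shows one way to do it in Python, assuming all of your clips start out in a single flat folder and that wake word clips share a filename prefix (hey_jorvon in my case). The function name split_dataset, the 20% test fraction, and the fixed random seed are my own assumptions, not requirements of Precise.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, dataset_dir, wake_prefix, test_fraction=0.2, seed=42):
    """Copy .wav files into wake-word/, not-wake-word/, and their test/ copies."""
    source_dir, dataset_dir = Path(source_dir), Path(dataset_dir)
    rng = random.Random(seed)

    # Group clips by label based on the filename prefix
    groups = {"wake-word": [], "not-wake-word": []}
    for wav in sorted(source_dir.glob("*.wav")):
        label = "wake-word" if wav.name.startswith(wake_prefix) else "not-wake-word"
        groups[label].append(wav)

    # Split each label separately so both end up in the test set
    for label, files in groups.items():
        rng.shuffle(files)
        n_test = int(len(files) * test_fraction)
        for i, wav in enumerate(files):
            target = dataset_dir / "test" / label if i < n_test else dataset_dir / label
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(wav, target / wav.name)
```

For example, split_dataset("raw", "hey-jorvon", "hey_jorvon") would copy clips from a hypothetical raw/ folder into the hey-jorvon/ dataset folder. The originals are copied rather than moved, so you can rerun the split with a different fraction.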

Copy your dataset to your Ubuntu virtual machine. The easiest way to do this is to copy it into your shared folder. Note that you cannot train with the dataset in the shared folder, so you will probably need to copy it somewhere else in Ubuntu (e.g. ~/Projects).

Train Model

Make sure your dataset is in your Projects directory. For example, if you’re training with the hey-jorvon dataset, it should be located at ~/Projects/hey-jorvon.

Open a terminal and enable the Precise virtual environment with the following:

Copy Code
cd ~/Projects
source mycroft-precise/.venv/bin/activate

Perform training by entering:

Copy Code
precise-train -e 800 hey-jorvon.net hey-jorvon/ 

You are welcome to play with the parameters for the training tool:

  • -e number of epochs
  • -s target sensitivity (default: 0.2)

To test your model, you can use the precise-test tool:

Copy Code
precise-test hey-jorvon.net hey-jorvon/

It will use the test/ directory found in your hey-jorvon/ dataset folder to test your model. Check your number of false positives. Does it seem reasonable? Over 0.90 accuracy on the test set is a good start.
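To put numbers on "reasonable," you can work out accuracy and error rates from the counts of correct and incorrect predictions on the test set. The sketch below is plain arithmetic in Python. The function name summarize and the counts are made up for illustration and are not results from my model.

```python
def summarize(true_pos, false_pos, true_neg, false_neg):
    """Compute accuracy and error rates from test-set prediction counts."""
    total = true_pos + false_pos + true_neg + false_neg
    return {
        # Fraction of all clips classified correctly
        "accuracy": (true_pos + true_neg) / total,
        # Fraction of not-wake-word clips that wrongly triggered
        "false_positive_rate": false_pos / (false_pos + true_neg),
        # Fraction of wake-word clips that were missed
        "false_negative_rate": false_neg / (false_neg + true_pos),
    }


# Hypothetical counts: 200 wake-word clips and 400 not-wake-word clips
stats = summarize(true_pos=190, false_pos=5, true_neg=395, false_neg=10)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

With these example counts, accuracy is 585/600 = 0.975. A high accuracy can still hide an annoying false positive rate if your test set has few not-wake-word clips, so look at both.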

If you are finding that your model is not very accurate during testing or deployment, you will likely need to add more data. This can include adding samples that sound like your target wake word (e.g. “Hey, Jarvis” or “Hey, moron”). See this article for some great tips on making a good wake word training dataset. The author was able to achieve over 99.9% accuracy by collecting the right kind of data!

Deploy Custom Wake Word Model

Now that you’re ready to deploy your model, enter the following command:

Copy Code
precise-convert hey-jorvon.net

This will produce hey-jorvon.pb and hey-jorvon.pb.params files. Copy both of these out of your virtual machine (e.g. using your shared folder).

Copy the .pb and .pb.params files to ~/.local/share/mycroft/precise in your Mycroft instance (e.g. Picroft).

Check your version of Precise. Log into your Picroft via SSH and enter:

Copy Code
cd ~/.local/share/mycroft/precise
ls

If the listing shows a Precise engine version lower than 0.3 (it will likely be 0.2 if you followed the previous guide), you will need to update the engine so that it works with the newly trained model. Enter:

Copy Code
mv precise-engine precise-engine.bak
mkdir precise-engine
wget https://github.com/MycroftAI/mycroft-precise/releases/download/v0.3.0/precise-engine_0.3.0_armv7l.tar.gz
tar -xzf precise-engine_0.3.0_armv7l.tar.gz

Now, we can edit the Mycroft config file to use Precise with our new wake word model:

Copy Code
mycroft-config edit user

Change the contents of that file to be:

Copy Code
{
  "max_allowed_core_version": 21.2,
  "listener": {
    "wake_word": "hey jorvon"
  },
  "hotwords": {
    "hey jorvon": {
      "module": "precise",
      "local_model_file": "/home/pi/.local/share/mycroft/precise/hey-jorvon.pb",
      "sensitivity": 0.5,
      "trigger_level": 3
    }
  }
}

Note that we tell it the location of our wake word model (hey-jorvon.pb). The parameters file (hey-jorvon.pb.params) must be in the same directory as that model file. 
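A missing or misnamed .params file is an easy mistake to make when copying files around. As a quick sanity check, this small Python sketch reports which of the two files is absent for a given model path. The function name missing_model_files is hypothetical.

```python
from pathlib import Path


def missing_model_files(model_path):
    """Return the model and .params paths that do not exist on disk."""
    model = Path(model_path).expanduser()
    # The parameters file sits next to the model with ".params" appended
    params = model.with_name(model.name + ".params")
    return [str(p) for p in (model, params) if not p.is_file()]
```

On the Picroft, missing_model_files("~/.local/share/mycroft/precise/hey-jorvon.pb") should return an empty list once both files are in place.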

You can adjust the sensitivity and trigger_level to make the wake word engine more or less sensitive.

  • sensitivity: between 0.0 and 1.0, higher values mean more sensitive
  • trigger_level: between 1 and infinity, higher values mean that Mycroft is less likely to activate unintentionally
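To build some intuition for how these two settings interact, here is a simplified Python model of a trigger detector. It is not the actual Precise code. It assumes that an audio chunk counts as "active" when the network's output probability exceeds 1 - sensitivity, and that the wake word fires once a running count of active chunks exceeds trigger_level. The function name detect and the probability values are made up for illustration.

```python
def detect(probabilities, sensitivity=0.5, trigger_level=3):
    """Return the index of the chunk where the wake word fires, or None."""
    # Assumption: higher sensitivity lowers the bar a chunk must clear
    threshold = 1.0 - sensitivity
    activation = 0
    for i, prob in enumerate(probabilities):
        if prob > threshold:
            activation += 1
            # Fire only after enough active chunks have accumulated
            if activation > trigger_level:
                return i
        elif activation > 0:
            # Quiet chunks slowly cancel out earlier activity
            activation -= 1
    return None


# Made-up network outputs for eight consecutive audio chunks
probs = [0.1, 0.7, 0.8, 0.2, 0.9, 0.9, 0.9, 0.1]
print(detect(probs))                    # fires at chunk 6
print(detect(probs, trigger_level=1))   # fires earlier, at chunk 2
print(detect(probs, trigger_level=5))   # never fires: None
print(detect(probs, sensitivity=0.05))  # no chunk clears 0.95: None
```

In this toy model, raising trigger_level demands a longer run of confident chunks, which filters out brief false alarms, while raising sensitivity lets weaker matches count toward that run.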

Save and exit. Restart Mycroft:

Copy Code
mycroft-start restart all
mycroft-cli-client

Try saying your new wake word (in my example, it is “Hey, Jor’von”) to see if Mycroft responds! You should see the wake word text that you set in the configuration file appear in the console.

Mycroft AI with custom wake word

Going Further

The following guides should help you if you get stuck during the training process or need additional tips:
