The novelty of talking with Mme Google wears off. It would be nice to do something practical with voice recognition. Furthermore, who wants an open microphone streaming all the sounds in the house to the outside world? Call it paranoia, but I would prefer so-called "hot word" recognition to be done locally.
The preferred method seemed to be snowboy from KITT.AI, but I ran into problems. Fortunately, some clever people had already found solutions; all I had to do was find their site.
Table of Contents
- Audio Hardware
- Installing Python 3
- Installing Python Audio Prerequisites
- Installing snowboy
- snowboy with Domoticz
- Next?
In a previous post about Google Assistant, I showed how to set up the audio hardware on the Orange Pi Zero with an expansion board. If this is already done, skip to the next section.
Before powering up the OPiZ, plug in the expansion board and then connect powered speakers to the 3.5mm jack on it. The expansion board already contains a microphone. When the cube-like case is used, it makes for a neat package, smaller than the wall wart powering the speakers, as can be seen below.
After opening an ssh session as user dietpi, I made sure audio output was enabled and directed to the 3.5mm jack. This can be done in dietpi-config.
Select Audio Options in the main menu. If the Soundcard is not set to Analogue, then click on Ok. Then select default 3.5mm Analogue.
A number of packages will be installed. The following will be displayed on the screen during their installation.
Alsa-utils contains arecord and aplay, which will be used to test the audio later. Now that the audio output has been set to the default (3.5mm analogue jack), it is time to go Back to the main menu.
Exit the configuration utility. It will be necessary to reboot.
After giving the OPiZ some time to reboot, log back in and ensure that dietpi is part of the audio group. Had dietpi not been a member of audio, it would have been a simple matter to join the group:
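A sketch of the command, assuming the usual usermod tool is available (the new group membership only takes effect at the next login):

```shell
# add user dietpi to the audio group (takes effect at next login)
sudo usermod -a -G audio dietpi

# verify the membership
groups dietpi
```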
The first step in testing the audio hardware is to record sounds through the microphone to a temporary file sample.wav.
Then it's playback time to ensure that the speakers are getting the audio output.
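The test can be done with the alsa-utils commands installed earlier; the exact format and duration flags below are my assumption, chosen to match the S16_LE format in the asound configuration:

```shell
# record 5 seconds of 16-bit little-endian mono audio at 16 kHz
arecord -f S16_LE -r 16000 -d 5 sample.wav

# play the recording back through the default output
aplay sample.wav
```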
It works! And I did not have to adjust playback and recording volumes.
All that remains is to create the asound configuration file.
```
pcm.!default {
    type asym
    capture.pcm "mic"
    playback.pcm "speaker"
}

pcm.mic {
    type plug
    slave {
        pcm "hw:0,0"
        format S16_LE
    }
}

pcm.speaker {
    type plug
    slave {
        pcm "hw:0,0"
    }
}
```
If you are following along from my previous post, then Python 3 is already installed. However, the Google Assistant service should be disabled: hot word detection is to be performed by snowboy, not Google.
If you are starting off with a fresh copy of Armbian from DietPi then the first step is to install Python version 3.
Next a Python 3 virtual environment is created.
The virtual environment env is a directory in the dietpi home directory. It is activated by the source /home/dietpi/env/bin/activate command and deactivated by the deactivate command. As the dialogue shows, activating the virtual environment means that the command python will now invoke python3. This is done with symbolic links, modification of the search path, and probably other tricks. It makes it possible to have other versions of Python installed in other virtual environments and to use any version without interference.
The last step is to upgrade the installed versions of pip and setuptools.
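The whole sequence can be sketched as follows; the package names are the usual Debian ones and may differ slightly on DietPi:

```shell
# install Python 3 and the venv module
sudo apt-get install python3 python3-venv

# create and activate the virtual environment in the dietpi home directory
python3 -m venv /home/dietpi/env
source /home/dietpi/env/bin/activate

# upgrade the packaging tools inside the virtual environment
pip install --upgrade pip setuptools
```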
Of course, if the SD card contains a fresh copy of the operating system, the host name will not be domopiz unless you changed it to that particular name in the Security Options of DietPi-Config.
The next step is to get all the packages needed by Python to access the audio hardware. Some of these may have been installed previously; it does not matter, as no harm is done if an unnecessary installation is attempted.
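As a sketch, these are the packages usually needed for PyAudio and snowboy on Debian-based systems; the exact list for this setup is my assumption, not taken from the original post:

```shell
# system libraries for audio capture, playback and snowboy
sudo apt-get install portaudio19-dev libatlas-base-dev sox

# Python binding, installed inside the virtual environment
pip install pyaudio
```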
It is time to test Python audio by recording some sound through the microphone on the expansion board, using a Python script rec to record an audio file which is then played back.
I remain amazed at the quality of the little electret microphone on the Orange Pi Zero expansion board. I recorded my voice by talking towards the screen at a normal conversational level while sitting at my desk. The OPiZ was on the floor about half a meter behind me, yet the sound was recorded, albeit at a playback level that is not very high. The OPiZ was not in its case because of the heat dissipation problem; that may have an effect on sound recording.
The precompiled snowboy package from KITT.AI should have worked out of the box. Unfortunately, this is where I hit a wall: I could not get the demo script to run properly.
The Python 2.7 library that could not be found should have been a loud signal, but I did not see it. Fortunately, I found a repository on GitHub by Mihail Burduja who, with the help of António Pereira, had the solution. The _snowboydetect.so in the package from KITT.AI is for Python 2.7. The developer provides the snowboy source code, so presumably it should be possible to recompile the library. But it is not necessary to do so; we can use the Python 3 library in their repositories. DietPi did not include git in its distribution. It is easy enough to install, but I got lazy and just got the zip file of the latest version from GitHub.
Then it was just a matter of copying the Python 3 library into the snowboy directory and running the demo script again.
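A sketch of those two steps; the exact file and directory names are assumptions based on the repository layout, and may need adjusting:

```shell
# replace the Python 2.7 library with the Python 3 build
cp snowboy-master/_snowboydetect.so snowboy/

# run the demo again with the pretrained "snowboy" model
cd snowboy
python demo.py resources/snowboy.umdl
```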
IT WORKS!
Time to move on to the second demo script using "snowboy" and "alexa" as two hot words.
As can be seen, this worked also. Many thanks to Mihail Burduja (alias warchildmd) aided by António Pereira (alias Shaxine) for the information that finally helped me install snowboy on an Orange Pi Zero.
To continue with the LED examples in the documentation, a modified GPIO library for the Orange Pi Zero, called OPi.GPIO, will have to be installed. Since the Python 3 virtual environment is used, there is no need to substitute pip3 for pip as suggested in the library documentation.
OPi.GPIO is a drop-in replacement for the classic Raspberry Pi GPIO Python library RPi.GPIO. So all that needs to be done to use it is to change one letter in the first line of the light.py script.
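Concretely, the one-letter change (R becomes O) is in the import statement at the top of light.py:

```python
# first line of light.py for the Raspberry Pi:
#     import RPi.GPIO as GPIO
# becomes, on the Orange Pi Zero:
#     import OPi.GPIO as GPIO
```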
Again, because the Python 3 virtual environment is used, the command to invoke the light.py script is not exactly what is shown in the KITT.AI documentation. The script just blinks an LED that is connected to GPIO 17 and ground.
There is more information about using sudo in a Python virtual environment on the Ask Ubuntu forum.
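The usual workaround, and my assumption as to what is meant here, is to point sudo directly at the virtual environment's interpreter so that root uses the right Python:

```shell
# sudo does not inherit the activated virtual environment,
# so call its python interpreter by its full path
sudo /home/dietpi/env/bin/python light.py
```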
There will be a warning on subsequent runs of the script: GPIO.setwarnings(False) can be added to the script before the GPIO.setmode(GPIO.BCM) line in the __init__ definition. I do not like doing that; it seems to me that control of the GPIO channels should be relinquished (using GPIO.cleanup()?) when the object is destroyed. I know next to nothing about Python, so I could not do that correctly.
To control the LED with a spoken key word, I copied the demo.py script and modified it as instructed.
```python
import snowboydecoder
import sys
import signal
from light import Light                           # added

interrupted = False

def signal_handler(signal, frame):
    global interrupted
    interrupted = True

def interrupt_callback():
    global interrupted
    return interrupted

if len(sys.argv) == 1:
    print("Error: need to specify model name")
    print("Usage: python demo.py your.model")
    sys.exit(-1)

model = sys.argv[1]

signal.signal(signal.SIGINT, signal_handler)

detector = snowboydecoder.HotwordDetector(model, sensitivity=0.5)
print('Listening... Press Ctrl+C to exit')
led = Light(17)                                   # added

detector.start(detected_callback=led.blink,       # changed
               interrupt_check=interrupt_callback,
               sleep_time=0.03)
```
This script turns the LED on for a short duration each time the keyword is detected.
With two hot words, it becomes possible to turn devices on and off.
Domoticz can be invoked with three relatively small modifications to the demo2.py script.
```python
import snowboydecoder
import sys
import signal
import urllib.request                             # added

# Demo code for listening two hotwords at the same time

interrupted = False

def signal_handler(signal, frame):
    global interrupted
    interrupted = True

def interrupt_callback():
    global interrupted
    return interrupted

if len(sys.argv) != 3:
    print("Error: need to specify 2 model names")
    print("Usage: python demo.py 1st.model 2nd.model")
    sys.exit(-1)

models = sys.argv[1:]

# capture SIGINT signal, e.g., Ctrl+C
signal.signal(signal.SIGINT, signal_handler)

def detect_on():                                  # added
    urllib.request.urlopen('http://192.168.0.45:9071/json.htm?type=command&param=udevice&idx=52&nvalue=1')
    snowboydecoder.play_audio_file(snowboydecoder.DETECT_DING)
    print('Device turned on.\nListening... Press Ctrl+C to exit')

def detect_off():                                 # added
    urllib.request.urlopen('http://192.168.0.45:9071/json.htm?type=command&param=udevice&idx=52&nvalue=0')
    snowboydecoder.play_audio_file(snowboydecoder.DETECT_DONG)
    print('Device turned off.\nListening... Press Ctrl+C to exit')

sensitivity = [0.5]*len(models)
detector = snowboydecoder.HotwordDetector(models, sensitivity=sensitivity)
callbacks = [detect_on, detect_off]               # changed
print('Listening... Press Ctrl+C to exit')

# main loop
# make sure you have the same numbers of callbacks and models
detector.start(detected_callback=callbacks,
               interrupt_check=interrupt_callback,
               sleep_time=0.03)

detector.terminate()
```
I saved this modified script under the name demo3.py (download here). From now on, the lamp with idx 52 can be turned on with the "snowboy" hot word and turned off with the "alexa" hot word.
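The Domoticz call can be tested by hand first; the IP address, port, and idx below are the ones used in the script and will differ on another installation:

```shell
# turn the idx 52 device on (nvalue=1) through the Domoticz JSON API
curl "http://192.168.0.45:9071/json.htm?type=command&param=udevice&idx=52&nvalue=1"
```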
Almost there, almost at the point of instructing Domoticz to turn devices on and off with voice commands using Google Assistant and hot word detection by snowboy. However, I ran into a couple of glitches and this post is getting long anyway... and I am not sure I want to continue along this route.
Granted, that sounds like an excuse for not solving my problems, but it is true that I am of two minds about using Google Assistant. I am curious to use it and other cloud-based voice APIs such as Bing's. The latter appeals to me as it is polyglot, and I would rather use French voice commands. However, I am loath to depend on a mechanism that requires a working Internet connection to operate devices in my home. In last year's hardest-hitting storm we lost electricity for a few hours, but our ISP was offline for three days.
Consequently, I am looking for a voice-controlled solution that does not rely on external voice recognition. One way would be a multilevel snowboy implementation. There could be two top-level keywords, "turn on" and "turn off", and at the next level the keywords would be device names. Since snowboy can be trained to recognize new keywords, it is all feasible in principle. And since snowboy is language agnostic, those keywords could be in French without any problem.
Another possibility is Pocket Sphinx, an open source project from Carnegie Mellon University. There was a well-made report by Alan McDonley on running this package on a quad core (Arm Cortex-A53) Raspberry Pi 3 versus a single core Raspberry Pi B+. The gist seemed to be that results were not too bad on the older Pi and rather good on the newer Pi. So it certainly looks like the quad core (Arm Cortex-A7) Orange Pi Zero should be able to handle the task. Indeed, some preliminary tests are encouraging: I have been able to install Pocket Sphinx and use it to turn lamps on and off through Domoticz.
Perhaps, in the end, I will have the best of both worlds. There could be two keywords, "hey google" and "hey house", detected by snowboy, which would pass the rest of the vocal command on to Google Assistant or Pocket Sphinx respectively. Again, it all seems possible.
A lot to do. All fun stuff, but it looks like I will have to learn Python.