Raspberry Pi and Domoticz Watchdog
April 9, 2019

It can be said that I have something of an obsession with watchdogs. There are about six posts on the subject on this site and I am working on a couple of ongoing hardware watchdog projects. I should have installed a watchdog on the Raspberry Pi that is hosting my home automation system before now, but better late than never.

There is a hardware watchdog timer built into the Raspberry Pi SoC (system on a chip), so it should be enabled to restart the Raspberry Pi if the operating system crashes. In addition, I want to monitor Domoticz and restart the system if Domoticz stops working while the operating system remains intact. There are two pages on the Domoticz site on the subject.

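The latter uses Monit to monitor Domoticz. This is a general solution which could monitor many additional services. I will look into this later. For now, I want to implement the basics: reboot the Raspberry Pi if the operating system freezes, reboot it if Domoticz stops working, and send an email notification when a reboot occurs.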

There is much information on the Web but some of it is out of date (as this post will be soon enough). The lesson is to try to check things out before installing anything. A case in point, the watchdog kernel module is already installed in the latest Raspbian image (Stretch Nov. 2018) and nothing needs to be done to that effect.

    pi@raspberry:~ $ dmesg | grep wdt
    [ 0.845696] bcm2835-wdt 20100000.watchdog: Broadcom BCM2835 watchdog timer
    ...
    pi@raspberry:~ $ ls -l /dev/wat*
    crw------- 1 root root 10, 130 Mar 17 12:14 /dev/watchdog
    crw------- 1 root root 252, 0 Mar 17 12:14 /dev/watchdog0
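As background, the principle behind any watchdog timer can be modelled in a few lines. The sketch below is only a simulation: the class and function names are invented for illustration, nothing here touches /dev/watchdog, and the 15-second figure is simply the timeout used later in this post.

```python
class SimulatedWatchdog:
    """Toy model of a hardware watchdog timer (no real hardware involved)."""

    def __init__(self, timeout_s=15):
        self.timeout_s = timeout_s
        self.last_fed = 0  # time of the most recent feed, in seconds

    def feed(self, now):
        """Reset the countdown, as a healthy system does periodically."""
        self.last_fed = now

    def has_bitten(self, now):
        """True once timeout_s seconds have elapsed without a feed."""
        return now - self.last_fed >= self.timeout_s


def run(feed_times, end_time, timeout_s=15):
    """Step through time one second at a time.

    Returns the time at which the simulated reset occurs, or None if the
    watchdog was always fed in time.
    """
    wdt = SimulatedWatchdog(timeout_s)
    feeds = set(feed_times)
    for now in range(end_time + 1):
        if wdt.has_bitten(now):
            return now
        if now in feeds:
            wdt.feed(now)
    return None


if __name__ == "__main__":
    # Fed every 10 s: the timer never expires.
    print(run(range(0, 60, 10), end_time=60))   # prints None
    # Feeding stops after t = 20 s (system frozen): reset at t = 35 s.
    print(run([0, 10, 20], end_time=60))        # prints 35
```

On the real system the feeding is done by the watchdog daemon, which is why a frozen operating system ends in a reboot.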

However, the hardware watchdog is not yet operational. To check this, I ran the "forkbomb" script.

    nestor@domo:~/wdt $ cat forkbomb.sh
    swapoff -a
    :(){ :|:& };:
    nestor@domo:~/wdt $ chmod +x forkbomb.sh
    nestor@domo:~/wdt $ sudo ./forkbomb.sh

After a while, the Raspberry Pi froze, Domoticz ceased functioning and I could no longer open an ssh session. The only option was to turn off the power and then power the Raspberry Pi back up. To recover from such a situation, install the watchdog service and edit its configuration file.

    pi@raspberry:~ $ sudo apt install watchdog
    ...
    pi@raspberry:~ $ sudo nano /etc/watchdog.conf

    ...
    max-load-1 = 24
    ...
    watchdog-device = /dev/watchdog
    watchdog-timeout = 15

The first two lines were already in the file but commented out, so the leading "#" at the start of each must be removed to make them effective. The third line, about the timeout, is necessary because watchdog sets the timeout to 60 seconds and the Raspberry Pi watchdog timer supports a timeout of at most 15 seconds. If the line is left out, the following error will be encountered.

Apr 09 15:29:39 domo watchdog[1336]: cannot set timeout 60 (errno = 22 = 'Invalid argument')

Many thanks to Florian Harr for this fix. Next, start the watchdog daemon and verify that everything is working correctly.

    pi@raspberry:~ $ sudo systemctl start watchdog.service
    pi@raspberry:~ $ sudo systemctl status watchdog.service
    ● watchdog.service - watchdog daemon
       Loaded: loaded (/lib/systemd/system/watchdog.service; enabled; vendor preset: enabled)
       Active: active (running) since Tue 2019-04-09 15:29:39 ADT; 7s ago
      Process: 1334 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options (code=exited, st
      Process: 1330 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprob
     Main PID: 1336 (watchdog)
       CGroup: /system.slice/watchdog.service
               └─1336 /usr/sbin/watchdog
    ...
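If the daemon does not come up cleanly, the two configuration edits described above are the first thing to check. Here is a minimal sketch of such a check, assuming the simple `key = value` layout of /etc/watchdog.conf shown earlier. The function names are my own, and the 15-second ceiling is the limit quoted in this post rather than something read from the hardware.

```python
def parse_watchdog_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def check_settings(settings, max_timeout=15):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if "watchdog-device" not in settings:
        problems.append("watchdog-device is not set (still commented out?)")
    timeout = settings.get("watchdog-timeout")
    if timeout is None:
        problems.append("watchdog-timeout is not set")
    elif not timeout.isdigit() or not 1 <= int(timeout) <= max_timeout:
        problems.append(
            f"watchdog-timeout = {timeout} is not an integer in 1..{max_timeout}"
        )
    return problems


if __name__ == "__main__":
    # Sample text standing in for the real file, with the device line
    # still commented out and no timeout line at all.
    sample = """\
# watchdog-device = /dev/watchdog
max-load-1 = 24
"""
    for problem in check_settings(parse_watchdog_conf(sample)):
        print("WARNING:", problem)
```

To check the real file, pass the result of `open('/etc/watchdog.conf').read()` to `parse_watchdog_conf` instead of the sample text.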

Running the "forkbomb" again showed that the watchdog is effective. After a while the open session froze, but eventually, the Raspberry Pi rebooted and a new ssh session could be opened. Looking at the log it was possible to see when the watchdog bit.

    pi@raspberry:~ $ cat /var/log/syslog
    ...
    Apr 9 15:38:05 domo watchdog[1958]: watchdog now set to 15 seconds
    Apr 9 15:38:05 domo watchdog[1958]: hardware watchdog identity: Broadcom BCM2835 Watchdog timer
    Apr 9 15:38:05 domo watchdog[1958]: loadavg 27 6 2 is higher than the given threshold 24 18 12!
    Apr 9 15:38:05 domo watchdog[1958]: shutting down the system because of error 253 = 'load average too high'
    Apr 9 15:38:05 domo watchdog[1958]: /usr/lib/sendmail does not exist or is not executable (errno = 2)
    ...

Interestingly, it looks like watchdog could send a message about the shutdown using sendmail, which could be the answer to my third desire, email notification. I used a different approach, as will be seen later. For now, let us look at how to monitor Domoticz. I followed the lead in Setting up the raspberry pi watchdog but without turning on the Domoticz log. Instead, the Linux trick of "touching" an empty file on a regular basis to update its last modified date will be the way to "feed" the watchdog. All it takes is a simple Lua script that will be executed every minute by Domoticz.

    pi@raspberry:~ $ nano domoticz/scripts/lua/script_time_domotizAlive.lua

    -- Updates the access time of file /tmp/domoticz.alive
    -- once every minute. The watchdog service will reboot
    -- the machine if the time stamp of the file does not
    -- change over 5 minutes.

    commandArray = {}
    os.execute('sudo touch /tmp/domoticz.alive')
    return commandArray

Check the file time on a regular basis to ensure that it is updated every minute.

    pi@raspberry:~ $ ls -l /tmp
    total 4
    -rw-r----- 1 root root 0 Apr 9 16:36 domoticz.alive
    ...
    pi@raspberry:~ $ ls -l /tmp
    total 4
    -rw-r----- 1 root root 0 Apr 9 16:37 domoticz.alive
    ...

The watchdog configuration file has to be updated. Two lines at the top need to be changed.

    pi@raspberry:~ $ sudo nano /etc/watchdog.conf

    file = /tmp/domoticz.alive
    change = 300
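In effect, these two settings ask watchdog to compare the modification time of the file with the current time. The following Python sketch is a rough equivalent for experimenting by hand. It is not watchdog's actual code, the function names are hypothetical, and treating a missing file as stale is my own assumption.

```python
import os
import time


def seconds_since_change(path, now=None):
    """Age of the file's last modification, in seconds."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime


def is_stale(path, change=300, now=None):
    """True if the file is missing or unmodified for more than `change` seconds."""
    try:
        return seconds_since_change(path, now) > change
    except FileNotFoundError:
        # Assumption: a heartbeat file that does not exist counts as stale.
        return True


if __name__ == "__main__":
    path = "/tmp/domoticz.alive"
    if is_stale(path):
        print(path, "is stale or missing")
    else:
        print(path, "was updated", int(seconds_since_change(path)), "seconds ago")
```

Running it a few times while Domoticz is up should always report an age under a minute or so.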

Stop and restart the watchdog service and then wait over five minutes (300 seconds) to ensure that the system is not rebooted. Then stop the Domoticz service and the Raspberry Pi should be rebooted in about five minutes.

    pi@raspberry:~ $ sudo systemctl stop watchdog.service
    pi@raspberry:~ $ sudo systemctl start watchdog.service
    ... wait 10 minutes - nothing should happen
    pi@raspberry:~ $ sudo systemctl stop domoticz.service
    ... wait at most 6 minutes, the system should reboot

Heed the warning about reboots: "[b]e sure to remember this when stopping the domoticz service by hand!"

So what about email notification? Again, I will use the approach proposed in the first reference. It simply sends an email each time the Raspberry Pi is rebooted. This is simpler than having watchdog do it. I am sure the latter is possible and it would then be possible to send a different email pointing out the reason for the reboot. Something to do later.
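As a possible starting point for that later work, the reason for a watchdog reboot could be pulled out of the syslog. The sketch below assumes the message format seen in the log excerpt earlier in this post, which is the only sample I have. The regular expression and the function name are my own and may need adjusting for other versions of watchdog.

```python
import re

# Matches lines such as:
#   ... watchdog[1958]: shutting down the system because of error 253 = 'load average too high'
SHUTDOWN_RE = re.compile(
    r"watchdog\[\d+\]: shutting down the system because of error "
    r"(?P<code>-?\d+) = '(?P<reason>[^']*)'"
)


def last_shutdown_reason(log_lines):
    """Return (code, reason) from the most recent matching line, or None."""
    found = None
    for line in log_lines:
        match = SHUTDOWN_RE.search(line)
        if match:
            found = (int(match.group("code")), match.group("reason"))
    return found


if __name__ == "__main__":
    sample = [
        "Apr 9 15:38:05 domo watchdog[1958]: watchdog now set to 15 seconds",
        "Apr 9 15:38:05 domo watchdog[1958]: loadavg 27 6 2 is higher than "
        "the given threshold 24 18 12!",
        "Apr 9 15:38:05 domo watchdog[1958]: shutting down the system because "
        "of error 253 = 'load average too high'",
    ]
    print(last_shutdown_reason(sample))  # prints (253, 'load average too high')
```

In practice the lines would come from /var/log/syslog, read with suitable permissions, and the reason found could become the text of the email.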

In section 3, Mail Alert Using syslog, of a recent post, I modified a short Python script to send a set email. I decided to reuse that script, making it a bit more versatile. This avoids installing sendmail, which does look like a rather formidable task. Here is the Python 3 script.

    #!/usr/bin/python3
    # coding: utf-8

    # email smtp server credentials
    SMPT = '--your-smtp-server-url--'
    SRC = '--your-smtp-user-name--'
    PWD = '--your-smtp-password--'
    PORT = 465 # usual

    # setup a default message
    TGT = '--your-email-address--'
    OBJ = 'Alert'
    MSG = 'Raspberry Pi rebooted'

    # Import smtplib for the actual sending function
    import smtplib, ssl

    # Import the email modules we'll need
    from email.mime.text import MIMEText

    # Allow for command line options to set the subject, target email and message
    import argparse
    parser = argparse.ArgumentParser(description='Send short email.')
    parser.add_argument('-s', '--subject')
    parser.add_argument('-t', '--to')
    parser.add_argument('-m', '--msg')
    args = parser.parse_args()
    if args.subject:
        OBJ = args.subject
    if args.to:
        TGT = args.to
    if args.msg:
        MSG = args.msg

    # debug arguments
    #print(args)
    #print('OBJ', OBJ)
    #print('TGT', TGT)
    #print('MSG', MSG)
    #exit

    # Create the message
    msg = MIMEText(MSG)
    msg['Subject'] = OBJ
    msg['From'] = SRC
    msg['To'] = TGT

    # Send it
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(SMPT, PORT, context=context) as server:
        server.login(SRC, PWD)
        server.sendmail(SRC, TGT, msg.as_string())

    # ref: https://realpython.com/python-send-email/

Downloadable version: pymail.

The script is saved as the file pymail in the pythons directory of the pi home directory. The last bit is to add a cron task to send the email at reboot time.

    pi@raspberry:~ $ crontab -e

    ...
    @reboot sleep 60 && /usr/bin/python3 /home/pi/pythons/pymail
    @reboot /home/pi/.local/bin/webcam-streamer start && sleep 10 && /home/pi/.local/bin/webcam-streamer stop
    ...

As can be seen, there was already a task performed at reboot, but it does not matter if another is added. I did impose a one-minute wait before sending the message. That may be excessive, but initially there were problems in sending the email.
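An alternative to a fixed delay would be to retry the send a few times. I have not tried this, and I am only guessing that the early failures came from the network not being ready at boot. The helper below is a hypothetical sketch: `send` stands for any function that performs the SMTP exchange, such as the last lines of pymail wrapped in a function.

```python
import time


def send_with_retry(send, attempts=5, delay_s=15, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    Returns the number of attempts used. If every attempt fails, the last
    OSError is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            send()
            return attempt
        except OSError:
            # smtplib.SMTPException and socket errors are OSError subclasses.
            if attempt == attempts:
                raise
            sleep(delay_s)


if __name__ == "__main__":
    calls = []

    def flaky_send():
        # Stand-in for the real SMTP code: fails twice, then succeeds.
        calls.append(1)
        if len(calls) < 3:
            raise ConnectionRefusedError("network not ready")

    # The sleep is replaced by a no-op so the demonstration runs instantly.
    print(send_with_retry(flaky_send, sleep=lambda s: None))  # prints 3
```

With the default values the script would give up after about a minute, which is comparable to the fixed delay used in the crontab.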