There was a little problem monitoring the CPU temperature of a network attached storage device based on an up-to-date version of openmediavault. The temperature widget was stuck at 27.8 °C as shown on the right, which was obviously wrong. The fix was easy enough as will be shown, but that is only part of the monitoring done. It is more useful to transmit the CPU temperature to a home automation server at regular intervals. While doing that, an email notification will be sent if the temperature is over a specified threshold.
Table of Contents
- Installing the CPU Temperature Widget
- Reading the Correct Thermal Zone
- Monitoring More Than One Thermal Zone
- AMD Ryzen CPU
- Remote Monitoring
Installing the CPU Temperature Widget
First things first. How is that temperature widget added to the dashboard? That requires an omv-extras plugin to be installed.
- If not already done, install OMV-Extras as explained in OMV-Extras for OMV5, OMV6 and OMV7.
- In System » Plugins scroll down to openmediavault-cputemp 7.x.x (currently: 7.0.2) and click on it.
The entry background will change to a yellow colour. Click on the install icon which is a downward-pointing arrow.
Check the Confirm box and click on the red Yes button.
The installed plugin must now be enabled.
- Open the user menu by clicking on the user icon at the top right of the OMV web interface and then click on the Dashboard menu item.
- Check the CPU Temp box to enable the widget.
The CPU temperature widget will then be displayed in the dashboard. Some may be lucky and the widget will display the correct temperature. As already explained, this was not the case with our Intel N5105-based NAS.
Reading the Correct Thermal Zone
As far as I know, it is necessary to open an SSH session on the NAS to change the omv-cputemp settings. This means that the SSH service has to be enabled in OMV.
Running a little bash script that I wrote with the help of the usual suspects on the Web, the problem became obvious.
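For readers who prefer Python, the same kind of survey can be sketched as follows. This is not the bash script mentioned above; it is a minimal stand-in that assumes the usual Linux sysfs layout, where each `/sys/class/thermal/thermal_zoneN` directory holds a `type` file and a `temp` file in millidegrees Celsius. The function name `list_thermal_zones` is only illustrative.

```python
from pathlib import Path

PREFIX = "thermal_zone"


def list_thermal_zones(base="/sys/class/thermal"):
    """Return (zone name, sensor type, degrees Celsius) for each readable zone."""
    # Sort numerically so that thermal_zone10 comes after thermal_zone2.
    paths = sorted(
        Path(base).glob(PREFIX + "[0-9]*"),
        key=lambda p: int(p.name[len(PREFIX):]),
    )
    zones = []
    for zone in paths:
        try:
            kind = (zone / "type").read_text().strip()
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # some zones refuse to be read; skip them
        zones.append((zone.name, kind, millidegrees / 1000))
    return zones
```

Calling `list_thermal_zones()` on the NAS returns one tuple per zone, which makes it easy to spot a zone whose value never changes.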
The plugin is reading thermal zone 0, while the needed CPU temperature is available in thermal zone 1. The zone can be changed with an environment variable.
Be patient with the Salt prepare command; it took 17 seconds without any feedback. Once all the steps are followed, the CPU temperature widget displays a more accurate value.
Monitoring More Than One Thermal Zone
In April 2024, additional temperature widgets were added in version 7.0.1 of the omv-cputemp plugin. Since there is no other thermal zone on our NAS, I will illustrate this functionality by setting up thermal zone 0 again.
In truth, thermal_zone0 is the default value and it would not be necessary to set it as above. Now, it's a matter of enabling the second widget in the dashboard settings.
Here is the result.
I had to play around to get the two temperature widgets to be next to each other. Perhaps that would happen automatically when opening a new web client after closing all web clients that were open beforehand. I found that enabling the CPU widget and then removing it did the trick.
When removing a temperature widget, the corresponding environment variable should probably be removed also.
By the way, just running these commands can easily push the Intel N5105 CPU temperature to 58 °C.
AMD Ryzen CPU
If the CPU is an AMD Ryzen, then thermal zones will probably not be updated correctly. The lm-sensors package may provide the needed information, and it will be necessary to write some scripts to get the data from the sensors utility that comes with the package. It's explained in Guide - Custom cpu temp script for openmediavault-cputemp plugin by Aaron Murray (ryecoaaron).
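To give an idea of what getting data out of the sensors utility involves, here is a hedged Python sketch. It is not the script from the guide cited above. It assumes a recent lm-sensors whose `sensors` command accepts the `-j` (JSON output) option, and it simply collects every `temp*_input` value it finds; the chip and label names (for example a `k10temp` chip with a `Tctl` label on many Ryzen systems) vary from machine to machine, and both function names are made up for this example.

```python
import json
import subprocess


def temperatures_from_sensors_json(text):
    """Map "chip/label" to degrees Celsius from the JSON printed by `sensors -j`."""
    readings = {}
    for chip, features in json.loads(text).items():
        for label, values in features.items():
            if not isinstance(values, dict):
                continue  # the "Adapter" entry is a plain string, not a reading
            for key, value in values.items():
                if key.startswith("temp") and key.endswith("_input"):
                    readings[f"{chip}/{label}"] = float(value)
    return readings


def read_sensors():
    """Run `sensors -j` (lm-sensors must be installed) and parse its output."""
    result = subprocess.run(
        ["sensors", "-j"], capture_output=True, text=True, check=True
    )
    return temperatures_from_sensors_json(result.stdout)
```

Keeping the parsing separate from the subprocess call means the parser can be tried on a saved copy of the output before anything is wired to the plugin.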
Remote Monitoring
The fact is, the OMV Web management page is very rarely consulted. Most of the time, the NAS does its work quietly in the background and I never really think about checking the CPU temperature and so on. And when I do, it's to use the Domoticz home automation system that shows the NAS CPU temperature and power usage.
Furthermore, Domoticz logs the values written to those sensors and produces graphs that are entertaining and that may yet prove useful.
The NAS is powered from an Itead Sonoff Pow smart Wi-Fi switch. The open source Tasmota firmware on the switch transmits data about power consumption to Domoticz through an MQTT broker. This is all easily put together and relatively low cost. It or something similar has been described in previous posts.
What follows are the bits of "glue" needed to update the NAS_Temperature virtual sensor in Domoticz. This is quite easy to do with Domoticz API/JSON URL's (sic, I have doubts about the apostrophe), but it could also be done with MQTT messages.
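For the MQTT route, which is not the one taken here, the message could be built along these lines. The sketch assumes the Domoticz MQTT gateway is listening on its default `domoticz/in` topic and that the virtual sensor has index 123, which is a made-up value. Only the payload is built; publishing it would need an MQTT client, which is left out.

```python
import json

DOMOTICZ_IN_TOPIC = "domoticz/in"  # default input topic of the Domoticz MQTT gateway
SENSOR_IDX = 123                   # hypothetical index of the NAS_Temperature sensor


def temperature_message(celsius, idx=SENSOR_IDX):
    """Return the JSON payload that updates a Domoticz temperature sensor."""
    return json.dumps({"idx": idx, "nvalue": 0, "svalue": f"{celsius:.1f}"})
```

The string returned by `temperature_message(45.3)` is what a client would publish to `DOMOTICZ_IN_TOPIC`.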
Python Scripts
Transmitting the CPU temperature is done with a couple of Python scripts that are executed at regular intervals by a cron task.
The first Python script is called cputemp.
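What follows is a minimal sketch of such a script rather than the original listing. The Domoticz address, the sensor index and the threshold value are placeholders, the constant and function names are my own, and the alert is passed in as a callback because the interface of the mail module is an assumption.

```python
import urllib.parse
import urllib.request

ZONE_FILE = "/sys/class/thermal/thermal_zone1/temp"
DOMOTICZ = "http://192.168.1.10:8080"  # hypothetical Domoticz address
SENSOR_IDX = 123                       # hypothetical index of the virtual sensor
ALERT_TEMP = 70.0                      # hypothetical alert threshold in degrees Celsius


def read_cpu_temp(path=ZONE_FILE):
    """Read millidegrees from sysfs and return degrees Celsius."""
    with open(path) as source:
        return int(source.read().strip()) / 1000


def domoticz_url(temp, base=DOMOTICZ, idx=SENSOR_IDX):
    """Build the Domoticz JSON API URL that updates a temperature device."""
    query = urllib.parse.urlencode({
        "type": "command", "param": "udevice",
        "idx": idx, "nvalue": 0, "svalue": f"{temp:.1f}",
    })
    return f"{base}/json.htm?{query}"


def report(temp, timeout=10):
    """Send the value to Domoticz; raises on network errors or timeouts."""
    with urllib.request.urlopen(domoticz_url(temp), timeout=timeout) as reply:
        return reply.status == 200


def main(send_alert=None):
    """Report the temperature and call send_alert(temp) above the threshold."""
    temp = read_cpu_temp()
    report(temp)
    if send_alert is not None and temp > ALERT_TEMP:
        send_alert(temp)
```

To run it from cron, the file would end with a call to `main()`, passing whatever function sends the alert e-mail.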
Hopefully, that script is straightforward. It reads the value in /sys/class/thermal/thermal_zone1/temp and then divides it by 1000 to get the CPU temperature in degrees centigrade. This value is passed on to the Domoticz server in an HTTP request. Furthermore, if the temperature is above the threshold alterTemp, then an e-mail alert is sent. Here is my rather simple module for sending emails, pymail.py.
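As a stand-in for that module, a sketch along the same lines could look like this. It assumes an SMTP server reached over implicit TLS with a login and password; the values are placeholders, the function names are invented for the example, and the server setting is spelled `SMTP` here.

```python
import smtplib
import ssl
from email.message import EmailMessage

SRC = "nas@example.com"    # placeholder sender address, also used as the login
SMTP = "smtp.example.com"  # placeholder mail server
PORT = 465                 # usual port for SMTP over implicit TLS
PWD = "app-password"       # placeholder password


def build_message(dest, subject, body, src=SRC):
    """Assemble a plain-text e-mail message."""
    msg = EmailMessage()
    msg["From"] = src
    msg["To"] = dest
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_mail(dest, subject, body):
    """Log in to the SMTP server and send the message."""
    msg = build_message(dest, subject, body)
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(SMTP, PORT, context=context) as server:
        server.login(SRC, PWD)
        server.send_message(msg)
```

A server that expects STARTTLS on port 587 would need `smtplib.SMTP` followed by `starttls()` instead of `SMTP_SSL`.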
Of course, the values of SRC, SMPT, PORT and PWD will have to be adjusted. It would have been possible to have Domoticz take care of the notification when the temperature is above a threshold, but I thought it better to send the email alert directly from the NAS. While writing this, it seems that it might be a good idea to have warnings coming from Domoticz also, but only if the problem persists over a longer period to avoid flooding my email account.
These two scripts and the `get_thermal_zones` script are available in a GitHub Gist: sigmdel/cputemp.py.
Cron Job
Initially, a cron job was set up to run every five minutes.
That turned out to be an unwise choice. There would be no problem 11 times out of 12 each hour. However, there was an error, a timeout, when cputemp was run at the start of each hour. Here is one of the email messages sent every hour by cron.
As can be seen from the power usage graph, these timeouts corresponded to increased energy usage and presumably higher temperatures. While trying to make sense of these obviously non-random errors, I remembered that automatic backups of the Domoticz database were enabled. Listing the hourly backups showed that they occurred at the start of the hour.
Looking at the approximately 40 log messages between 8:59 and 9:01 clinched it. Here are the three pertinent messages.
The HTTP request sent to Domoticz by the cputemp script at the turn of the hour was timing out because Domoticz was busy backing up its database. The spikes in power usage show that the NAS was getting hot under the collar waiting for a reply from the otherwise busy Domoticz server. The solution was to change the timing of the cputemp cron job.
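The effect of shifting a five-minute schedule can be checked with a few lines of Python. The two-minute offset used here is purely a hypothetical choice for illustration, not necessarily the schedule adopted on this NAS, and the helper only understands the `*/step` and `start-end/step` forms of the cron minute field.

```python
def cron_minutes(field):
    """Expand a cron minute field such as '*/5' or '2-59/5' into a list of minutes."""
    span, _, step_text = field.partition("/")
    step = int(step_text) if step_text else 1
    if span == "*":
        first, last = 0, 59
    else:
        first_text, _, last_text = span.partition("-")
        first = int(first_text)
        last = int(last_text) if last_text else first
    return list(range(first, last + 1, step))


every_five = cron_minutes("*/5")    # includes minute 0, the time of the hourly backup
shifted = cron_minutes("2-59/5")    # hypothetical offset schedule that skips minute 0
```

Both schedules still run twelve times an hour; only the shifted one stays clear of the top of the hour.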