Part 2 - Asynchronous Web Page Updates
Part 3 - Better User Experience
Part 4 - Commands - version 0.0.8
At this point of the project, improving the user experience is the priority. To that end, a private logging module is introduced. Hopefully, it will provide more meaningful messages since they will be posted to three (and eventually four) destinations or facilities. One of those facilities is a Web console which will be added to the Wi-Fi switch Web interface. Again this is a feature borrowed from Tasmota. The effort to make all functions non-blocking continues and, to that end, the hardware will be polled using a hardware timer. That way, the response to button presses remains the same no matter what the microcontroller is doing. A command interpreter is introduced. At this stage, the concept of user managed settings becomes a possibility. However, to make it operable it is necessary to have a mechanism to save the configuration to non-volatile memory. Only then will the settings be preserved across reboots.
The source code for the PlatformIO projects / Arduino sketches presented in this post can be found in a GitHub repository: sigmdel/xiao_esp32c3_wifi_switch. These developments, spread over four projects / sketches, named 07_with_log
, 08_ticker_hdw
, 09_with_cmd
and 10_with_config
continue on from 06_sse_update
. Hopefully, the following tree makes obvious the non-linear development path followed.
The dotted line shows where the use of the JavaScript asynchronous XMLHttpRequest was added to Server-Sent Events as discussed in the final remarks of part 2.
Table of Contents
- Adding Logging Facilities
- Web Console
- Non Blocking Drivers
- Adding a Command Interpreter
- Managing the User Configuration
- Improved Wi-Fi Handling
Adding Logging Facilities
Version 07_with_log
adds a private logging functionality. The aim is to replace the built-in ESP32 HAL log functions, accessed with macros such as ESP_LOGV
, found in the esp32-hal-log
module, with something more useful for the tasks at hand and perhaps better suited for deployment in a real device later on. Again, taking cues from Tasmota, log messages should be visible on multiple log facilities. Four types of destinations are envisioned, but only the first three are currently implemented.
- The serial interface of the ESP32-C3 which can be viewed from the serial monitor of the IDE.
- A syslog server reachable over the network
- The console of all Web clients connected to the XIAO hosted Web server.
- All MQTT clients subscribed to the appropriate topic of an MQTT server.
Log messages have priority levels assigned to them when created in the code base. Each log facility has a threshold and only displays messages with a priority equal to or greater than the threshold. The ESP32 HAL log messages show the module and line number of its own code. Our version is much less granular: it shows a 3-letter tag identifying the module in which the message originates. Presumably, the message will be detailed enough to make it easy to find the log message in the code. Here is a typical log message when running 07_with_log
.
The time code at the start is the number of milliseconds elapsed between when the XIAO was last rebooted and when the log message was added to the log. The raw milliseconds are converted to an hours:minutes:seconds:milliseconds
format. This is followed by the module tag in square brackets and then the actual message concludes the entry.
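The conversion of the raw millisecond count into a readable time code can be sketched as follows. This is only an illustration in portable C++: the function name formatUptime() is hypothetical, and the exact separators and field widths are assumptions, not necessarily those used in the project.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Convert a millisecond count (such as the value returned by millis())
// into an hours:minutes:seconds.milliseconds string.
// The name, separators and field widths are assumptions for illustration.
std::string formatUptime(uint32_t ms) {
  uint32_t millisPart = ms % 1000;
  uint32_t totalSeconds = ms / 1000;
  uint32_t seconds = totalSeconds % 60;
  uint32_t minutes = (totalSeconds / 60) % 60;
  uint32_t hours = totalSeconds / 3600;
  char buf[24];  // large enough for the biggest 32-bit millisecond count
  std::snprintf(buf, sizeof(buf), "%02u:%02u:%02u.%03u",
                static_cast<unsigned>(hours), static_cast<unsigned>(minutes),
                static_cast<unsigned>(seconds), static_cast<unsigned>(millisPart));
  return std::string(buf);
}
```

The tag in square brackets and the message text would then simply be appended to this prefix.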
When working on version 08_ticker_hdw
, I felt that it would be useful to show the log level of the message if only to help in rationalizing debug and information messages that often appear near each other. Here is the typical log message after implementing this change.
Initially, the full complement of 8 levels as found in the Linux log system was used. But when I decided to display the log level and took into consideration the need to parse these levels in the upcoming command interpreter, I chose to have only three log levels.
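A three-level scheme with per-facility thresholds could look something like the sketch below. The enum, the function names and the numeric ordering (syslog style, where a lower number is a higher priority) are assumptions made for illustration; only the displayed names ERR, inf and dbg appear in this post.

```cpp
#include <cctype>

// Hypothetical three-level scheme; lower value means higher priority.
enum class LogLevel : int { Err = 0, Info = 1, Debug = 2 };

static const char* const levelNames[] = {"ERR", "inf", "dbg"};

const char* levelName(LogLevel level) {
  return levelNames[static_cast<int>(level)];
}

// A facility displays a message only if the message priority is at
// least as high as the facility threshold.
bool shouldDisplay(LogLevel message, LogLevel threshold) {
  return static_cast<int>(message) <= static_cast<int>(threshold);
}

static bool equalsIgnoreCase(const char* a, const char* b) {
  while (*a && *b) {
    if (std::tolower(static_cast<unsigned char>(*a)) !=
        std::tolower(static_cast<unsigned char>(*b))) return false;
    ++a;
    ++b;
  }
  return *a == *b;  // true only if both strings ended together
}

// Parse a level name typed by the user; returns false if unknown.
bool parseLevel(const char* text, LogLevel& out) {
  for (int i = 0; i < 3; ++i) {
    if (equalsIgnoreCase(text, levelNames[i])) {
      out = static_cast<LogLevel>(i);
      return true;
    }
  }
  return false;
}
```

With only three names to recognize, parsing a level in the command interpreter reduces to a short table lookup.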
Here is the declaration of the log tags that identify the source of a log message.
The three letters at the start of the comment to the right of a tag are the string that will be displayed before the message, and the file name that follows shows in which module the log message originated. Another tag was added in 09_with_cmd
.
This log is implemented as a FIFO queue. While most queues refuse to accept new entries when full, the version used will always accept new messages into the queue, deleting older entries when necessary. The objective behind this design is to avoid anything that blocks execution of the primary function of the device. Dropped log messages are not a heavy price to pay. At least that's true for me; I almost exclusively use the log while developing the firmware.
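The "always accept, drop the oldest" behaviour can be illustrated with a small fixed-size ring buffer. This is a minimal sketch in portable C++; the class name OverwritingLog and its methods are hypothetical and not the project's actual implementation.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Fixed-capacity FIFO that never refuses a new entry: when it is full,
// the oldest entry is overwritten. Hypothetical illustration only.
template <std::size_t N>
class OverwritingLog {
 public:
  void push(const std::string& msg) {
    buf_[(head_ + count_) % N] = msg;  // when full this slot is the oldest entry
    if (count_ < N) {
      ++count_;
    } else {
      head_ = (head_ + 1) % N;  // oldest entry was overwritten, move on
    }
  }

  // Copies the oldest entry into out; returns false when the queue is empty.
  bool pop(std::string& out) {
    if (count_ == 0) return false;
    out = buf_[head_];
    head_ = (head_ + 1) % N;
    --count_;
    return true;
  }

  std::size_t size() const { return count_; }

 private:
  std::array<std::string, N> buf_;
  std::size_t head_ = 0;   // index of the oldest entry
  std::size_t count_ = 0;  // number of entries currently stored
};
```

Because push() never waits and never fails, the code that produces a log message cannot be held up by a slow or absent log facility.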
Since the esp32-hal-log
is no longer needed, it is replaced with the logging
module. That means that all logging messages had to be converted to one of four addToLogxxxx()
functions that push the message onto the FIFO queue. Only the most basic function, addToLog(), actually pushes a message onto the queue. All the others "massage the message" and pass the result on to it.
Note how the two arguments before the message are the log level and tag parameters discussed above. The second function, addToLogf()
, takes a format string containing text and instructions on how to display the value of the variable number of arguments that follow. It is similar to the standard C function printf()
. The third function, addToLogP()
is the same as the first except that the message is stored in flash memory instead of RAM. The message string is stored into flash memory when it is declared in the code with the PSTR() macro. In addToLogP()
, the message is copied from flash to RAM with strcpy_P()
. As far as I know, this use of PROGMEM is not truly useful with ESP32 processors (pgmspace.h contains nothing but typecasts) but I have kept it for backward compatibility with ESP8266. The final addToLogPf()
function is just a combination of the previous two. In practice, the last two functions are the most used.
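The relationship between the basic function and the formatting variant can be sketched as below. The signatures, the LogEntry structure, the buffer size and the use of std::deque are assumptions made for this portable illustration; the real functions take the project's own level and tag types.

```cpp
#include <cstdarg>
#include <cstdio>
#include <deque>
#include <string>

// Hypothetical queue entry: level, 3-letter tag and message text.
struct LogEntry {
  int level;
  std::string tag;
  std::string message;
};

static std::deque<LogEntry> logQueue;  // stand-in for the log FIFO

// The basic function: the only one that touches the queue.
void addToLog(int level, const char* tag, const char* message) {
  logQueue.push_back(LogEntry{level, tag, message});
}

// printf-like variant: formats into a local buffer, then delegates.
void addToLogf(int level, const char* tag, const char* format, ...) {
  char buffer[128];  // messages longer than this are truncated
  va_list args;
  va_start(args, format);
  std::vsnprintf(buffer, sizeof(buffer), format, args);
  va_end(args);
  addToLog(level, tag, buffer);
}
```

A call such as addToLogf(1, "WIF", "connected after %d ms", 1500) ends up as a single ready-to-send entry in the queue.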
The function int sendToLog(void)
pulls the oldest entry from the queue and sends it out to the various facilities, each of which displays the message if its log level is equal to or higher than that facility's log level threshold. One of the advantages of using a FIFO queue to buffer messages is that messages from within an interrupt service routine and in setup()
can be logged even before the serial port is up. The actual sending of log messages is done by safely calling sendToLog()
in the loop()
thread. There is one other function whose purpose will become clear in the next section.
The only other change to the log utility is done in 10_with_config
, when a flushLog()
function is added. At that point it seemed appropriate to ensure that all the log messages be sent out to the various log facilities just before resetting the XIAO.
Web Console
The addition of an HTML page that displays the private log messages in the Wi-Fi switch Web interface is an important step. As can be seen from the screen shot below, that new page can be opened by clicking on the new button added to the root page in 07_with_log
.
Adding the button to the root page was a simple matter of adding another form under the existing toggle
form in the html_index[]
string containing the HTML source code of the root page.
Here is the HTML code of the new page.
The console window is a text-area
with the log
id and whose content is defined as the placeholder %LOG%
. As with the sensor data, the placeholder will be replaced with current data by the template processor when the page is served to a web client, but in this instance the data will be the string returned by logHistory()
. The script at the end of the web page is similar to what was done previously in the index page, but here there's only one important event listener added. It appends the text it receives to the content of the console window text-area. The event identifier is 'logvalue' which is the identifier used by sendToLog()
when it fires a server-sent event when broadcasting log messages.
Of course, a handler for requests for the /log
page has to be added to the web server. The details have been moved out of the setup()
function in main.cpp
into a weberserversetup()
function in a new file named webserver.cpp
.
That new file also contains the template processor with the additional substitution line discussed above.
It would be best to report that this works flawlessly, but that is not the case. There is an annoying stuttering: a few log messages get printed twice, most don't. So far I have not discovered a pattern that could help pinpoint the reason, but I think that the log FIFO cannot be blamed as the problem does not occur in other log facilities. It will be seen in future versions that I have tried to find the cause for this behaviour, but to no avail.
Non Blocking Drivers
Decoupling the creation of log messages from displaying them in the various log facilities does improve the stability of the firmware, but logging was never a major source of blocking behaviour except in one instance. In trying to find out why there were problems in the handling of HTTP POST requests, I added a call to logFlush
after each message was sent to the serial port. That trick did help in pinpointing where the problem occurred, but it interfered greatly with the following log facilities, notably in the Web consoles, so it was commented out very quickly.
In previous versions, the setup()
function could fail because of an endless loop in the hardware.cpp
initSensor()
function should the DHT20 temperature sensor become defective. This was a simple error to correct.
At most five attempts are made to initialize the sensor and if that is not sufficient then the hasTempSensor
boolean is set to false
. In that case, no reading of the temperature and humidity sensor will be performed. That takes care of possible blocking behaviour when the XIAO is reset, but there remains the possibility of a malfunction occurring some time after the sensor has been successfully initialized. This should be investigated.
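The bounded retry can be sketched as follows. To keep the example free of hardware dependencies, the sensor driver's initialization call is passed in as a callable named probe; that parameter, and the constant MAX_SENSOR_ATTEMPTS, are hypothetical stand-ins and not part of the project.

```cpp
#include <functional>

constexpr int MAX_SENSOR_ATTEMPTS = 5;  // "at most five attempts"

bool hasTempSensor = false;

// Try to initialize the sensor a limited number of times.
// `probe` stands in for the real driver call and returns true on success.
bool initSensor(const std::function<bool()>& probe) {
  hasTempSensor = false;
  for (int attempt = 1; attempt <= MAX_SENSOR_ATTEMPTS; ++attempt) {
    if (probe()) {
      hasTempSensor = true;
      break;
    }
  }
  return hasTempSensor;  // false: skip all temperature and humidity readings
}
```

The rest of the firmware then only needs to test hasTempSensor before attempting a reading.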
As for the light sensor, there is no device initialization as such and getting data is just taking an analog voltage measurement. Reading the input pin will always yield a value which might be spurious if the sensor is absent or broken but which will never result in a blocking infinite loop.
With the hardware sensors more or less covered, let's move on to other sources of blocking behaviour. Most demonstration programs show a rather simple-minded approach to connecting to the wireless network: either going into a possibly endless loop, or simply restarting if the connection could not be made soon enough, which in itself would be another endless loop unless some sort of Wi-Fi manager is added. Before version 07_with_log
this project was no better.
The setup()
function will never be completed if the connection to the network cannot be made for whatever reason and consequently the Wi-Fi button will be inoperable. As stated before, the primary function of the device will be as a switch and it must "always" work in manual mode and not depend on the local area network. The approach in the setup()
function in 07_with_log
is totally different.
There are numerous small modifications here, but the important thing is that the Wi-Fi connection is started with WiFi.begin()
, but there is no waiting for the connection to be established. We just have to ensure that the rest of the firmware can handle the absence of a Wi-Fi connection. For example, the sendHttpRequest()
function in domoticz.cpp
now exits at the very start if there is no Wi-Fi connection.
Along the same lines, the WiFiModule()
is introduced. It checks if the state of the Wi-Fi connection has changed and it updates a flag named wifiConnected
and the system time at which the change occurred in connectionTime
in response and finally logs the change. In a later version, it will have some added responsibility, but initially this is a very simple subroutine.
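The change detection performed by such a routine can be sketched like this. In the firmware the function takes no arguments and queries the Wi-Fi stack and the system clock itself; here both values are passed in as parameters (an assumption made so the sketch runs anywhere), and logLines is a hypothetical stand-in for the log queue.

```cpp
#include <cstdint>
#include <string>
#include <vector>

bool wifiConnected = false;         // last known connection state
uint32_t connectionTime = 0;        // time of the last state change, in ms
std::vector<std::string> logLines;  // stand-in for the log queue

// connectedNow: current state as reported by the Wi-Fi stack (assumed input)
// nowMs: current system time in milliseconds (assumed input)
void WiFiModule(bool connectedNow, uint32_t nowMs) {
  if (connectedNow == wifiConnected) return;  // nothing changed, nothing to do
  wifiConnected = connectedNow;
  connectionTime = nowMs;
  logLines.push_back(connectedNow ? "WiFi connected" : "WiFi disconnected");
}
```

Since the routine returns immediately when nothing has changed, it can be called on every pass through loop() at negligible cost.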
Incidentally, the two second delay waiting for the serial port to come up has not been removed from setup(). Why it is still needed after all that work queuing log messages is not initially obvious. Messages accumulate in the log queue throughout the execution of setup(), from even before the Serial port is opened. Opening the port is non-blocking so it returns immediately and, in the absence of a delay, the firmware completes the setup and enters loop() before the Serial port is functioning. So the sendToLog() function, which is one of the few things done in loop(), would be popping messages from the queue and sending them to a non-operating Serial port that would simply ignore them. Hence the two second delay that has so far proven long enough to ensure that the Serial port is up by the time the loop() function starts.
Even with these improvements, there remained an egregious latency problem when updating Domoticz. At times, the XIAO responded so slowly to button presses that it felt like it was ignoring the button press. And it was, but with flawless logic, I attacked the symptom instead of the cause. I reasoned that the latency would disappear by checking the state of the I/O pin connected to the push button at fixed regular intervals using a timer interrupt. This was easy to accomplish by removing the call to checkHardware
from the loop()
and attaching it to a ticker
in the initHardware
in hardware.cpp
.
Generic Arduino ticker libraries such as SimpleTicker and Ticker poll the system tick counter and need to be "pumped" in the loop()
function. Using something like that would not change anything. However the ESP32 Ticker library creates objects that use hardware timers and, because they trigger on the underlying timer interrupts, they should work independently of the loop
thread. That was the idea, but it did not work as planned.
In my defence, the source of the problem became much clearer when I set up a separate wireless network and a second Domoticz instance on that system independent from my home automation system. It then became possible to start and stop the Wi-Fi network or the Domoticz server at will. Using this test environment and improving the log messages showed that the true source of the problem lay in HTTP requests made when Domoticz was offline or when the URL to the server was incorrect. The request would block until a timeout is reached and, crucially, the Ticker
hardware timer used to monitor the state of the push button is disabled during the network operation. Perhaps in dual core ESP32 the Wi-Fi operations and the hardware timers can be run on separate cores, but the point is moot when it comes to the single core ESP32-C3 (ESP32-S2 and ESP8266 for that matter).
I assumed at that point that the solution was to make each HTTP request in a separate thread or task. A search for an asynchronous HTTP request library yielded a couple of results, but unfortunately, I could not get them to work. I am not confident enough in my work to report on the source of this failure which could be within the relatively new ESP32 Arduino core for the ESP32-C3, within the async HTTP libraries themselves, or, more likely, the result of my incompetence. So the immediate measure adopted in 07_with_log
was to set the HTTP request timeout to the lowest possible value. There is a CONNECT_TIMEOUT
macro in domoticz_data.h
(in the template domoticz_data.h.template
in the distribution) which makes it easier to set that value. The default value is 5 seconds (5000 ms), the lowest possible value is 1 second (1000 ms).
Needing to get on with the project, I tried another mitigation strategy in 08_ticker_hdw
based on the same FIFO queue used for the logging system. This time, the various updateDomoticzXXXX()
commands in domoticz.cpp
do not call sendHttpRequest(String url)
directly. Instead the complete request URL is stored in a FIFO queue called urlRing
. A new function, sendRequest()
, pulls the oldest URL in the ring and it has sendHttpRequest()
take care of the details. The new function is called in the main.cpp
loop()
function. To my surprise this worked: the relay could instantly be turned on or off repeatedly with the push button even when Domoticz was offline and the HTTP request timeout was stretched to 10 seconds. My, perhaps invalid, explanation for this success is that I was wrong. The HTTP request does not implicitly block all other running tasks including the hardware timers, but only the task in which the request is running. In the original version, updateDomoticzXXXX()
which called sendHttpRequest()
ran within the Ticker
timer task. So until the request completed correctly or timed out, the timer was disabled. By using the ring buffer, two tasks are being run in parallel and almost independently. The sendHttpRequest()
still blocks its task but that is now the main loop()
task leaving Ticker
able to run and respond to push button activity and read the sensors on time. Well, that's my story and until told otherwise, I'll stick to it.
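The producer and consumer arrangement described above can be sketched as follows. Everything here is a simplified assumption: the UrlRing class, its capacity, the mutex (added because the two sides run in different tasks), the updateDomoticzSwitch() helper and the illustrative URL path are not taken from the project, and the blocking HTTP call is passed in as a callable so the sketch runs without a network.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <string>

// Bounded queue of request URLs; the oldest is dropped when full.
class UrlRing {
 public:
  explicit UrlRing(std::size_t capacity) : capacity_(capacity) {}

  void push(const std::string& url) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (capacity_ == 0) return;
    if (queue_.size() == capacity_) queue_.pop_front();
    queue_.push_back(url);
  }

  bool pop(std::string& url) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (queue_.empty()) return false;
    url = queue_.front();
    queue_.pop_front();
    return true;
  }

 private:
  std::size_t capacity_;
  std::deque<std::string> queue_;
  std::mutex mutex_;
};

UrlRing urlRing(8);

// Producer, called from the timer task: only queues the URL, never blocks.
// The URL path is illustrative only.
void updateDomoticzSwitch(int idx, bool on) {
  urlRing.push("/json.htm?type=command&param=switchlight&idx=" +
               std::to_string(idx) + "&switchcmd=" + (on ? "On" : "Off"));
}

// Consumer, called from loop(): sends at most one queued request per call.
// `sendHttpRequest` stands in for the real, possibly blocking, function.
bool sendRequest(const std::function<void(const std::string&)>& sendHttpRequest) {
  std::string url;
  if (!urlRing.pop(url)) return false;
  sendHttpRequest(url);  // may block, but only the task running loop()
  return true;
}
```

The point of the design is that the timer task never waits on the network; whatever time the HTTP request takes is spent in the loop() task only.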
Another notable change was made in 08_ticker_hdw
: the return of XMLHttpRequest. In the previous version, the relay was turned on or off from the Web interface with a form button that would perform the /toggle
action with the GET
method. In plain English, when the button was activated a GET
HTTP request with a page named /toggle
would be sent to the web server. The latter would respond by toggling the LED and then sending back the index HTML page. It worked, a little too well. If the index page was opened in more than one client, one could get a very entertaining flashing LED display as pages were being updated with server-sent events. Perhaps that could have been eliminated by using the POST
method, but I ran into problems with that. The solution was to use an XMLHttpRequest that does not require a response from the web server.
Version 08_ticker_hdw
of the firmware thus marks a milestone because of the improvements described above. It is also where the hardware abstraction is finalized. The header file hardware.h
no longer contains references to I/O pin numbers which are moved into hardware.cpp
. Presumably from that point on, there will never be a need to change the header file when different hardware is used or the connections are modified. Also names have been changed. For example String ledStatus
becomes String RelayState
. The LED was used as a proxy for a relay, but using that name would be confusing if later a visual feedback LED is added to the project. Similarly String Light
is renamed String Brightness
to distinguish between the relay controlling a light and the sensor measuring the ambient light or brightness.
Adding a Command Interpreter
The main raison d'être for commands is to allow run-time modification of settings such as the time between updates of sensor data or the name of the Wi-Fi network. To that end, the Domoticz data, stored in domoticz_data.h
, was added to the config_t
structure in config.h
and more fields were added. Here is the augmented configuration header file.
Of course, the default values for the added fields were added in user_config.h
. As before the content of that file will need to be adjusted to the local situation, user_config.h.template
can be used as a guide. One would think that the content of secrets.h
would also be moved into the configuration structure. It was not done, because at this juncture, I still wanted to use the built-in Wi-Fi credential management capabilities of the ESP. More on that later.
In 09_with_cmd
, a command is sent to the interpreter either from a terminal connected to the serial port of the XIAO or through an input box on the Web interface. Sending commands through MQTT messages will be implemented much later. Commands are not silent; most output some information. Indeed, the default behaviour of most commands when invoked with no parameters is to display the current value of a configuration setting or the current status of some aspect of the Wi-Fi switch. Displaying values actually means sending a log message to all the logging facilities as explained above. Here is an example of a command sent from the Web console.
And here is the log output from the command interpreter as seen in the Web console or on a terminal connected to the serial port.
This warrants a few observations. First, a command is actually a list of semicolon separated commands. I deliberately wanted to avoid an explicit Backlog
type command as found in Tasmota. There is a help
command which lists all available commands. It also takes an optional single parameter, a command name, which if included will display the syntax of that particular command as seen in the help config
above. Not all commands have been implemented, config
is among those that will be completed later. The log level of all the messages sent by the command interpreter is inf
, so setting the threshold level of a log facility to ERR
avoids filling that facility with probably unnecessary messages. On the other hand, the serial and Web logs should have a threhold level of inf
or dbg
if they are used as the source to send commands, otherwise the interactive aspect of the interpreter will not be very visible.
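Splitting an input line into its semicolon separated commands could be done along these lines. The function names splitCommands() and trim() are hypothetical, and skipping empty commands is an assumption about the desired behaviour rather than a description of the project's parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Remove leading and trailing white space.
static std::string trim(const std::string& s) {
  const char* ws = " \t\r\n";
  std::size_t first = s.find_first_not_of(ws);
  if (first == std::string::npos) return "";
  std::size_t last = s.find_last_not_of(ws);
  return s.substr(first, last - first + 1);
}

// Split a line such as "help config; version" into individual commands.
// Empty commands (as in "a;;b" or a trailing ';') are skipped.
std::vector<std::string> splitCommands(const std::string& line) {
  std::vector<std::string> commands;
  std::size_t start = 0;
  while (start <= line.size()) {
    std::size_t end = line.find(';', start);
    if (end == std::string::npos) end = line.size();
    std::string cmd = trim(line.substr(start, end - start));
    if (!cmd.empty()) commands.push_back(cmd);
    start = end + 1;
  }
  return commands;
}
```

Each resulting string can then be handed in turn to the function that interprets a single command, which is what makes a separate Backlog style command unnecessary.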
In principle, the command interpreter is very simple. Here is the header file.
Let's look at how the command string is created and sent on to the doCommand
function. A function called inputModule()
in main.cpp
builds the command string, one character at a time read from the serial input. When a line feed is entered by pressing on the Return key, then the stringComplete
boolean is set true and the module will send the input string to the command interpreter and then clear the input string to prepare for the next command.
The only supported editing is the backspace key. It would be too much work to handle the cursor keys and overwrite and insert modes and so on. Changes to the platformio.ini
configuration file are required. Two additional [env]
values are needed so that the user input can be visible in the monitor window and the line feed character is sent when the Return key is pressed.
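The character by character construction of the command string, including the backspace handling, can be sketched as below. This is a portable illustration only: the real inputModule() reads from the serial port, whereas here characters are fed in through a hypothetical inputChar() function and the interpreter is passed in as a callable.

```cpp
#include <functional>
#include <string>

std::string inputString;      // command being typed
bool stringComplete = false;  // set when a line feed is received

// Process one received character.
void inputChar(char c) {
  if (c == '\n') {
    stringComplete = true;
  } else if (c == '\b' || c == 0x7F) {  // backspace or delete
    if (!inputString.empty()) inputString.pop_back();
  } else if (c >= ' ') {                // printable ASCII only
    inputString += c;
  }
}

// If a full line is available, hand it to the interpreter and reset.
// `doCommand` stands in for the real command interpreter entry point.
bool inputModule(const std::function<void(const std::string&)>& doCommand) {
  if (!stringComplete) return false;
  doCommand(inputString);
  inputString.clear();
  stringComplete = false;
  return true;
}
```

Typing "hekp", two backspaces, "lp" and Return would thus deliver "help" to the interpreter.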
When using the Web console to send a command, the user enters it in an input field of a form. A "change" event listener is attached to the input element. When the event is triggered, it performs the sndCmd()
function which sends the command string as a query using an XMLHttpRequest
.
I would have preferred using a POST
request, but I have had problems handling such HTTP requests with the ESP32-C3. The web server handles these requests by handing the command to the doCommand()
function.
A later version will probably use a POST instead of a GET request. For one thing, I finally found a way of handling such requests that does not cause problems. For another, the ESP32 Arduino core is constantly improved and the real source of the problem may have already been identified and a solution already proposed.
The faint of heart should not look at the code in commands.cpp
, it is a mess. There are 8 commands that take parameters in this first version of the interpreter. The parameters of some commands are quite similar, yet it looks like a different parser was built for each command. That is not far from the truth as I struggle to find the best way to go about parsing the parameters. If I settle on the best approach, it will make sense to use it everywhere and look into the possibility of refactoring the code.
There are 13 commands: config(*), dmtz, help, idx, log, mqtt(*), name, restart(*), staip, syslog, time, topic(*), version
but the four marked with an asterisk basically do nothing and report some sort of error as seen above. It is important to point out that the number of commands, their names and their parameters are subject to change and they will be modified as the project is developed. The next post in this series, Part 4 - Commands, documents all the commands, but it will always be kept up to date with the latest firmware. Currently, it corresponds to version 0.0.8 of the firmware as found in 12_with_mqtt
.
Managing the User Configuration
While the previous iteration of the firmware added a user modifiable configuration along with a command line interpreter, it lacked essential capabilities needed to implement a useful configuration module. All the configuration data was held in RAM which meant that any change to the default values would be lost after a reboot. Not surprisingly, the main changes in 10_with_config
are found in the config
module. The header file contains three function declarations that handle storage of the configuration data.
The purpose of the useDefaultConfig()
should be self explanatory. When invoked it will return all configuration settings to the values defined in user_config.h
. The function saveConfig()
will save the current configuration to non volatile storage (NVS). If the function is invoked without a parameter or if the force
parameter is set to false
, then saveConfig()
only saves the configuration if it has been changed. This is determined by comparing the checksum of the current configuration with that of the configuration in NVS. As expected if force
is set to true
, then the current configuration is saved to NVS no matter if the current configuration checksum is the same or different than that of the saved configuration.
The loadConfig()
function which previously existed now behaves differently. When invoked, it now tries to read the last saved configuration using the loadConfigFromNVS()
function. The latter reads the raw data from storage and then performs three checks on the data. First, it verifies that the size of the data read from NVS corresponds to the size of the config_t
structure. If that test is passed, then it verifies that the configuration version number corresponds to the expected version number. If that second test is passed, then the checksum of the raw data is verified. If all three tests pass, then loadConfigFromNVS()
returns true
and loadConfig()
does nothing else. If any test fails then loadConfigFromNVS()
returns false
and in that case, loadConfig()
will load the default configuration.
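The save and load logic described above can be sketched in portable C++ as follows. The fields of config_t, the default values, the checksum algorithm and the in-memory fakeNvs vector standing in for real non-volatile storage are all assumptions made for illustration; only the function names come from the text.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr uint32_t CONFIG_VERSION = 1;  // hypothetical version number

struct config_t {            // hypothetical, much reduced structure
  uint32_t version;
  char hostname[16];
  uint32_t sensorInterval;
  uint32_t checksum;         // computed over all the fields before it
};

static std::vector<uint8_t> fakeNvs;  // stand-in for non-volatile storage
config_t config;

uint32_t computeChecksum(const config_t& c) {
  const uint8_t* p = reinterpret_cast<const uint8_t*>(&c);
  uint32_t sum = 0;
  for (std::size_t i = 0; i < offsetof(config_t, checksum); ++i) sum = sum * 31 + p[i];
  return sum;
}

void useDefaultConfig() {
  std::memset(&config, 0, sizeof(config));
  config.version = CONFIG_VERSION;
  std::strncpy(config.hostname, "wifi-switch", sizeof(config.hostname) - 1);
  config.sensorInterval = 60000;
  config.checksum = computeChecksum(config);
}

// Returns true if the configuration was actually written.
bool saveConfig(bool force = false) {
  uint32_t current = computeChecksum(config);
  if (!force && fakeNvs.size() == sizeof(config_t)) {
    config_t stored;
    std::memcpy(&stored, fakeNvs.data(), sizeof(stored));
    if (stored.checksum == current) return false;  // unchanged, skip the write
  }
  config.checksum = current;
  const uint8_t* p = reinterpret_cast<const uint8_t*>(&config);
  fakeNvs.assign(p, p + sizeof(config));
  return true;
}

bool loadConfigFromNVS() {
  if (fakeNvs.size() != sizeof(config_t)) return false;                // 1: size
  config_t candidate;
  std::memcpy(&candidate, fakeNvs.data(), sizeof(candidate));
  if (candidate.version != CONFIG_VERSION) return false;               // 2: version
  if (candidate.checksum != computeChecksum(candidate)) return false;  // 3: checksum
  config = candidate;
  return true;
}

void loadConfig() {
  if (!loadConfigFromNVS()) useDefaultConfig();
}
```

Skipping the write when the checksum is unchanged avoids needless wear of the flash memory, while the three checks on loading protect against stale or corrupted data after a firmware update.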
Note how the configuration is not saved on every change. Instead I opted for an autosave feature, meaning that by default the configuration will be saved to NVS, if modified, when the ESP is shut down. My first implementation of this was to set up a shutdown handler and register it with the esp_register_shutdown_handler
function in setup()
. However, the radio is turned off by the time the shutdown handler is called so that it was impossible to ensure that the log had been fully transmitted. Consequently an explicit espRestart()
function was created instead.
The autoSave flag is set to true when the configuration is loaded, hence the default behaviour. The flag can be set to false
with the config auto off
command. That meant that it would be possible to select what the big red Web interface Restart
button did with respect to saving the configuration. It is also possible to restart the XIAO without saving the configuration with the restart 2
command. In retrospect I found this needlessly complex; if the user uses the command interpreter to disable automatic configuration saving and then clicks on the restart button, it would actually be faster to enter the restart 2
command. In a future version the autoSave flag will be removed.
Right now I would suggest that any meaningful change to the user settings be explicitly saved with the config save
command. This is because any changes to the configuration will be lost if there is a panic restart of the ESP because of an exception or a power outage.
Improved Wi-Fi Handling
Network handling was much improved in 10_with_config
, but it was not a straightforward exercise. In version 0.0.5, the initial commit of 10_with_config
was built on the premise that it would be possible to disconnect from one Wi-Fi network and connect to another without restarting the ESP. It would not be necessary to save the Wi-Fi credentials in the user managed settings, the ESP would handle that on its own. Similarly I wanted to be able to switch from a dynamic, DHCP assigned, IP address to a fixed IP address with a single staip
command without restarting the ESP. Here was my working mechanism for accomplishing this when the staip
command was used to change the station IP address.
- configure the Wi-Fi station to use a DHCP assigned address
- disconnect the Wi-Fi station
- Wi-Fi auto reconnect would take care of reconnecting to the network
- when the WiFiModule() noticed that Wi-Fi was reconnected, it would test for a static user set IP address.
  - if no, then there was nothing else to do.
  - if yes, then it would test if the static IP address and gateway address were on the same subnet as the DHCP assigned address
    - if not, then it would log an error message and do nothing else
    - if yes, then it would reconfigure the Wi-Fi station to use the static IP address
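The subnet test in the steps above amounts to comparing addresses after applying the subnet mask. Here is a minimal sketch with IPv4 addresses held as 32-bit integers; the helper names ip4(), sameSubnet() and staticConfigIsPlausible() are hypothetical.

```cpp
#include <cstdint>

// Pack four octets into a 32-bit value, most significant octet first.
constexpr uint32_t ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
  return (static_cast<uint32_t>(a) << 24) | (static_cast<uint32_t>(b) << 16) |
         (static_cast<uint32_t>(c) << 8) | static_cast<uint32_t>(d);
}

// Two addresses are on the same subnet if their network parts match.
constexpr bool sameSubnet(uint32_t a, uint32_t b, uint32_t mask) {
  return (a & mask) == (b & mask);
}

// The static address is only worth applying if both it and the gateway
// belong to the subnet of the address obtained from the DHCP server.
constexpr bool staticConfigIsPlausible(uint32_t dhcpIp, uint32_t mask,
                                       uint32_t staticIp, uint32_t gateway) {
  return sameSubnet(staticIp, dhcpIp, mask) && sameSubnet(gateway, dhcpIp, mask);
}
```

For example, with a DHCP address of 192.168.1.23 and a mask of 255.255.255.0, a static address of 192.168.1.76 with gateway 192.168.1.1 passes the test while 192.168.2.76 does not.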
I liked this approach because it would have protected me from stupid mistakes such as entering a bad static IP address not available on the Wi-Fi network and hence losing contact with the ESP. Unfortunately, I hit a wall because it would work as long as the command was issued from the serial terminal, but if the command was entered in the Web console, the ESP would crash and on restart it would not know what to do. After struggling for days with this, I decided it was time to move on to plan B, which actually meant reverting to things done in the past.
The Wi-Fi credentials were added to the user settings in version 0.0.6 of 10_with_config
. A wifi [-d] | [<ssid> [<pswd>]]
command was added. That command and staip
merely update the configuration; they do not disconnect Wi-Fi or anything. When the ESP is restarted, it will use the (presumably) saved configuration settings to establish the Wi-Fi connection. Consequently,
- staip -x; wifi NEW_NETWORK_SSID NEW_NETWORK_PASSWORD; restart
  will restart the device and obtain a dynamic IP address once connected to the specified Wi-Fi network.
- staip 192.168.1.76 192.168.1.1 255.255.255.0; wifi NEW_NETWORK_SSID NEW_NETWORK_PASSWORD; restart
  will restart the device which will then attempt to connect to the specified Wi-Fi network and set the specified static IP address.
This works no matter the source of the command. However, when connecting, no verification is made to ensure that the specified IP address and gateway are valid. It will be necessary to go ahead and add a Wi-Fi manager of some sort in case the Wi-Fi credentials are wrong. Some sort of mechanism will need to take care of an incorrectly specified static IP address, gateway or subnet mask. At the time of writing this, I know that a very long button press will be used to clear the static IP address and the Wi-Fi credentials so that on restart a Wi-Fi manager will come up and the mistake can be corrected in the Web interface. Of course that requires physical access to the device push-button, which is not a problem for a Wi-Fi switch meant to be physically activated, but which would be a problem if the Wi-Fi switch was not used as a mechanical device at all. The solution there would be to start the Wi-Fi manager after the gateway refuses to acknowledge pings for a specified time. This is something I have already done in Asynchronous Ping for an ESP8266 Internet Watchdog.
Finally, over-the-air firmware updates are handled with the AsyncElegantOTA library by Ayush Sharma. It only requires a few additional lines of code in webserver.cpp
.
At this point, the development of the firmware is nearing completion. The two major additions still required are the Wi-Fi manager to start a Wi-Fi access point that will handle setting up the client Wi-Fi connection and supporting MQTT.