There have been unwarranted reboots of the router by the original router watchdog. The exact source of the problem is not entirely clear. However, looking over the code, it struck me that using a blocking (or synchronous) ping library to check the status of the Internet connection was not the proper approach. In this post, I will present an example project that uses the AsyncPing library by akaJes to monitor whether the Internet can be reached. Just about the same code will be used to patch the router watchdog.
The starting point for my example PlatformIO project is the ping_interval.ino example by akaJes. By luck, that sketch does almost the same thing as needed for an Internet watchdog. At regular intervals it sends out 3 pings to 3 sites. These are non-blocking operations; as soon as the ICMP requests are sent, control returns to the normal loop() function without waiting for replies from the pinged sites. When ping replies do come in, program flow is interrupted to report which site replied.
In the Internet watchdog, only a single ping is sent out at regular intervals. The host or target of the ping request is chosen in round-robin fashion. The absence of a reply from a single target host will not be construed as a network failure. Instead the Internet is deemed down when no reply has been received from any of the target hosts for a specified number of seconds. It seemed like a good idea to collect some statistics about the number of replies compared to the number of ping requests sent out. For debugging and information purposes, a report of these statistics is printed and targets that do not reply reliably are identified.
The example program also includes a user ping function because the router watchdog had such a function mostly for the purpose of debugging and checking that the ping targets were correctly specified.
#include <Arduino.h>
#include <ESP8266WiFi.h>
#include <Ticker.h>
#include "AsyncPing.h" // https://github.com/akaJes/AsyncPing
#define WIFI_SSID "a_SSID" // Wi-Fi network SSID (network name)
#define WIFI_PSK "a_psk" // Wi-Fi network pre-shared key (password)
#define INTERNET_LOST 300 // seconds since last received ping before the Internet connection is deemed lost
#define PING_INTERVAL 10 // seconds between ping requests
#define REPORT_INTERVAL 120 // seconds between report on ping sent and received counts
#define PING_SAMPLE_SIZE 10 // minimum ping count for testing for targets that don't reliably reply, should be 10+
#define UNRELIABLE 7 // target unreliable if response rate is less than UNRELIABLE/10
#define RESET_COUNTER 90000 // ping sent and received counts reset after RESET_COUNTER requests sent to all targets
#define TARGET_COUNT 3 // number of ping targets
// target hosts that will be pinged on a regular basis, can be identified by host name or IP address
const char* pingHosts[TARGET_COUNT] = {"google.com", "bringggx.carbs", "8.8.8.8"}; // with one bad host name to test setupTargets()
/*-------------------------------------------------------------------------------------*/
IPAddress targets[TARGET_COUNT]; // array of valid IP addresses of ping targets
int pingSentCount[TARGET_COUNT]; // number of pings sent to each target
int pingRcvdCount[TARGET_COUNT]; // number of replies received from each target
int hostsIndex[TARGET_COUNT]; // reverse index from last 3 arrays to pingHosts array
int targetCount = 0; // number of valid IP addresses in targets array
int pingIndex = 0; // index of next target to ping
unsigned long lastValidPing = 0; // the last time a ping was received from a target
AsyncPing targetPinger; // object to send successive pings to the target sites
AsyncPing userPinger; // object to send ping request to a user specified host
Ticker pingTimer; // object to time the sending of pings to target sites
Ticker reportTimer; // object to time reporting on the status of sent ping requests
// print statistics about sent and received ICMP packets and warn about unreliable targets
void reportTargetStatus(void) {
  if (targetCount < 1) return;
  Serial.printf("\n%lu: Ping target : received / sent counts\n", millis());
  for (int k=0; k<targetCount; k++) {
    bool unreliable = ( (pingSentCount[k] > PING_SAMPLE_SIZE) && (pingRcvdCount[k] < (int) ((UNRELIABLE*pingSentCount[k])/10)) );
    Serial.printf(" %s : %d / %d%s\n", pingHosts[hostsIndex[k]], pingRcvdCount[k], pingSentCount[k], (unreliable) ? " *** WARNING: unreliable target ***" : "");
  }
  Serial.println();
}
// reset the send and receive statistics
void resetTargetStatus(void) {
  pingIndex = 0;
  memset(pingSentCount, 0, sizeof(pingSentCount));
  memset(pingRcvdCount, 0, sizeof(pingRcvdCount));
}
// convert the pingHosts host names and IP addresses to IPAddress objects
void setupTargets(void) {
  targetCount = 0;
  for (int i = 0; i < TARGET_COUNT; i++) {
    if (WiFi.hostByName(pingHosts[i], targets[targetCount])) {
      hostsIndex[targetCount] = i;
      targetCount++;
    } else {
      Serial.printf("\"%s\" is not a valid host name or Ip address\n", pingHosts[i]);
    }
  }
  resetTargetStatus();
}
// send a ping request to the next valid target IP address and increment its sent statistic
void sendTargetPing() {
  if (targetCount < 1) return;
  if ((pingIndex == 0) && (pingSentCount[0] > RESET_COUNTER)) resetTargetStatus();
  Serial.printf("%lu: Sending ping to target[%d] %s\n", millis(), pingIndex, pingHosts[hostsIndex[pingIndex]]);
  targetPinger.begin(targets[pingIndex], 1, 5000); // 1 ping, timeout in 5 seconds
  pingSentCount[pingIndex]++;
  pingIndex = (pingIndex+1)%targetCount;
}
// function that is called when a ping reply arrives from one of the target hosts
bool targetPingerCallback(const AsyncPingResponse& response) {
  if (response.answer) {
    for (int j = 0; j < targetCount; j++) {
      if (response.addr == targets[j]) {
        Serial.printf("%lu: Ping reply from target[%d] %s received\n", millis(), j, pingHosts[hostsIndex[j]]);
        // if (millis() % 2 == 0) // remove leading // to test unreliable target report
        pingRcvdCount[j]++;
        break;
      }
    }
    lastValidPing = millis(); // add leading // to test ping failure
  }
  return true; // done
}
// send a ping request to a specific host. ipaddress can be a host name or an IP address
void sendUserPing(const char* ipaddress, u8_t count = 3, u32_t timeout = 1000) {
  IPAddress ip;
  if (WiFi.hostByName(ipaddress, ip)) {
    Serial.printf("%lu: Sending ping to %s (%s)\n", millis(), ipaddress, ip.toString().c_str());
    userPinger.begin(ip, count, timeout); // 3 pings, timeout 1000 these are the default values
  } else {
    Serial.printf("%lu: Could not create valid IP address for %s\n", millis(), ipaddress);
  }
}
// function that is called when a ping reply arrives from the user specified host
bool userPingerRecvCallback(const AsyncPingResponse& response) {
  IPAddress addr(response.addr); // local copy, because toString() is not const in 2.3.0
  if (response.answer)
    Serial.printf("%lu: %d bytes from %s: icmp_seq=%d ttl=%d time=%lu ms\n", millis(), response.size, addr.toString().c_str(), response.icmp_seq, response.ttl, response.time);
  else
    Serial.printf("%lu: no reply yet from %s icmp_seq=%d\n", millis(), addr.toString().c_str(), response.icmp_seq);
  return false; // do not stop
}
// function that is called when the user ping request is completed or times out
bool userPingerFinalCallback(const AsyncPingResponse& response) {
  IPAddress addr(response.addr); // local copy, because toString() is not const in 2.3.0
  Serial.printf("%lu: %d pings sent to %s, %d received, time: %lu ms\n", millis(), response.total_sent, addr.toString().c_str(), response.total_recv, response.total_time);
  if (response.mac)
    Serial.printf(" detected eth address " MACSTR "\n",MAC2STR(response.mac->addr));
  Serial.println();
  return true; // done (does not matter)
}
void setup() {
  // setup the serial connection
  Serial.begin(115200);
  while(!Serial) delay(10);
  Serial.println();
  Serial.println();
  // setup the Wi-Fi connection
  WiFi.disconnect(true);
  WiFi.mode(WIFI_STA);
  WiFi.begin(WIFI_SSID, WIFI_PSK);
  Serial.print("Wait for WiFi ");
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }
  Serial.print("\nWiFi connected, IP address: ");
  Serial.print(WiFi.localIP());
  Serial.print(", gateway IP address: ");
  Serial.println(WiFi.gatewayIP());
  Serial.println("\n");
  // initialize the targets[] array of IP addresses based on the given pingHosts
  setupTargets();
  // setup targetPinger, the targets pinger
  targetPinger.on(true, targetPingerCallback);
  // setup userPinger, the one-off ping to a user specified host
  userPinger.on(true, userPingerRecvCallback);
  userPinger.on(false, userPingerFinalCallback);
  // setup the timers that will run target pinging and status reporting in the background
  pingTimer.attach(PING_INTERVAL, sendTargetPing); // send a ping to a target every 10 seconds
  reportTimer.attach(REPORT_INTERVAL, reportTargetStatus); // report the status every two minutes
  // initialize the on-board LED
  pinMode(LED_BUILTIN, OUTPUT);
  digitalWrite(LED_BUILTIN, HIGH); // turn LED off (using a Lolin/Wemos D1 mini for testing)
  Serial.printf("Setup completed with %d ping targets in place\n", targetCount);
  lastValidPing = millis();
  delay(1000);
  Serial.printf("Remaining free mem: %u\n\n", ESP.getFreeHeap());
}
void blinkLED(void) {
  for (int i=0; i<4; i++) { // loop limit should be an even integer
    digitalWrite(LED_BUILTIN, !digitalRead(LED_BUILTIN));
    delay(50);
  }
}
unsigned long lastping = 0; // time of last "user" ping
unsigned long waitTime = 60*1000; // interval before next "user" ping (1 to 2 minutes)
void loop() {
  // report ping failure
  if (millis() - lastValidPing > INTERNET_LOST*1000) {
    Serial.printf("%lu: **** PINGING FAILED **** NO PING IN LAST %d SECONDS ****\n", millis(), INTERNET_LOST);
    lastValidPing = millis(); // restart test
  }
  // try one-off "user" ping
  if (millis() - lastping > waitTime) {
    Serial.printf("\n%lu: Remaining free mem: %u\n", millis(), ESP.getFreeHeap());
    if (millis() % 2 == 0)
      sendUserPing("bing.com"); // default 3 pings sent with 1 second time out
    else
      sendUserPing("192.168.11.118", 2, 500); // no reply expected, 2 pings sent with 1/2 second time out
    lastping = millis();
    waitTime = (60+random(60))*1000;
  }
  delay(2000);
  blinkLED();
}
Scroll down to the loop() function and notice how it does very little work. At each iteration, it checks the time elapsed since the last ping reply was received from a target host and, if it has been too long, it prints out an alert. In addition to logging the situation, the router watchdog will reboot the router at this point. The loop() function also sends a ping to a fourth (valid) or fifth (invalid) host at irregular intervals of 1 to 2 minutes to mimic use of the sendUserPing routine, and blinks the built-in LED every couple of seconds. The interesting bits are done in the background. Let's look at that.
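One aside before leaving loop(): the timeout test subtracts two millis() readings. Because the subtraction is done on unsigned 32-bit values, it should keep working when the millisecond counter wraps around. Here is a minimal sketch of that idea in plain C++, so it can be tried on a desktop. The helper timedOut() is a hypothetical name of mine and is not part of the sketch above.

```cpp
#include <cstdint>

// Hypothetical helper, not in the original sketch.
// Returns true when more than timeoutMs milliseconds separate lastMs from nowMs.
// Unsigned subtraction wraps modulo 2^32, so the difference remains correct
// even if the counter rolled over between the two readings.
bool timedOut(uint32_t nowMs, uint32_t lastMs, uint32_t timeoutMs) {
    return static_cast<uint32_t>(nowMs - lastMs) > timeoutMs;
}
```

With a 32-bit millisecond counter the rollover happens after roughly 49.7 days, so a watchdog that is meant to run for months will eventually go through it.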
Scroll up to the beginning of the sketch. Besides the Arduino library, three libraries are used. The ESP8266WiFi library connects to the wireless network for access to the Internet. The Ticker library runs two timers: one to send out pings at regular intervals (defined by the PING_INTERVAL macro) and one to report the sent and received statistics at longer intervals (defined by the REPORT_INTERVAL macro). In the router watchdog, longer intervals will be set. Finally, the AsyncPing library provides the object that sends and receives ICMP requests and replies.
The INTERNET_LOST macro specifies the number of seconds since the last reply to a ping request before the connection to the Internet is deemed lost. Two macros define the parameters for establishing the reliability of ping targets. The first, PING_SAMPLE_SIZE, is the minimum number of ping requests needed before calculating anything. The second, UNRELIABLE (a value between 1 and 10), is used to calculate the minimum number of replies that should have been received from a reliable target. The function reportTargetStatus() will issue a warning if the percentage of received replies over the number of sent requests is below 10*UNRELIABLE. When the number of ping requests sent to each of the targets exceeds RESET_COUNTER, the sent and received statistics are reset to 0. With the values shown, that means 90000 requests per target with one request sent every 10 seconds to the targets in rotation, so the reset will occur after about 10 days with a single valid target and after about 31 days with three.
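To make the arithmetic behind these macros concrete, here is a small sketch in plain C++ using the same values as the sketch above. The function names isUnreliable() and secondsUntilReset() are my own, introduced only for illustration. The reset estimate is approximate, because the sketch only resets once the count of the first target has gone past RESET_COUNTER.

```cpp
// Values copied from the macros in the sketch
const int  PING_INTERVAL    = 10;     // seconds between ping requests
const int  PING_SAMPLE_SIZE = 10;     // minimum pings before judging a target
const int  UNRELIABLE       = 7;      // warn below UNRELIABLE/10 reply rate
const long RESET_COUNTER    = 90000;  // per-target sent count that triggers a reset

// Hypothetical helper: the same test as in reportTargetStatus()
bool isUnreliable(int sent, int rcvd) {
    return (sent > PING_SAMPLE_SIZE) && (rcvd < (UNRELIABLE * sent) / 10);
}

// Hypothetical helper: approximate seconds before the counters are reset
// when validTargets hosts are pinged in rotation, one ping per interval
long secondsUntilReset(int validTargets) {
    return RESET_COUNTER * validTargets * PING_INTERVAL;
}
```

For instance, with 20 requests sent the threshold is 14 replies, so 13 replies triggers the warning and 14 does not. Dividing secondsUntilReset() by 86400 gives the delay in days.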
The last user-modifiable macro is TARGET_COUNT, which sets the number of target sites to which ping requests will be sent. The array of host names, pingHosts, can contain host names or IP addresses. They should point to leading Internet sites that will reliably respond to ICMP requests. Everything after that should not need to be modified.
Hopefully the reportTargetStatus() and resetTargetStatus() functions have names that clearly show their purpose and are simple enough not to need explanation. The function setupTargets converts the strings in the pingHosts array into IPAddress objects and stores these in the targets array. If a host name cannot be resolved or an IP address is incorrect, that entry is skipped and no IPAddress object is kept for it. At the end, the variable targetCount will contain the number of valid IPAddress objects in targets.
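The way setupTargets packs the valid entries together while remembering where each one came from can be sketched in plain C++. This is only a model: fakeResolve() is a hypothetical stand-in for WiFi.hostByName() that rejects the one deliberately bad name, and the Targets structure is my own invention to hold what the sketch keeps in global arrays.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for WiFi.hostByName(): accepts every name
// except the deliberately bad one used in the sketch
bool fakeResolve(const std::string& host) {
    return host != "bringggx.carbs";
}

// Hypothetical container for what the sketch stores in global arrays
struct Targets {
    std::vector<std::string> resolved;  // plays the role of targets[]
    std::vector<int> hostsIndex;        // reverse index into the host list
};

// Keep only the hosts that resolve, packed at the front, and record
// the original position of each one
Targets setupTargets(const std::vector<std::string>& hosts) {
    Targets t;
    for (int i = 0; i < static_cast<int>(hosts.size()); i++) {
        if (fakeResolve(hosts[i])) {
            t.resolved.push_back(hosts[i]);
            t.hostsIndex.push_back(i);
        }
    }
    return t;
}
```

The reverse index is what lets the report functions print the original host name, even though the bad entry in the middle of the list has been dropped.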
Four objects are created: two AsyncPing "pingers" and two timers. The timer pingTimer repeatedly calls sendTargetPing at regular intervals of PING_INTERVAL seconds. When invoked, the sendTargetPing function instructs targetPinger, an AsyncPing object, to send a ping request to the next target host and returns immediately without waiting for a reply. Again, this is why it is called asynchronous or non-blocking. If targetPinger receives a reply from a host to which it sent a ping request, the targetPingerCallback function will be executed. All it does is update the lastValidPing time and the ping received count for the host that sent the reply.
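The round-robin choice of the next target can be shown on its own, without any networking. The following is a minimal sketch in plain C++; the RoundRobin structure and its pick() method are hypothetical names of mine and only mirror the index handling done in sendTargetPing.

```cpp
#include <vector>

// Hypothetical model of the target rotation in sendTargetPing()
struct RoundRobin {
    std::vector<int> sent;  // plays the role of pingSentCount[]
    int next = 0;           // plays the role of pingIndex

    explicit RoundRobin(int targetCount) : sent(targetCount, 0) {}

    // Returns the index of the target to ping now, counts the request and
    // moves on to the following target. Returns -1 when there is no target,
    // like the early return in the sketch.
    int pick() {
        if (sent.empty()) return -1;
        int k = next;
        sent[k]++;
        next = (next + 1) % static_cast<int>(sent.size());
        return k;
    }
};
```

With two valid targets, three successive calls to pick() give 0, 1, 0, which matches the order of the "Sending ping to target[...]" lines in the serial output.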
The sendUserPing function will be called by the user to send a ping request to a specific host. In this case two call back functions are defined: userPingerRecvCallback, which reports on individual ping replies from the host as they arrive, and userPingerFinalCallback, which will be called when the user-initiated ping request is completed or times out and reports on the ICMP packet sent and received statistics.
For the purpose of this post, there are two important parts to the setup function. The first part assigns the appropriate call back functions as handlers for the two AsyncPing objects. An AsyncPing object has two handlers: _on_recv, called as each ping reply comes in (or as each individual ping goes unanswered, as the "no reply yet" messages show), and _on_sent (perhaps not the best name?), called when the whole ping request times out or is completed. Which handler is set with the <AsyncPingObject>.on method is determined by the value of the boolean parameter named mode, which is the first parameter of the method. When mode is true, the second parameter should be the call back function to assign to _on_recv. When mode is false, _on_sent is assigned the second parameter of the method.
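The selection of a handler by the mode parameter can be modelled in a few lines of plain C++. To be clear, this is not the AsyncPing source: FakePinger, FakeResponse and simulate() are hypothetical stand-ins of mine, the real response object has many more fields, and this model ignores the value returned by the handlers.

```cpp
#include <functional>

// Hypothetical, much simplified response object
struct FakeResponse {
    bool answer;  // true if a reply was received
};

// Hypothetical stand-in for an AsyncPing object, keeping only the
// handler selection done by on(mode, handler)
class FakePinger {
public:
    using Handler = std::function<bool(const FakeResponse&)>;

    void on(bool mode, Handler handler) {
        if (mode) onRecv = handler;  // plays the role of _on_recv
        else      onSent = handler;  // plays the role of _on_sent
    }

    // Simulate a request of a single ping: one per-ping event,
    // then the end of the request
    void simulate(bool answered) {
        FakeResponse r{answered};
        if (onRecv) onRecv(r);
        if (onSent) onSent(r);
    }

private:
    Handler onRecv;
    Handler onSent;
};

int recvCalls = 0;   // replies seen by the per-ping handler
int finalCalls = 0;  // completed requests seen by the final handler

bool recvCallback(const FakeResponse& r) { if (r.answer) recvCalls++; return false; }
bool finalCallback(const FakeResponse&)  { finalCalls++; return true; }
```

Registering the two functions follows the same pattern as in setup(): on(true, recvCallback) for the per-ping handler and on(false, finalCallback) for the end-of-request handler.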
In the second important part of the initialization code, the two timers are attached to the two tasks that must be done regularly: sending pings to target hosts, and reporting on the ping statistics.
Here is part of the serial output of the program.
Wait for WiFi .......
WiFi connected, IP address: 192.168.11.142, gateway IP address: 192.168.11.1
"bringggx.carbs" is not a valid host name or Ip address
Setup completed with 2 ping targets in place
Remaining free mem: 50824
14994: Sending ping to target[0] google.com
15047: Ping reply from target[0] google.com received
24994: Sending ping to target[1] 8.8.8.8
25040: Ping reply from target[1] 8.8.8.8 received
34994: Sending ping to target[0] google.com
...
55042: Ping reply from target[0] google.com received
61025: Remaining free mem: 50320
61060: Sending ping to bing.com (204.79.197.200)
61133: 64 bytes from 204.79.197.200: icmp_seq=1 ttl=117 time=70 ms
62114: 64 bytes from 204.79.197.200: icmp_seq=2 ttl=117 time=50 ms
63112: 64 bytes from 204.79.197.200: icmp_seq=3 ttl=117 time=47 ms
64064: 3 pings sent to 204.79.197.200, 3 received, time: 3003 ms
64994: Sending ping to target[1] 8.8.8.8
65042: Ping reply from target[1] 8.8.8.8 received
...
124995: Ping target : received / sent counts
google.com : 6 / 6
8.8.8.8 : 5 / 6
...
175510: Remaining free mem: 50288
175511: Sending ping to bing.com (204.79.197.200)
175564: 64 bytes from 204.79.197.200: icmp_seq=1 ttl=117 time=51 ms
176557: 64 bytes from 204.79.197.200: icmp_seq=2 ttl=117 time=43 ms
177561: 64 bytes from 204.79.197.200: icmp_seq=3 ttl=117 time=46 ms
178514: 3 pings sent to 204.79.197.200, 3 received, time: 3003 ms
...
382411: Remaining free mem: 50320
382411: Sending ping to 192.168.11.118 (192.168.11.118)
382913: no reply yet from 192.168.11.118 icmp_seq=1
383415: no reply yet from 192.168.11.118 icmp_seq=2
383416: 2 pings sent to 192.168.11.118, 0 received, time: 1004 ms
...
1204995: Ping target : received / sent counts
google.com : 60 / 60
8.8.8.8 : 59 / 60
As can be seen, the program carries on with the two remaining target hosts when one of the hosts has an invalid host name. Each time a user specified host is pinged, the remaining free memory is printed to check for memory leaks. Because that value oscillates between 50288 and 50320 bytes, probably depending on memory allocations done in the background, it looks like there are no leaks.
The example programs included with the AsyncPing library specify the call back functions as lambda functions, also called anonymous functions. I chose to explicitly define the call backs. I find it easier to follow and cleaner. But then, I am not really familiar with C/C++. More than once in these pages, I have said that Pascal, or more precisely, Free Pascal, is my preferred programming language and it does not feature anonymous functions (although that may soon change: anonymous functions are "planned" this year). If you prefer using lambda functions, I have included in the project archive a version with them: main.lambda. Note that the AsyncPing library is included in the local lib directory to make the example self-contained. Download the project by clicking on asyncping_test.zip.
Right now I am rather pleased with this Internet watchdog and I will soon use it in the router watchdog. I wish to thank the author of AsyncPing for a very useful library.