2020-04-11
Asynchronous Ping for an ESP8266 Internet Watchdog
<-An ESP8266 Based Router Watchdog

There have been unwarranted reboots of the router by the original router watchdog. The exact source of the problem is not entirely clear. However, looking over the code, it struck me that using a blocking (or synchronous) ping library to check the status of the Internet connection was not the proper approach. In this post, I will present an example project that monitors whether the Internet can be reached using the AsyncPing library by akaJes. Just about the same code will be used to patch the router watchdog.

The starting point for my example PlatformIO project is the ping_interval.ino example by akaJes. By luck, that sketch does almost the same thing as needed for an Internet watchdog. At regular intervals it sends out 3 pings to 3 sites. These are non-blocking operations; as soon as the ICMP requests are sent, control returns to the normal loop() function without waiting for replies from the pinged sites. When ping replies do come in, program flow is interrupted to report which site replied.

In the Internet watchdog, only a single ping is sent out at regular intervals. The host or target of the ping request is chosen in round-robin fashion. The absence of a reply from a single target host will not be construed as a network failure. Instead the Internet is deemed down when no reply has been received from any of the target hosts for a specified number of seconds. It seemed like a good idea to collect some statistics about the number of replies compared to the number of ping requests sent out. For debugging and information purposes, a report of these statistics is printed and targets that do not reply reliably are identified.
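
The decision rule described above can be sketched in plain C++, without any of the ESP8266 libraries. The `LinkMonitor` struct and its member names are hypothetical, made up for this illustration only; the sketch in this post uses global variables instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the original sketch: models the watchdog's
// decision rule without any networking.
struct LinkMonitor {
  int targetCount;           // number of ping targets, must be > 0
  uint32_t lostAfterMs;      // silence, in ms, after which the link is deemed down
  int nextIndex = 0;         // round-robin position
  uint32_t lastReplyMs = 0;  // time of the most recent reply from any target

  LinkMonitor(int count, uint32_t lostMs) : targetCount(count), lostAfterMs(lostMs) {
    assert(count > 0);
  }

  // Return the index of the target to ping now, then advance round-robin.
  int pickTarget() {
    int current = nextIndex;
    nextIndex = (nextIndex + 1) % targetCount;
    return current;
  }

  // A reply from any target proves that the Internet is reachable.
  void replyReceived(uint32_t nowMs) { lastReplyMs = nowMs; }

  // The link is down only when all targets have been silent for too long.
  bool isLost(uint32_t nowMs) const { return nowMs - lastReplyMs > lostAfterMs; }
};
```

Because `replyReceived()` does not care which target answered, a single silent host never trips `isLost()` as long as another host keeps replying.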

The example program also includes a user ping function because the router watchdog had such a function mostly for the purpose of debugging and checking that the ping targets were correctly specified.

```cpp
#include <Arduino.h>
#include <ESP8266WiFi.h>
#include <Ticker.h>
#include "AsyncPing.h"            // https://github.com/akaJes/AsyncPing

#define WIFI_SSID "a_SSID"        // Wi-Fi network SSID (network name)
#define WIFI_PSK  "a_psk"         // Wi-Fi network pre-shared key (password)

#define INTERNET_LOST     300     // seconds since last received ping before the Internet connection is deemed lost
#define PING_INTERVAL      10     // seconds between ping requests
#define REPORT_INTERVAL   120     // seconds between report on ping sent and received counts
#define PING_SAMPLE_SIZE   10     // minimum ping count for testing for targets that don't reliably reply, should be 10+
#define UNRELIABLE          7     // target unreliable if response rate is less than UNRELIABLE/10
#define RESET_COUNTER   90000     // ping sent and received counts reset after RESET_COUNTER requests sent to all targets
#define TARGET_COUNT        3     // number of ping targets

// target hosts that will be pinged on a regular basis, can be identified by host name or IP address
const char* pingHosts[TARGET_COUNT] = {"google.com", "bringggx.carbs", "8.8.8.8"};  // with one bad host name to test setupTargets()

/*-------------------------------------------------------------------------------------*/

IPAddress targets[TARGET_COUNT];    // array of valid IP addresses of ping targets
int pingSentCount[TARGET_COUNT];    // number of pings sent to each target
int pingRcvdCount[TARGET_COUNT];    // number of replies received from each target
int hostsIndex[TARGET_COUNT];       // reverse index from last 3 arrays to pingHosts array
int targetCount = 0;                // number of valid IP addresses in targets array
int pingIndex = 0;                  // index of next target to ping
unsigned long lastValidPing = 0;    // the last time a ping was received from a target

AsyncPing targetPinger;             // object to send successive pings to the target sites
AsyncPing userPinger;               // object to send ping request to a user specified host
Ticker pingTimer;                   // object to time the sending of pings to target sites
Ticker reportTimer;                 // object to time reporting on the status of sent ping requests

// print statistics about sent and received ICMP packets and warn about unreliable targets
void reportTargetStatus(void) {
  if (targetCount < 1) return;
  Serial.printf("\n%lu: Ping target : received / sent  counts\n", millis());
  for (int k=0; k<targetCount; k++) {
    bool unreliable = ( (pingSentCount[k] > PING_SAMPLE_SIZE) && (pingRcvdCount[k] < (int) ((UNRELIABLE*pingSentCount[k])/10)) );
    Serial.printf("     %s : %d / %d%s\n", pingHosts[hostsIndex[k]], pingRcvdCount[k], pingSentCount[k], (unreliable) ? " *** WARNING: unreliable target ***" : "");
  }
  Serial.println();
}

// reset the send and receive statistics
void resetTargetStatus(void) {
  pingIndex = 0;
  memset(pingSentCount, 0, sizeof(pingSentCount));
  memset(pingRcvdCount, 0, sizeof(pingRcvdCount));
}

// convert pingHosts host names and IP addresses to IPAddress objects
void setupTargets(void) {
  targetCount = 0;
  for (int i = 0; i < TARGET_COUNT; i++) {
    if (WiFi.hostByName(pingHosts[i], targets[targetCount])) {
      hostsIndex[targetCount] = i;
      targetCount++;
    } else {
      Serial.printf("\"%s\" is not a valid host name or Ip address\n", pingHosts[i]);
    }
  }
  resetTargetStatus();
}

// send a ping request to the next valid target IP address and increment its sent statistic
void sendTargetPing() {
  if (targetCount < 1) return;
  if ((pingIndex == 0) && (pingSentCount[0] > RESET_COUNTER)) resetTargetStatus();
  Serial.printf("%lu: Sending ping to target[%d] %s\n", millis(), pingIndex, pingHosts[hostsIndex[pingIndex]]);
  targetPinger.begin(targets[pingIndex], 1, 5000);  // 1 ping, timeout in 5 seconds
  pingSentCount[pingIndex]++;
  pingIndex = (pingIndex+1)%targetCount;
}

// function that is called when a ping reply arrives from one of the target hosts
bool targetPingerCallback(const AsyncPingResponse& response) {
  if (response.answer) {
    for (int j = 0; j < targetCount; j++) {
      if (response.addr == targets[j]) {
        Serial.printf("%lu: Ping reply from target[%d] %s received\n", millis(), j, pingHosts[hostsIndex[j]]);
        // if (millis() % 2 == 0)  // remove leading // to test unreliable target report
        pingRcvdCount[j]++;
        break;
      }
    }
    lastValidPing = millis();  // add leading // to test ping failure
  }
  return true; // done
}

// send a ping request to a specific host. ipaddress can be a URL or an IP address
void sendUserPing(const char* ipaddress, u8_t count = 3, u32_t timeout = 1000) {
  IPAddress ip;
  if (WiFi.hostByName(ipaddress, ip)) {
    Serial.printf("%lu: Sending ping to %s (%s)\n", millis(), ipaddress, ip.toString().c_str());
    userPinger.begin(ip, count, timeout);  // 3 pings, timeout 1000 these are the default values
  } else {
    Serial.printf("%lu: Could not create valid IP address for %s\n", millis(), ipaddress);
  }
}

// function that is called when a ping reply arrives from the user specified host
bool userPingerRecvCallback(const AsyncPingResponse& response) {
  IPAddress addr(response.addr); //to prevent with no const toString() in 2.3.0
  if (response.answer)
    Serial.printf("%lu: %d bytes from %s: icmp_seq=%d ttl=%d time=%lu ms\n", millis(), response.size, addr.toString().c_str(), response.icmp_seq, response.ttl, response.time);
  else
    Serial.printf("%lu: no reply yet from %s icmp_seq=%d\n", millis(), addr.toString().c_str(), response.icmp_seq);
  return false; //do not stop
}

// function that is called when the user ping request times out
bool userPingerFinalCallback(const AsyncPingResponse& response) {
  IPAddress addr(response.addr); //to prevent with no const toString() in 2.3.0
  Serial.printf("%lu: %d pings sent to %s, %d received, time: %lu ms\n", millis(), response.total_sent, addr.toString().c_str(), response.total_recv, response.total_time);
  if (response.mac)
    Serial.printf("  detected eth address " MACSTR "\n",MAC2STR(response.mac->addr));
  Serial.println();
  return true;  // done (does not matter)
}

void setup() {
  // setup the serial connection
  Serial.begin(115200);
  while(!Serial) delay(10);
  Serial.println();
  Serial.println();

  // setup the Wi-Fi connection
  WiFi.disconnect(true);
  WiFi.mode(WIFI_STA);
  WiFi.begin(WIFI_SSID, WIFI_PSK);
  Serial.print("Wait for WiFi ");
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }
  Serial.print("\nWiFi connected, IP address: ");
  Serial.print(WiFi.localIP());
  Serial.print(", gateway IP address: ");
  Serial.println(WiFi.gatewayIP());
  Serial.println("\n");

  // initialize the targets[] array of IP addresses based on the given pingHosts
  setupTargets();

  // setup targetPinger, the targets pinger
  targetPinger.on(true,  targetPingerCallback);

  // setup userPinger, the one-off ping to a user specified host
  userPinger.on(true, userPingerRecvCallback);
  userPinger.on(false, userPingerFinalCallback);

  // setup the timers that will run target pinging and status reporting in the background
  pingTimer.attach(PING_INTERVAL, sendTargetPing);  // send a ping to a target every 10 seconds
  reportTimer.attach(REPORT_INTERVAL, reportTargetStatus);  // report the status every two minutes

  // initialize the on-board LED
  pinMode(LED_BUILTIN, OUTPUT);
  digitalWrite(LED_BUILTIN, HIGH); // turn LED off (using a Lolin/Wemos D1 mini for testing)

  Serial.printf("Setup completed with %d ping targets in place\n", targetCount);
  lastValidPing = millis();
  delay(1000);
  Serial.printf("Remaining free mem: %u\n\n", ESP.getFreeHeap());
}

void blinkLED(void) {
  for (int i=0; i<4; i++) {  // loop limit should be an even integer
    digitalWrite(LED_BUILTIN, !digitalRead(LED_BUILTIN));
    delay(50);
  }
}

unsigned long lastping = 0;        // time of last "user" ping
unsigned long waitTime = 60*1000;  // interval before next "user" ping (1 to 2 minutes)

void loop() {
  // report ping failure
  if (millis() - lastValidPing > INTERNET_LOST*1000) {
    Serial.printf("%lu: **** PINGING FAILED **** NO PING IN LAST %d SECONDS ****\n", millis(), INTERNET_LOST);
    lastValidPing = millis();  // restart test
  }

  // try one-off "user" ping
  if (millis() - lastping > waitTime) {
    Serial.printf("\n%lu: Remaining free mem: %u\n", millis(), ESP.getFreeHeap());
    if (millis() % 2 == 0)
      sendUserPing("bing.com");                 // default 3 pings sent with 1 second time out
    else
      sendUserPing("192.168.11.118", 2, 500);  // no reply expected, 2 pings sent with 1/2 second time out
    lastping = millis();
    waitTime = (60+random(60))*1000;
  }

  delay(2000);
  blinkLED();
}
```

Scroll down to the loop() function and notice how it does very little work. At each iteration, it checks the time elapsed since the last ping reply was received from a target host and, if it has been too long, it prints out an alert. In addition to logging the situation, the router watchdog will reboot the router at this point. The loop() also sends a ping to a fourth (valid) or fifth (invalid) host at irregular intervals of 1 to 2 minutes to mimic use of the sendUserPing routine, and blinks the built-in LED every couple of seconds. The interesting bits are done in the background. Let's look at that.
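
A side note on the elapsed time test in loop(): on the ESP8266, millis() returns a 32-bit unsigned value that rolls over to 0 after about 49.7 days. The subtraction form used in the sketch keeps working across that rollover, which the following plain C++ sketch tries to show. The function names `elapsedMs` and `timedOut` are hypothetical and do not appear in the sketch.

```cpp
#include <cstdint>

// Elapsed milliseconds between two readings of a 32-bit millisecond counter.
// Unsigned subtraction wraps modulo 2^32, so the result stays correct when
// the counter rolls over between the two readings.
uint32_t elapsedMs(uint32_t now, uint32_t earlier) {
  return now - earlier;
}

// Same form of test as in loop(): true when more than limitSeconds have
// passed since the last valid ping reply.
bool timedOut(uint32_t now, uint32_t lastValidPing, uint32_t limitSeconds) {
  return elapsedMs(now, lastValidPing) > limitSeconds * 1000u;
}
```

Comparing absolute times instead, as in `now > lastValidPing + limit`, would misbehave near the rollover, which matters for a watchdog meant to run for months.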

Scroll up to the beginning of the sketch. Besides the Arduino library, three libraries are used. The ESP8266WiFi library connects to the wireless network for access to the Internet. The Ticker library runs two timers: one to send out pings at regular intervals (defined by the PING_INTERVAL macro) and one to report the sent and received statistics at longer intervals, specified by the REPORT_INTERVAL macro. In the router watchdog, longer intervals will be set. Finally, the AsyncPing library provides the objects that send ICMP requests and receive the replies.

The INTERNET_LOST macro specifies the number of seconds since the last reply to a ping request before the connection to the Internet is deemed lost. Two macros define the parameters for establishing the reliability of ping targets. The first, PING_SAMPLE_SIZE, is the number of ping requests that must be exceeded before anything is calculated. The second macro, UNRELIABLE (a value between 1 and 10), is used to calculate the minimum number of replies that should have been received from a reliable target. The function reportTargetStatus() will issue a warning if the percentage of received replies over the number of sent requests is below 10*UNRELIABLE. When the number of ping requests sent to each of the targets exceeds RESET_COUNTER, the sent and received statistics are reset to 0. Since the targets are pinged in turn, one every PING_INTERVAL seconds, with the values shown this will occur after about 10 days for each valid target: roughly 21 days with the two valid targets of this example and 31 days with three.
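
To make the arithmetic concrete, here is a small stand-alone C++ sketch of the reliability test and of the time before the counters are reset. The constants are copied from the macros in the sketch; the function names `isUnreliable` and `secondsUntilReset` are hypothetical. The reset estimate assumes, as the code in sendTargetPing() suggests, that the count of pings sent to the first target is what gets compared with RESET_COUNTER.

```cpp
// Values copied from the macros in the sketch.
constexpr int  PING_INTERVAL    = 10;     // seconds between pings (all targets combined)
constexpr int  PING_SAMPLE_SIZE = 10;     // sent count must exceed this before testing
constexpr int  UNRELIABLE       = 7;      // tenths of requests that must be answered
constexpr long RESET_COUNTER    = 90000;  // per-target sent count that triggers a reset

// Same test as in reportTargetStatus(): enough samples, and fewer replies
// than UNRELIABLE tenths of the requests sent (integer division truncates).
bool isUnreliable(int sent, int received) {
  return (sent > PING_SAMPLE_SIZE) && (received < (UNRELIABLE * sent) / 10);
}

// Hypothetical helper: approximate seconds before the counters are reset,
// given that each of targetCount targets is pinged in turn.
long secondsUntilReset(int targetCount) {
  return RESET_COUNTER * targetCount * PING_INTERVAL;
}
```

For instance, a target that answered 13 of 20 requests is flagged because 13 is less than 14, while 14 of 20 is accepted. With one valid target the reset comes after 900000 seconds, a little over 10 days.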

The last user modifiable macro is TARGET_COUNT, which sets the number of target sites to which ping requests will be sent. The array pingHosts can contain host names or IP addresses. They should point to leading Internet sites that will reliably respond to ICMP requests. Everything after that should not need to be modified.

Hopefully the reportTargetStatus() and resetTargetStatus() functions have names that clearly show their purpose and are simple enough not to need explanation. The function setupTargets converts the strings in the pingHosts array into IPAddress objects and stores these in the targets array. If a host name or IP address cannot be resolved, an error message is printed and no IPAddress object is stored for it. At the end, the variable targetCount will contain the number of valid IPAddress objects in targets.
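
The packing of valid targets and the role of the hostsIndex reverse index can be illustrated without a network. In this plain C++ sketch, `fakeResolve` is a hypothetical stand-in for WiFi.hostByName() that simply rejects names ending in ".carbs", and `buildHostsIndex` is an invented name for the loop in setupTargets.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for WiFi.hostByName(): a name "resolves" unless it
// ends in ".carbs". A real resolver would query DNS.
bool fakeResolve(const std::string& host) {
  const std::string bad = ".carbs";
  return host.size() < bad.size() ||
         host.compare(host.size() - bad.size(), bad.size(), bad) != 0;
}

// Keep only the hosts that resolve, packed at the front, and remember for
// each kept entry its position in the original list (the reverse index).
std::vector<int> buildHostsIndex(const std::vector<std::string>& hosts) {
  std::vector<int> hostsIndex;
  for (int i = 0; i < static_cast<int>(hosts.size()); i++) {
    if (fakeResolve(hosts[i])) hostsIndex.push_back(i);
  }
  return hostsIndex;
}
```

With the three hosts of the sketch, the result holds the indices 0 and 2, so the second valid target maps back to "8.8.8.8" when its name is needed in a report.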

Four objects are created: two AsyncPing "pingers" and two timers. The timer pingTimer repeatedly calls sendTargetPing at regular intervals of PING_INTERVAL seconds. When invoked, the sendTargetPing function instructs targetPinger, an AsyncPing object, to send a ping request to the next target host and returns immediately without waiting for a reply. Again, this is why it is called asynchronous or non-blocking. If targetPinger receives a reply from a host to which it sent a ping request, the targetPingerCallback function will be executed. All it does is update the lastValidPing time and the ping received count for the host that sent the reply.

The sendUserPing function will be called by the user to send a ping request to a specific host. In this case two call back functions are defined: userPingerRecvCallback which reports on individual ping replies from the host as they arrive and userPingerFinalCallback which will be called when the user initiated ping request times out and reports on the ICMP packet sent and received statistics.

For the purpose of this post, there are two important parts to the setup function. The first part assigns the appropriate call back functions as handlers for the two AsyncPing objects. An AsyncPing object has two handlers: _on_recv called as each ping reply comes in and the _on_sent handler (perhaps not the best name?) called when the ping request times out or is completed. Which handler is set with the <AsyncPingObject>.on method is determined by the value of the boolean parameter named mode which is the first parameter of the method. When mode is true, the second parameter should be the call back function to assign to _on_recv. When mode is false, _on_sent is assigned the second parameter of the method.
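
The way a single on() method can fill two handler slots is perhaps easier to see in a toy model. The `MockPinger` class below is a hypothetical mock written for this post, not the AsyncPing implementation. It assumes, based on the comments in the sketch ("do not stop", "done"), that a reply handler returning true ends the request early, and it assumes that the final handler is still called afterwards.

```cpp
#include <functional>

// Hypothetical mock, NOT the AsyncPing implementation: it only mimics the way
// on(mode, handler) selects one of two handler slots.
struct MockResponse {
  bool answer;   // true if a reply was received
  int icmp_seq;  // sequence number of the request
};

class MockPinger {
 public:
  using Handler = std::function<bool(const MockResponse&)>;

  // mode == true: handler called for each reply;
  // mode == false: handler called once when the whole request is finished.
  void on(bool mode, Handler handler) {
    if (mode) onRecv_ = handler; else onDone_ = handler;
  }

  // Simulate a request of `count` pings that are all answered.
  void simulate(int count) {
    for (int seq = 1; seq <= count; seq++) {
      // assumption: a reply handler returning true stops the request early
      if (onRecv_ && onRecv_(MockResponse{true, seq})) break;
    }
    // assumption: the final handler runs even after an early stop
    if (onDone_) onDone_(MockResponse{true, count});
  }

 private:
  Handler onRecv_;  // plays the role of _on_recv
  Handler onDone_;  // plays the role of _on_sent
};
```

Since the slots are std::function objects in this mock, a handler can be a named function, as in my sketch, or a lambda.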

In the second important part of the initialization code, the two timers are attached to the two tasks that must be done regularly: sending pings to target hosts, and reporting on the ping statistics.

Here is part of the serial output of the program.

```
Wait for WiFi .......
WiFi connected, IP address: 192.168.11.142, gateway IP address: 192.168.11.1
"bringggx.carbs" is not a valid host name or Ip address
Setup completed with 2 ping targets in place
Remaining free mem: 50824
14994: Sending ping to target[0] google.com
15047: Ping reply from target[0] google.com received
24994: Sending ping to target[1] 8.8.8.8
25040: Ping reply from target[1] 8.8.8.8 received
34994: Sending ping to target[0] google.com
...
55042: Ping reply from target[0] google.com received
61025: Remaining free mem: 50320
61060: Sending ping to bing.com (204.79.197.200)
61133: 64 bytes from 204.79.197.200: icmp_seq=1 ttl=117 time=70 ms
62114: 64 bytes from 204.79.197.200: icmp_seq=2 ttl=117 time=50 ms
63112: 64 bytes from 204.79.197.200: icmp_seq=3 ttl=117 time=47 ms
64064: 3 pings sent to 204.79.197.200, 3 received, time: 3003 ms
64994: Sending ping to target[1] 8.8.8.8
65042: Ping reply from target[1] 8.8.8.8 received
...
124995: Ping target : received / sent counts
     google.com : 6 / 6
     8.8.8.8 : 5 / 6
...
175510: Remaining free mem: 50288
175511: Sending ping to bing.com (204.79.197.200)
175564: 64 bytes from 204.79.197.200: icmp_seq=1 ttl=117 time=51 ms
176557: 64 bytes from 204.79.197.200: icmp_seq=2 ttl=117 time=43 ms
177561: 64 bytes from 204.79.197.200: icmp_seq=3 ttl=117 time=46 ms
178514: 3 pings sent to 204.79.197.200, 3 received, time: 3003 ms
...
382411: Remaining free mem: 50320
382411: Sending ping to 192.168.11.118 (192.168.11.118)
382913: no reply yet from 192.168.11.118 icmp_seq=1
383415: no reply yet from 192.168.11.118 icmp_seq=2
383416: 2 pings sent to 192.168.11.118, 0 received, time: 1004 ms
...
1204995: Ping target : received / sent counts
     google.com : 60 / 60
     8.8.8.8 : 59 / 60
```

As can be seen, the program carries on with the two remaining target hosts when one of the hosts has an invalid host name. Each time a user specified host is pinged, the remaining free memory is printed to check for memory leaks. Because that value oscillates between 50288 and 50320 bytes, probably depending on memory allocations done in the background, it looks like there are no leaks.

The example programs included with the AsyncPing library specify the call back functions as lambda functions, also called anonymous functions. I chose to explicitly define the call backs. I find it easier to follow and cleaner. But then, I am not really familiar with C/C++. More than once on these pages, I have said that Pascal, or more precisely, Free Pascal is my preferred programming language and it does not feature anonymous functions (although that may soon change: anonymous functions are "planned" this year). If you prefer using lambda functions, I have included in the project archive a version with them: main.lambda. Note that the AsyncPing library is included in the local lib directory to make the example self contained. Download the project by clicking on asyncping_test.zip.

Right now I am rather pleased with this Internet watchdog and I will soon use it in the router watchdog. I wish to thank the author of AsyncPing for a very useful library.
