Ping903 README
See the end of file for copying conditions.

* Overview

Ping903 is designed to periodically monitor a very large number of
remote hosts using ICMP ECHO packets.

The system is built using the client-server architecture.  The main
component (ping903) is a daemon that sits in memory and wakes up
periodically to send a certain number of ICMP echo packets to a
preconfigured set of hosts and to collect replies.  The round-trip
statistics it collects are made available via a REST API.

The daemon reads its settings from a plain text configuration file.
Most settings have sensible defaults; the only thing that the user
must supply is a list of IP addresses to monitor.  This list is
referred to in this document as "ip-list".

A simple command line client utility (ping903q) allows the user to
communicate with the daemon and to obtain information about a
particular host or about all monitored hosts at once.  This utility
can operate in several modes.  In particular, it can be used as a
Nagios external check tool, instead of the standard check_ping tool.

* Installation

To build ping903 you will need the GNU Libmicrohttpd library.  It is
available for download from http://ftp.gnu.org/gnu/libmicrohttpd.

When building from a source package, the usual incantations apply:

  ./configure
  make
  make install

This will install the package under /usr/local.  That is, the server
will be installed as /usr/local/sbin/ping903, the client program as
/usr/local/bin/ping903q, etc.  You can give a number of options to
./configure in order to customize your installation, in particular to
alter the default installation paths.  For example, to install to the
/usr file hierarchy, use

  ./configure --prefix=/usr

Please refer to the INSTALL document in this directory for a
discussion of available options to configure and their effect.

After installing the package, copy the file src/ping903.conf to
/etc/ping903.conf and edit it to your liking.
This file contains configuration settings that control the behavior
of the server daemon and, to a certain extent, that of the query
tool.  The file contains short annotations before each statement to
help you navigate in it.  You will find a detailed discussion of the
configuration file in the manpage ping903.conf(5).  What follows is a
short outline intended for a quick start.

At the very beginning you can leave most settings at their default
values.  The only statement that you must provide in your
configuration is

  ip-list FILENAME

Replace FILENAME with the name of the file with IP addresses to
monitor.  In this file, each IP address must occupy a separate line.
Empty lines, leading and trailing whitespace, and comments are
ignored.  Comments are introduced by a hash sign (#) appearing as the
first non-whitespace character on a line.  You are not required to
keep all your IP addresses in a single file.  If necessary, you can
scatter them among several files and list each of them in a separate
ip-list statement.

Normally, the ip-list file should contain IP addresses of the hosts
to monitor.  It is OK, however, to use symbolic DNS names, too.  If a
hostname resolves to a single A record, such usage is equivalent to
placing that IP in the ip-list.  However, if the hostname resolves to
multiple IPs, only the first one will be used.

By default, the server will wake up each minute and send 10 echo
requests at 1 second intervals to each registered IP.  If the number
of collected replies is less than 7, the IP will be declared dead
("alive": false, in the returned JSON).  Otherwise it is considered
alive ("alive": true).  The following settings control these
parameters:

  probe-interval N
    Interval between wake-ups in seconds.  Default N=60.

  ping-count N
    Number of ICMP packets to send within each probe.  Default N=10.

  ping-interval N
    Interval in seconds between two sequential echo requests.
    Default N=1.

  tolerance N
    Maximum number of lost requests after which the host is considered
    dead.
    Default N=3.

Another statement worth your attention is "listen".  It configures
the IP address and port on which the server will listen for incoming
HTTP requests.  The default is localhost:8080.  Change this if this
port is already occupied on your system.

Access to the HTTP interface is protected by the default access
control library (the files /etc/hosts.allow and /etc/hosts.deny).
Refer to hosts_access(3) for details.

When you have configured the daemon, start it.  Just run ping903.
Check that there are no errors (on the standard error and in the
syslog channel "daemon").  To verify that it is operational, run

  curl http://localhost:8080/config

This should return the running configuration.

Within the next probe-interval seconds the server will collect enough
statistics to answer your queries.  You can request information about
any particular IP from your ip-list by running

  ping903q IP

This will return the current status of the IP, e.g.

  $ ping903q 203.0.113.1
  203.0.113.1 is alive

To get detailed statistics, use the -v option.  The result will be
formatted in a ping(8)-like manner:

  $ ping903q -v 203.0.113.1
  203.0.113.1 is alive
  --- 203.0.113.1 ping statistics ---
  10 packets transmitted, 10 received, 0% packet loss, time 9414ms
  rtt min/avg/max/mdev = 41.212/41.265/41.374/0.046 ms

You can check the current status of all hosts by running ping903q
without arguments.  Note that, depending on your settings, the output
can be huge.

Please refer to ping903q(1) for a detailed discussion of the tool.

* Nagios external check

The ping903q tool can be used as a Nagios external check program.
The following snippet illustrates a simple Nagios configuration that
makes use of it:

  # Define the check_ping903 command
  define command {
      command_name  check_ping903
      command_line  /usr/bin/ping903q -r -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$
  }

  # Define the service using the new command
  define service {
      host_name            server.example.net
      address              203.0.113.1
      service_description  Server status
      check_command        check_ping903!200.0,20%!600.0,60%
      check_interval       5
      retry_interval       1
  }

* Installation from a git clone

If you are building from a clone of the Git repository, you will need
GNU autotools to bootstrap the package first.  Run

  ./bootstrap

in the top level source directory.  This will create the configure
script and populate the directory with the missing files.  Then
proceed as described in the Installation section above.

* REST API

The default channel for communication with the ping903 daemon is the
HTTP socket open on localhost port 8080.  Only GET requests are
allowed.  The following endpoints are provided:

** /host/NAME

NAME is the IP address or hostname.  The server will look up this
string in the list of configured hosts and, if found, return the
statistics information for that host.  Note that NAME is treated as a
character string and must coincide exactly with the IP or hostname as
it was supplied in the configuration.  In particular, if a host was
specified by its symbolic DNS name in the configuration, exactly that
name must be used in the URL to obtain statistics for that host.  If
you wish to use an IP instead, see the /ip/ADDR or /match/NAME-OR-IP
endpoints, discussed below.

On success a JSON object is returned.  The following keys are defined
in that object:

  - "name": string
  The IP or hostname of the host, as it was supplied in the ip-list.

  - "validity": boolean
  Status of this record.  If false, the data has not been collected
  yet or the host is unreachable.  More detailed information is
  available in the "status" member (see below).
  If "validity" is false, only the following keys are guaranteed to be
  present in the object: "name", "validity", "status", and
  "xmit-timestamp".  If it is true, full statistics are available as
  described below.

  - "status": string
  Detailed status of the object.  The following values are defined:

    "init"
      Initial state: data are being collected ("validity": false).

    "valid"
      The object is valid and its statistics are reliable
      ("validity": true).

    "pending"
      The object is valid and contains reliable statistics.  The host
      is being probed at the moment and the object will be updated
      soon ("validity": true).

    "invalid"
      Host is unreachable.  No statistics available
      ("validity": false).

  - "xmit-timestamp": number
  Time (the number of seconds since the Epoch) when the last ICMP ECHO
  request was transmitted.

  - "start-timestamp": number
  Time when the most recent probe sequence was initiated.

  - "stop-timestamp": number
  Time when the most recent probe sequence was finished.

  - "xmit": number
  Number of ICMP ECHO requests transmitted during the probe.

  - "recv": number
  Number of ICMP ECHO responses received during the probe.

  - "loss": number
  Percentage of lost packets.

  - "tmin": number
  Minimal round-trip time observed during the probe.

  - "tmax": number
  Maximal round-trip time observed during the probe.

  - "avg": number
  Average round-trip time.

  - "stddev": number
  Standard deviation of round-trip times.

  - "alive": boolean
  Host status computed as a result of the probe.  It is true if the
  difference between the "xmit" and "recv" parameters is less than the
  "tolerance" configuration setting, and false otherwise.
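
The keys listed above can be consumed by a short script.  The
following Python sketch is not part of ping903; it assumes that the
daemon listens on the default http://localhost:8080 and that each
record carries the keys exactly as documented in the list above.  The
function names fetch_host and summarize are hypothetical and chosen
for illustration only.

```python
import json
import urllib.parse
import urllib.request


def fetch_host(name, base_url="http://localhost:8080"):
    """GET /host/NAME and decode the JSON body.

    base_url is an assumption: the documented default listen address.
    """
    url = base_url + "/host/" + urllib.parse.quote(name, safe="")
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def summarize(record):
    """Return a one-line summary of a host record."""
    name = record["name"]
    if not record.get("validity", False):
        # Only "name", "validity", "status" and "xmit-timestamp"
        # are guaranteed to be present in this case.
        return "%s: no statistics (status=%s)" % (name, record.get("status"))
    state = "alive" if record["alive"] else "dead"
    return "%s is %s: %d/%d replies, %.1f%% loss, avg rtt %.3f" % (
        name, state, record["recv"], record["xmit"],
        record["loss"], record["avg"])
```

With a running daemon, summarize(fetch_host("203.0.113.1")) would
print a summary for that host.  The check on "validity" comes first
because the statistics keys may be absent from an invalid record.
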

Example of the returned JSON for a reachable host:

  {
      "alive":true,
      "avg":25.85150,
      "loss":0.00000,
      "name":"203.0.113.1",
      "recv":10.00000,
      "start-timestamp":1581666176.01285,
      "status":true,
      "stddev":0.03201,
      "stop-timestamp":1581666185.27210,
      "tmax":25.91400,
      "tmin":25.81200,
      "xmit":10.00000,
      "xmit-timestamp":1581666185.24628
  }

Example of the returned JSON for an unreachable host:

  {
      "name":"203.0.113.2",
      "status":false,
      "xmit-timestamp":1581666176.01373
  }

** /host

Return statistics for all monitored hosts.  The result is returned as
an array of JSON objects described above.

This is an experimental endpoint.  Be careful with it, as it may cause
considerable strain on the server.

** /ip/ADDR

Request statistics about a particular IP address.  The response is the
same as for /host/NAME.  Use this API if hostnames are used in your
ip-list and you need to request statistics using an IP as opposed to
the hostname.

** /match/NAME-OR-IP

Return host names that correspond to NAME-OR-IP (a JSON array of
strings).  If no matches are found, an empty array is returned.
Multiple entries can be returned if NAME-OR-IP is a hostname that has
multiple DNS A records, several of which are registered in the
ip-list.

** /config

Return the current server configuration as a JSON object.

** /config/KEYWORD

Return the value of a particular configuration setting.

* Copyright information:

Copyright (C) 2020 Sergey Poznyakoff

   Permission is granted to anyone to make or distribute verbatim
   copies of this document as received, in any medium, provided that
   the copyright notice and this permission notice are preserved, thus
   giving the recipient permission to redistribute in turn.

   Permission is granted to distribute modified versions of this
   document, or of portions of it, under the above conditions,
   provided also that they carry prominent notices stating who last
   changed them.

Local Variables:
mode: outline
paragraph-separate: "[ ]*$"
version-control: never
End: