NTP checks with icinga2

On my new Icinga2 monitoring host, I am gradually adding service checks to reach parity with my existing Nagios monitoring. Next on my list: implementing NTP checks. The first step was to add a new service check to the Icinga2 configuration:

/etc/icinga2/conf.d/services.cfg:

apply Service "ntp_time" {
  import "generic-service"
  check_command = "ntp_time"
  assign where host.vars.os == "Linux"
}

The service check produced an error, as seen in the icingaweb2 interface:

execvpe(/usr/lib64/nagios/plugins/check_ntp_time) failed: No such file or directory

Oh! I don’t have the appropriate Nagios plugin installed on the Icinga2 host.

sudo yum install nagios-plugins-ntp

The NTP service check now reports OK on some hosts, but on other hosts I get a different error:

CRITICAL: No response from NTP server

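The hosts that did not respond are all running chronyd, which does not answer NTP client requests unless they are explicitly allowed. On each of those hosts I edited /etc/chrony.conf and added an allow line for the address of the Icinga2 host: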

allow 192.168.46.46

And restarted chronyd:

systemctl restart chronyd

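Now all but one host reports OK. The last remaining host to show an error? The Icinga2 host itself! It queries its own chronyd over the loopback address, so its /etc/chrony.conf needed one more line: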

allow 127.0.0.1

Another chronyd restart, and the NTP service on all hosts reports OK.
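
The edit to chrony.conf can be scripted so that repeating it does not add duplicate lines. The following is a minimal sketch, not the method used in this post: ensure_allow is a hypothetical helper name, and the demonstration runs against a scratch file with placeholder content rather than the live /etc/chrony.conf.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_allow is a hypothetical helper, not part of chrony.
# It appends "allow <address>" to a chrony config file unless that exact
# line is already there, so running it twice does not duplicate the entry.
ensure_allow() {
    local conf="$1" addr="$2"
    # -q: no output, -x: match the whole line, -F: fixed string, not a regex
    if grep -qxF "allow ${addr}" "$conf"; then
        echo "already present: allow ${addr}"
    else
        printf 'allow %s\n' "$addr" >> "$conf"
        echo "added: allow ${addr}"
    fi
}

# Demonstrate on a scratch file instead of the live /etc/chrony.conf.
demo_conf="$(mktemp)"
printf 'driftfile /var/lib/chrony/drift\n' > "$demo_conf"   # placeholder content
ensure_allow "$demo_conf" 192.168.46.46   # address used in the allow line above
ensure_allow "$demo_conf" 192.168.46.46   # second call changes nothing
ensure_allow "$demo_conf" 127.0.0.1       # loopback, for the Icinga2 host itself
```

On a real host the function would need to run with root privileges against /etc/chrony.conf, followed by the same systemctl restart chronyd shown above.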

NRPE: Unable to read output

This one was a real facepalm moment, but I thought I’d share in case anyone else runs into the same thing.

I’ve been working on migrating from Nagios to Icinga2. One of the services I monitor is whether or not a given host has any available yum updates. This service, which I label check_yum, worked on all my hosts except for the Icinga2 host. All the other services monitored on that host were working, but check_yum returned an error:

NRPE: Unable to read output

I tried running the test manually on the Icinga2 host:

/usr/lib64/nagios/plugins/check_nrpe -H localhost -c check_yum
NRPE: Unable to read output

I checked to make sure NRPE was listening, in this case via xinetd:

lsof -i

I checked the service definition to see what script/plugin NRPE runs:

cat /etc/nrpe.d/check_yum.cfg
command[check_yum]=/usr/lib64/nagios/plugins/check_updates -w 0 -c 10 -t 60

I tried to run that manually and…the file /usr/lib64/nagios/plugins/check_updates did not exist.

I installed the corresponding yum package:

sudo yum install nagios-plugins-check-updates

Now it works! It was a reminder to myself to check the basics before trying to troubleshoot network issues.
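
That basic check can be automated: read each NRPE command definition and confirm that the plugin it points to exists and is executable. A minimal sketch follows, assuming definitions of the form command[name]=/path/to/plugin args as in the file above. check_nrpe_commands is a hypothetical helper, and the demonstration uses a scratch config in which /bin/sh stands in for an installed plugin.

```shell
#!/usr/bin/env bash
# Sketch only: check_nrpe_commands is a hypothetical helper, not part of NRPE.
# For every "command[name]=/path/to/plugin args" line in an NRPE config
# file, report whether the plugin path exists and is executable.
check_nrpe_commands() {
    # sed prints "name path" for each command definition and nothing else.
    sed -n 's/^command\[\([^]]*\)]=\([^[:space:]]*\).*/\1 \2/p' "$1" |
    while read -r name plugin; do
        if [ -x "$plugin" ]; then
            echo "OK: ${name} -> ${plugin}"
        else
            echo "MISSING: ${name} -> ${plugin}"
        fi
    done
}

# Demonstrate with a scratch config; both paths are placeholders.
demo_dir="$(mktemp -d)"
cat > "${demo_dir}/demo.cfg" <<EOF
# /bin/sh stands in for a plugin that is installed
command[check_present]=/bin/sh -c true
command[check_yum]=${demo_dir}/check_updates -w 0 -c 10 -t 60
EOF
check_nrpe_commands "${demo_dir}/demo.cfg"
```

Pointing the function at each file under /etc/nrpe.d/ would flag a missing plugin before any time is spent on network troubleshooting.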

Nagios alert: CRITICAL: No response from NTP server

One of a pair of new hosts was causing the following Nagios alert today:

CRITICAL: No response from NTP server

Both of the new systems have the same configuration in theory, but based on the different results something clearly was overlooked.

I tried running the NTP check from the Nagios host:

Host 1

$ check_ntp -H ephemeralbox1.osric.net -w 0.1 -c 0.2
NTP OK: Offset -0.02545583248 secs|offset=-0.025456s;0.100000;0.200000;

Host 2

$ check_ntp -H ephemeralbox2.osric.net -w 0.1 -c 0.2
CRITICAL: No response from NTP server

The iptables rules look the same on both. The hosts are all on the same LAN, so there’s no firewall in the way.

Both systems are running chronyd:

Host 1

[chris@ephemeralbox1 ssh]$ systemctl show chronyd | egrep '(ActiveState|SubState)'
ActiveState=active
SubState=running

Host 2

[chris@ephemeralbox2 ssh]$ systemctl show chronyd | egrep '(ActiveState|SubState)'
ActiveState=active
SubState=running

Both systems are listening on port 123:

Host 1

[chris@ephemeralbox1 ssh]$ sudo lsof -i :123
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 3027 chrony 3u IPv4 1095448 0t0 UDP *:ntp

Host 2

[chris@ephemeralbox2 ssh]$ sudo lsof -i :123
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 1241 chrony 3u IPv4 51276 0t0 UDP *:ntp

Finally, I found it, in the obvious place that perhaps I should have looked first. The /etc/chrony.conf file on Host 2 was missing the allow line for the Nagios host:

# Allow NTP client access from Nagios host
allow 192.168.100.100

And yet the first place I looked was iptables. Blame the firewall, after all. Both configurations were pushed to these systems via Ansible playbooks, but apparently I had not included the role that updates the chrony.conf file on the second host. Looks like I need configuration management management!
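
When two hosts are supposed to share a configuration, comparing the relevant directives directly is quicker than inspecting each host by hand. The sketch below compares only the active allow lines of two chrony.conf copies. allow_lines and compare_allow are hypothetical helper names, and the two scratch files stand in for copies fetched from each host.

```shell
#!/usr/bin/env bash
# Sketch only: allow_lines and compare_allow are hypothetical helpers.
# allow_lines prints the active (uncommented) allow directives, sorted.
allow_lines() {
    grep -E '^[[:space:]]*allow([[:space:]]|$)' "$1" | sort
}

# compare_allow returns 0 when both files carry the same allow directives;
# otherwise it prints a diff and returns diff's non-zero status.
compare_allow() {
    local first status
    first="$(mktemp)"
    allow_lines "$1" > "$first"
    allow_lines "$2" | diff "$first" -
    status=$?
    rm -f "$first"
    return "$status"
}

# Scratch files standing in for each host's /etc/chrony.conf.
host1_conf="$(mktemp)"
host2_conf="$(mktemp)"
printf '# Allow NTP client access from Nagios host\nallow 192.168.100.100\n' > "$host1_conf"
printf '# no allow line on this host\n' > "$host2_conf"

if compare_allow "$host1_conf" "$host2_conf"; then
    echo "allow directives match"
else
    echo "allow directives differ"
fi
```

One way to obtain the two inputs, assuming SSH access to both hosts, is to redirect the output of ssh HOST cat /etc/chrony.conf into a local file for each host.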

Nagios check_disk returns DISK CRITICAL - /sys/kernel/config is not accessible: Permission denied

I enabled Nagios checks for free disk space on a group of servers today, and was hit with alerts containing the following error message:

DISK CRITICAL - /sys/kernel/config is not accessible: Permission denied

If you are looking for a solution, skip to the end. Some of my mistakes before finding the solution may be interesting though!

check_http returns 403 Forbidden on fresh Nagios installation

I recently installed a Nagios server on a new CentOS 7 virtual machine (on VirtualBox).

One of the default checks included upon installation is a check on localhost to confirm that the HTTP server is responding. (First I had to install the check_http plugin, see previous post.) The Nagios web interface reports a warning for this check:

HTTP WARNING: HTTP/1.1 403 Forbidden - 5261 bytes in 0.001 second response time

This is unexpected, since I can request the same page in a browser, which returns the Apache Welcome page.

When I run the check manually I get the same result, as expected:

# /usr/lib64/nagios/plugins/check_http -H localhost
HTTP WARNING: HTTP/1.1 403 Forbidden - 5261 bytes in 0.001 second response time |time=0.000907s;;;0.000000 size=5261B;;;0

I checked with curl:

# curl http://localhost

This returns the HTML source of the Apache Welcome page. It looks like it is working, right? But looking at the headers returned by the Apache server also shows 403 Forbidden:

# curl -I http://localhost
HTTP/1.1 403 Forbidden
...

The Apache Welcome page gives some hints about this behavior:

Are you the Administrator?

You should add your website content to the directory /var/www/html/.

To prevent this page from ever being used, follow the instructions in the file /etc/httpd/conf.d/welcome.conf.

The /etc/httpd/conf.d/welcome.conf file begins with the following comments and directive:

#
# This configuration file enables the default "Welcome" page if there
# is no default index page present for the root URL.  To disable the
# Welcome page, comment out all the lines below.
#
# NOTE: if this file is removed, it will be restored on upgrades.
#
<LocationMatch "^/+$">
    Options -Indexes
    ErrorDocument 403 /.noindex.html
</LocationMatch>

The Apache config is specifying that if there is no index page for the document root, return the Welcome page as an error document with a 403 HTTP status code.

Once I added a basic HTML file at /var/www/html/index.html, Nagios returned a success message:

HTTP OK: HTTP/1.1 200 OK - 549 bytes in 0.001 second response time
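
The browser and plain curl hide the status code, which is why the page looked healthy. The sketch below shows a simplified view of how a status code maps to a check result: 5xx as CRITICAL, 4xx as WARNING, anything else as OK. This approximates check_http's default behavior and ignores options such as redirect handling. http_state is a hypothetical helper, not part of the plugin.

```shell
#!/usr/bin/env bash
# Sketch only: http_state is a hypothetical helper that approximates how
# check_http treats a status code by default (redirect options are ignored).
http_state() {
    local code="$1"
    if [ "$code" -ge 500 ]; then
        echo "CRITICAL"
    elif [ "$code" -ge 400 ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# To fetch only the status code from a live server, curl can be used:
#   code="$(curl -s -o /dev/null -w '%{http_code}' http://localhost)"
# The literal codes below are the two seen in this post.
http_state 403   # the Welcome page served as an error document
http_state 200   # after adding /var/www/html/index.html
```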

Missing Nagios plugins in CentOS 7

I set up a Nagios server on a CentOS 7 VM (Virtual Machine):

sudo yum install epel-release
sudo yum install nrpe
sudo yum install nagios

By default it sets up some basic checks for localhost. When I checked the Nagios site at http://127.0.0.1/nagios/, I found that even PING was critical:

(No output on stdout) stderr: execvp(/usr/lib64/nagios/plugins/check_ping, ...) failed. errno is 2: No such file or directory

I checked the contents of the plugins directory:

# ls /usr/lib64/nagios/plugins
eventhandlers negate urlize utils.sh

Sure enough, the usual suspects are not there. E.g.:

  • check_load
  • check_ping
  • check_disk
  • check_http
  • check_procs

Eventually I stumbled onto the following document, /usr/share/doc/nagios-plugins-2.0.3/README.Fedora:

Fedora users

Nagios plugins for Fedora have all been packaged separately. For
example, to isntall the check_http just install nagios-plugins-http.

All plugins are installed in the architecture dependent directory
/usr/lib{,64}/nagios/plugins/.

I installed some of the plugins following that convention:

sudo yum install nagios-plugins-load
sudo yum install nagios-plugins-ping
sudo yum install nagios-plugins-disk
sudo yum install nagios-plugins-http
sudo yum install nagios-plugins-procs

Now the corresponding plugins exist in /usr/lib64/nagios/plugins, and Nagios reports OK for those checks on localhost.
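
The naming convention from README.Fedora can be turned into a quick report of which packages are probably missing. This is a minimal sketch under that assumption: suggest_packages is a hypothetical helper that maps check_NAME to nagios-plugins-NAME. The mapping does not hold for every plugin (check_ntp_time, for example, came from nagios-plugins-ntp), so the output is a starting point to review rather than a definitive list.

```shell
#!/usr/bin/env bash
# Sketch only: suggest_packages is a hypothetical helper.
# For each plugin name that is not an executable file in the given
# directory, print the package name implied by the README.Fedora
# convention: check_http -> nagios-plugins-http.
suggest_packages() {
    local dir="$1" plugin
    shift
    for plugin in "$@"; do
        [ -x "${dir}/${plugin}" ] && continue   # already installed
        echo "nagios-plugins-${plugin#check_}"  # strip the check_ prefix
    done
}

# Report on the plugins listed above; the output depends on what is installed.
suggest_packages /usr/lib64/nagios/plugins \
    check_load check_ping check_disk check_http check_procs
```

After reviewing the list, the package names can be passed to sudo yum install as in the commands above.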