I enabled Nagios checks for free disk space on a group of servers today, and was hit with alerts containing the following error message:
DISK CRITICAL - /sys/kernel/config is not accessible: Permission denied
If you are looking for a solution, skip to the end. Some of my mistakes before finding the solution may be interesting though!
The wrong solution
Permission denied? I had a hunch SELinux was behind this. SELinux is behind every unexpected permissions problem lately. But before jumping to any conclusions, Google the error message. It’s rare to run across a problem that no one else has had before.
Sure enough, I found a Red Hat bug report describing the same issue with the check_disk plugin. The developers closed the bug, saying that a new version had been released, and that if it was still a problem someone should open a new bug against the new version. Initially I thought that was a terrible assumption. “We released a new version and have not confirmed whether or not this is still a bug. Therefore this isn’t a bug unless you do the work to re-report it.” Now that I have determined that it is not and likely never was a bug, I’m not sure I feel the same.
The bug report mentions a workaround for the Nagios check_disk failure, and the workaround does work. I don’t entirely like it, but assuming that only the root user can modify /usr/lib64/nagios/plugins/check_disk, it seems like an acceptable risk. Still, recall that one of the benefits of SELinux is that a compromise of one process owned by a user doesn’t mean everything owned by that user gets compromised.
Compare before and after:
$ ls --context /usr/lib64/nagios/plugins/check_disk
-rwxr-xr-x. root root system_u:object_r:nagios_checkdisk_plugin_exec_t:s0 /usr/lib64/nagios/plugins/check_disk
$ sudo chcon -t nagios_unconfined_plugin_exec_t /usr/lib64/nagios/plugins/check_disk
$ ls --context /usr/lib64/nagios/plugins/check_disk
-rwxr-xr-x. root root system_u:object_r:nagios_unconfined_plugin_exec_t:s0 /usr/lib64/nagios/plugins/check_disk
The SELinux type is now set to nagios_unconfined_plugin_exec_t.
Can I set the SELinux context via Ansible?
I need to make this change on several servers, and I need the change to be documented and repeatable, so I need to see how to best make that happen via Ansible.
The Ansible sefcontext module says it’s similar to the semanage fcontext command, so it seemed like a good choice.
First I tried the semanage fcontext command directly:
$ sudo semanage fcontext -m -t nagios_unconfined_plugin_exec_t /usr/lib64/nagios/plugins/check_disk
ValueError: File spec /usr/lib64/nagios/plugins/check_disk conflicts with equivalency rule '/usr/lib64 /usr/lib'; Try adding '/usr/lib/nagios/plugins/check_disk' instead
$ sudo semanage fcontext -m -t nagios_unconfined_plugin_exec_t /usr/lib/nagios/plugins/check_disk
ValueError: File context for /usr/lib/nagios/plugins/check_disk is not defined
$ sudo semanage fcontext --list /usr/lib64/nagios/plugins/check_disk | grep check_disk
/usr/lib/nagios/plugins/check_disk regular file system_u:object_r:nagios_checkdisk_plugin_exec_t:s0
/usr/lib/nagios/plugins/check_disk_smb regular file system_u:object_r:nagios_checkdisk_plugin_exec_t:s0
The file looked like it had a defined context, didn’t it? A comment on the Fedora SELinux support list had good advice:
If the file context is not already defined in your local modification, you need to add is [sic], not modify
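The first error above comes from the equivalency rule '/usr/lib64 /usr/lib': file specs have to be written against /usr/lib even though the file lives under /usr/lib64. As a rough illustration only, the sketch below rewrites a path the way that one rule suggests. The function name to_equivalent_path is made up for this example, and a real system may define other equivalency rules that this does not handle.

```shell
# Hypothetical helper: apply only the single equivalency rule
# '/usr/lib64 /usr/lib' quoted in the semanage error message.
# Paths outside /usr/lib64 are printed unchanged.
to_equivalent_path() {
    case "$1" in
        /usr/lib64)   printf '%s\n' "/usr/lib" ;;
        /usr/lib64/*) printf '%s\n' "/usr/lib/${1#/usr/lib64/}" ;;
        *)            printf '%s\n' "$1" ;;
    esac
}

# Prints /usr/lib/nagios/plugins/check_disk
to_equivalent_path /usr/lib64/nagios/plugins/check_disk
```

The rewritten path is the one to hand to semanage fcontext; commands that act on the real file, such as restorecon and ls, still take the /usr/lib64 path.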
I tried again, adding instead of modifying, and comparing context before and after the change:
$ ls --context /usr/lib64/nagios/plugins/check_disk
-rwxr-xr-x. root root system_u:object_r:nagios_checkdisk_plugin_exec_t:s0 /usr/lib64/nagios/plugins/check_disk
$ sudo semanage fcontext -a -t nagios_unconfined_plugin_exec_t /usr/lib/nagios/plugins/check_disk
$ sudo restorecon /usr/lib64/nagios/plugins/check_disk
$ ls --context /usr/lib64/nagios/plugins/check_disk
-rwxr-xr-x. root root system_u:object_r:nagios_unconfined_plugin_exec_t:s0 /usr/lib64/nagios/plugins/check_disk
OK! That looks good. Now to do the same with Ansible. An excerpt of my Ansible role is below:
- name: Allow Nagios to execute check_disk (change SELinux type)
  sefcontext:
    # Use lib, not lib64
    # SELinux defines the equivalency rule '/usr/lib64 /usr/lib'
    target: /usr/lib/nagios/plugins/check_disk
    setype: nagios_unconfined_plugin_exec_t
    state: present
That didn’t work though. See the context returned below:
$ ls --context /usr/lib64/nagios/plugins/check_disk
-rwxr-xr-x. root root system_u:object_r:nagios_checkdisk_plugin_exec_t:s0 /usr/lib64/nagios/plugins/check_disk
Maybe it doesn’t run restorecon? A thread on the Ansible Project Google Group explains that “reload SELinux policy after commit” is not the same as running restorecon:
The sefcontext module is roughly the functionality that ‘semanage fcontext’ provides you. It allows you to add SELinux file context mappings to the internal database.
Now, the module is not intended to change file contexts based on the mapping, just like ‘semanage fcontext’ does not do. (See man semanage)
As you said, you can do this with restorecon, or the file module….
The Ansible file module! I gave it a try. Here’s an excerpt of my new Ansible role:
- name: Allow Nagios to execute check_disk (change SELinux type)
  file:
    path: /usr/lib64/nagios/plugins/check_disk
    setype: nagios_unconfined_plugin_exec_t
Check the SELinux context:
$ ls --context /usr/lib64/nagios/plugins/check_disk
-rwxr-xr-x. root root system_u:object_r:nagios_unconfined_plugin_exec_t:s0 /usr/lib64/nagios/plugins/check_disk
That worked! That was easy! I use the Ansible file module all the time and didn’t even know that option was there!
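When repeating this check across several servers, reading the type out of ls output by eye gets tedious. Here is a minimal sketch of pulling just the type field out of a context string. The function name selinux_type is made up for this example, and it assumes the usual user:role:type:level layout seen in the output above.

```shell
# Hypothetical helper: print the type, i.e. the third colon-separated
# field of an SELinux context string (user:role:type:level).
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# On an SELinux-enabled host with GNU coreutils, the live context can be
# read with:  stat -c %C /usr/lib64/nagios/plugins/check_disk
# A literal context from the output above is used here so the sketch
# runs anywhere. Prints nagios_unconfined_plugin_exec_t
selinux_type "system_u:object_r:nagios_unconfined_plugin_exec_t:s0"
```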
But wait. Have I just solved the wrong problem?
The real solution
Why does the error message say /sys/kernel/config is not accessible? That isn’t one of the disks in my system, is it?
Turns out, it is. Run mount to see all the filesystems. There are more of them than you might guess:
$ mount | grep /sys/kernel/config
configfs on /sys/kernel/config type configfs (rw,relatime)
configfs is a ram-based filesystem that…is a filesystem-based manager of kernel objects
That’s the real problem. I’m checking a filesystem that I didn’t intend to, and one that Nagios probably shouldn’t have access to.
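To get a sense of how many non-disk filesystems are mounted, it can help to count mounts by type. The sketch below is one way to do that. The function name count_fs_types is made up, and the two /dev/sda lines in the sample input are invented for illustration; only the configfs line reflects the mount output above.

```shell
# Hypothetical helper: read lines in /proc/mounts format on stdin
# (device mountpoint fstype options dump pass) and print how many
# mounts use each filesystem type, sorted by type name.
count_fs_types() {
    awk '{ count[$3]++ } END { for (t in count) print count[t], t }' | sort -k2
}

# Made-up sample input so the sketch runs anywhere.
# On a Linux host, use instead:  count_fs_types < /proc/mounts
count_fs_types <<'EOF'
/dev/sda1 / xfs rw,relatime 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/sda2 /var xfs rw,relatime 0 0
EOF
```

Any type in that list that is not a real disk filesystem is a candidate for exclusion from the disk check.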
I checked the default config in /etc/nagios/nrpe.cfg and found these (the latter is commented out by default, as shown below):
command[check_hda1]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
#command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
My config had a custom definition in a cfg file in
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 10 -c 5 -X devfs -X$
The above excludes devfs filesystems, which don’t even exist on my CentOS 7 hosts. The -X$ looks like a mistake, possibly a copy-paste error of a line truncated by the terminal.
I checked and confirmed that even without the bad exclusions, the check_disk command still produced the same error:
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 10 -c 5
I created a revised definition that excludes the problematic filesystem:
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -X configfs
It works! No more errors.
It might be better, of course, to include only the filesystems I expect to be checking:
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -N xfs
The above also works.
Be sure to see the check_disk plugin docs for the full list of parameters. It isn’t completely obvious, but the second example implies that, unless otherwise specified, check_disk tries to check all available filesystems.
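Before trusting a revised command definition, it is worth running the plugin by hand and looking at its exit code, since that code is how a plugin reports its status. A small sketch of mapping the code to a status name follows. The function name nagios_status_name is made up, and treating every other code as UNKNOWN is a simplification for this example (the conventional UNKNOWN code is 3).

```shell
# Hypothetical helper: map a Nagios plugin exit code to its
# conventional status name (0 OK, 1 WARNING, 2 CRITICAL).
# Anything else is reported as UNKNOWN in this sketch.
nagios_status_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# On a monitored host, run the candidate command and report its status:
#   /usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -X configfs
#   nagios_status_name "$?"
# A literal code is used here so the sketch runs anywhere. Prints CRITICAL
nagios_status_name 2
```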
- Read the error messages.
- Understand the error messages.
- Google the error messages, but the results don’t mean much if no one else read and understood the error messages.
- Don’t change SELinux settings unless absolutely necessary. The defaults are there for a reason.