The Accidental Developer – Page 8 – What if Gregor Samsa awoke a computer programmer?

Using ipmitool to configure Dell iDRAC

I have a number of Dell servers in a remote data center, so I wanted to configure the iDRAC interface in order to power on the systems remotely, get troubleshooting info for Dell, etc., without going to the data center myself. I’ve never configured iDRAC except through the Lifecycle Controller via a crash-cart on bootup. I thought that I would be spending all day in the data center getting everything configured, but when I mentioned this to another sysadmin he said, “Just use ipmitool.”

I had no idea such a tool existed!

First, I installed ipmitool (I’m using CentOS):

sudo yum install ipmitool

I found a helpful website: ipmitool Cheatsheet and Configuring DRAC from ipmitool

I was a little skeptical, but I read through (most) of the ipmitool man page to make sure I had a reasonable idea what the commands would do, and then I tried one. And immediately received an error message:

$ ipmitool lan set 1 ipsrc static
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory

I checked and found that the path listed does exist:

$ ls /dev/ipmi*
/dev/ipmi0

Then it hit me: I need to be superuser, don’t I? That worked!

sudo ipmitool lan set 1 ipsrc static
sudo ipmitool lan set 1 ipaddr 192.168.100.1
sudo ipmitool lan set 1 netmask 255.255.255.0
sudo ipmitool lan set 1 defgw ipaddr 192.168.100.254

I was then able to connect to the IP address in a browser (it warned me there was an untrusted certificate, and I added it as a permanent exception in the browser.)

The default username/password was root/calvin. I changed both the username and password right away. Even though I have the iDRAC interfaces on an RFC 1918 subnet and behind a firewall, why take the risk of keeping the default values?

As I discovered though, pay attention to the iDRAC password restrictions. Otherwise you may need to use ipmitool to reset the iDRAC admin password.

Ansible conditional check failed

I wanted to add a check to one of my Ansible roles so that an application source would be copied and the source recompiled only if no current version existed or if the existing version did not match the expected version:

- name: Check to see if app is installed and the expected version
  command: /usr/local/app/bin/app --version
  register: version_check
  ignore_errors: True
  changed_when: "version_check.rc != 0 or {{ target_version }} not in version_check.stdout"

- name: include app install
  include: tasks/install.yml
  when: "version_check.rc != 0 or {{ target_version }} not in version_check.stdout"

I defined the target version in my role’s defaults/main.yml:

---
target_version: "2.5.2"
...

The first time I ran it, I encountered an error:

fatal: [trinculo.osric.net]: FAILED! => {"failed": true, "msg": "The conditional check 'version_check.rc != 0 or {{ target_version }} not in version_check.stdout' failed. The error was: error while evaluating conditional (version_check.rc != 0 or {{ target_version }} not in version_check.stdout): Unable to look up a name or access an attribute in template string ({% if version_check.rc != 0 or 2.5.2 not in version_check.stdout %} True {% else %} False {% endif %}).\nMake sure your variable name does not contain invalid characters like '-': coercing to Unicode: need string or buffer, StrictUndefined found"}

It’s a little unclear what is wrong, so I figured it was likely an issue with quotes or a lack of parentheses.

First I tried parentheses:

changed_when: "version_check.rc != 0 or ({{ target_version }} not in version_check.stdout)"

No luck.

changed_when: "version_check.rc != 0 or !('{{ target_version }}' in version_check.stdout)"

You know, trying to google and or not or or or is is tricky. Even if you add terms like Boolean logic or propositional calculus.

I tried to break it down into smaller parts:

changed_when: "version_check.rc != 0"

That worked.

changed_when: "!('{{ target_version }}' in version_check.stdout)"

A different error appeared:

template error while templating string: unexpected char u'!'

OK, that’s getting somewhere! Try a variation:

changed_when: "'{{ target_version }}' not in version_check.stdout"

It worked! But with a warning:

[WARNING]: when statements should not include jinja2 templating delimiters
such as {{ }} or {% %}. Found: ('{{ target_version }}' not in version_check.stdout)

Next try:

changed_when: "target_version not in version_check.stdout"

That worked, and without any warnings. I put the or back in:

changed_when: "version_check.rc != 0 or target_version not in version_check.stdout"

That worked! It was the jinja2 delimiters the whole time. The value of the changed_when key is already interpreted as jinja2 apparently, so the delimiters were redundant. Even though it succeeded (with a warning) in a single propositional statement, it failed when the logical disjunction was added. It was an important reminder: error messages aren’t perfect.

FreeIPA: Failed to start pki-tomcatd Service

After a recent CentOS update, FreeIPA 4.5 failed to start with the following error message:
Failed to start pki-tomcatd Service

What changed? The following were the 3 packages updated:

httpd.x86_64
httpd-tools.x86_64
mod_session.x86_64

I successfully restarted FreeIPA without the pki-tomcatd service:
$ sudo ipactl start --ignore-service-failure

But it’s not ideal to run it without the PKI service. What is going on? According to the log at /var/log/pki/pki-tomcat/ca/debug:

java.lang.Exception: Certificate auditSigningCert cert-pki-ca is invalid: Invalid certificate: (-8101) Certificate type not approved for application.

Which cert is that? Where is it? How did it get created? Didn’t FreeIPA create it? Why isn’t it valid? Why doesn’t it give me any additional info?

Eventually I found the certificate location (although I don’t recall how, likely a post on the FreeIPA mailing list):
/var/lib/pki/pki-tomcat/alias -> /etc/pki/pki-tomcat/alias

I ran certutil to find out more about the certificate:
$ certutil -L -d /etc/pki/pki-tomcat/alias certutil: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format.

That uninformative and misleading error message looked familiar to me. Indeed, I wrote a post about it 7 months ago:
certutil: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format

$ sudo certutil -L -d /etc/pki/pki-tomcat/alias -n 'auditSigningCert cert-pki-ca'

The expiration date looked fine, which was the first thing I suspected.

I did note the following, which looked interesting:
Mozilla-CA-Policy: false (attribute missing)

But after reading about that at http://mozilla.github.io/ca-policy/ it looked like it shouldn’t be needed.

Fortunately, I have another working FreeIPA replica that I had not yet upgraded, so I compared the certificates on both systems:

On the IPA replica with errors:

$ sudo certutil -L -d /etc/pki/pki-tomcat/alias

Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,Cu,Cu
auditSigningCert cert-pki-ca                                 u,u,u
ocspSigningCert cert-pki-ca                                  u,u,u
Server-Cert cert-pki-ca                                      u,u,u
subsystemCert cert-pki-ca                                    u,u,u

On the working IPA replica:

$ sudo certutil -L -d /etc/pki/pki-tomcat/alias

Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,Cu,Cu
Server-Cert cert-pki-ca                                      u,u,u
auditSigningCert cert-pki-ca                                 u,u,Pu
ocspSigningCert cert-pki-ca                                  u,u,u
subsystemCert cert-pki-ca                                    u,u,u

Note the P trust attribute in the latter. What does it mean? From man certutil:

-t trustargs
           Specify the trust attributes to modify in an existing certificate
           or to apply to a certificate when creating it or adding it to a
           database. There are three available trust categories for each
           certificate, expressed in the order SSL, email, object signing for
           each trust setting. In each category position, use none, any, or
           all of the attribute codes:

           ·   p - Valid peer

           ·   P - Trusted peer (implies p)

           ·   c - Valid CA

           ·   C - Trusted CA (implies c)

           ·   T - trusted CA for client authentication (ssl server only)

I modified the trust attributes of the certificate accordingly:

$ sudo certutil -M -t ',,P' -d /etc/pki/pki-tomcat/alias -n 'auditSigningCert cert-pki-ca'

I tried restarting FreeIPA again:

$ sudo ipactl restart
Stopping pki-tomcatd Service
Restarting Directory Service
Restarting krb5kdc Service
Restarting kadmin Service
Restarting httpd Service
Restarting ipa-custodia Service
Restarting ntpd Service
Restarting pki-tomcatd Service
Restarting ipa-otpd Service
ipa: INFO: The ipactl command was successful

It worked!

But why? What does the trust attribute for JAR/XPI mean? I don’t really know — I suppose it means that that the Java code we’re running should trust the certificate. Since I didn’t have this problem when I upgraded the working replica, I’m guessing that I must have done something to change it (and break it) along the way. It likely had nothing to do with the CentOS updates I applied, but I just happened to run into the problem after restarting FreeIPA post-updates.

Guest SSID surprises on home wireless router

My current home Internet provider is CenturyLink, and with that I’m using their recommended Zyxel C1100Z “modem”.

Via the modem’s web interface you can configure up to 4 SSIDs. I have one set up for my devices with strong security settings, and another set up for guests with weaker security settings. One thing that surprised me: when I checked the list of attached devices, devices attached to the guest SSID were allocated IP addresses in the same address range as, and could communicate with, devices attached to my trusted home SSID.

The Zyxel C1100Z will let you create LAN subnets with different IP address ranges and settings, but a device on one subnet can still communicate with devices on another LAN subnet. This would let you at least configure a host firewall (on hosts that support a host firewall) to drop traffic from a particular address range (e.g. 192.168.100.0/24).

This is lunacy, though. Why would you create separate SSIDs with different security settings if the attached devices cannot be isolated from one another? I suspect that most users do not realize this. There are some settings you can change from one SSID to another, such as bandwidth throttling, but that seems like a secondary consideration to securing your network. Needless to say, my guest network has the same security settings as my trusted home network now.

I wondered if I had overlooked a setting somewhere, so I called to confirm with CenturyLink. The technician there was able to identify the SSIDs I had configured, suggesting that they have a backdoor into the modem they provided.

The moral of the story is: never use the equipment provided by your ISP.

FreeIPA 4.5.0 upgrade fails

I recently ran the usual sudo yum update, which included an upgrade of FreeIPA from version 4.4 to version 4.5. However, the upgrade reported that it failed, so I tried to run it manually afterwards:

$ sudo ipa-server-upgrade

The command above timed out, so I ran it again in verbose mode:

$ sudo ipa-server-upgrade --verbose

With the extra output, I saw it was getting stuck on the following before timing out:

ipa: DEBUG: wait_for_open_ports: localhost [8080, 8443] timeout 300

Neither 8080 nor 8443 are on the list of FreeIPA protocols/ports:

TCP 80, 443: HTTP/HTTPS TCP 389, 636: LDAP/LDAPS TCP 88, 464: kerberos TCP 53: DNS UDP 88, 464: kerberos UDP 53: DNS UDP 123: NTP

So what was going on?

I checked to see if those ports were listening:

$ sudo lsof -i TCP -P
...
TCP *:8080 (LISTEN)
TCP *:8443 (LISTEN)
...

$ sudo lsof -i TCP
...
TCP *:webcache (LISTEN)
TCP *:pcsync-https (LISTEN)
...

The ports were listening, so why the timeout? Fortunately I found someone else describing a similar problem on the FreeIPA mailing list, with this reply from Alexander Bokovoy, one of the FreeIPA developers:

I’m a bit tired to repeat this multiple times but FreeIPA does require IPv6 stack to be enabled in the kernel. We absolutely do. If you don’t use IPv6 stack, disable it on specific interfaces. However, there is a practical problem with the way how glibc DNS resolver works: in default configuration it always prefers IPv6 answers to IPv4 because this is actually a policy of RFC3484. As result, if you have ::1 in /etc/hosts, it will be returned first. If you don’t have ::1 on any of your interfaces (‘lo’ is a typical one), then apps cannot contact ::1 (localhost) even if those apps that use IPv6 bind to all interfaces.

FreeIPA uses modern APIs provided by glibc to listen on both IPv6 and IPv4. It simply means that FreeIPA servers bind to IPv6 addresses (on all interfaces or on a specific one, if needed) and treat IPv4 as mapped ones because IPv6 and IPv4 share the same port space on the same machine. This works transparently thanks to glibc and is a recommended way to write networking applications. See man ipv6(7) for details.

(Source)

Following that suggestion, I deleted the following line from /etc/hosts:

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

Sure enough, that solved the problem! This seems like a good idea for anyone who has disabled IPv6, even if you’re not running FreeIPA. Why include an entry in your hosts file that you know doesn’t work?

I had another FreeIPA server, this one running on an underpowered VM (1GB RAM, 1 CPU). Even after removing the IPv6 entry from /etc/hosts, the FreeIPA upgrade seemed to fail. The certmonger/dogtag processes would consume all of the system resources and it would freeze. All memory was in use and nearly all of the swap space as well. The CPU was running at 5000%. I gave up on it and ignored it for a while, and it turned out that letting it sit and process for a while helped. It seem to have worked itself out: FreeIPA is at the latest version and system resource utilization is very low.

Using blocklist.de with fail2ban

Anyone who runs a server with open ports knows that systems with questionable intent will come knocking. Sometimes thousands of them. fail2ban is software that that checks your server logs and detects multiple failures, for example 5 failed SSH logins in a row, and bans the source IP address a period of time, e.g. for an hour. This helps prevent password-guessing and brute force attacks. It might be useful to share information about those questionable IP addresses with others so that we can block them proactively.

One such list of IP addresses that I found is blocklist.de. Since I am primarily concerned with systems that are trying to SSH into my system, I looked specifically at their SSH blocklist:
All IP addresses which have been reported within the last 48 hours as having run attacks on the service SSH.

Implementation details: Continue reading Using blocklist.de with fail2ban

Wifi on Raspberry Pi 3

I don’t run a GUI/Desktop on my Raspberry Pi devices. I don’t have monitors or keyboards connected to them — typically I log into them via SSH and manage them that way. I recently wanted to activate multiple network connections on my Raspberry Pi 3 Model B, so I decided to activate the wireless connection in addition to the wired connection that was already configured.

I tried following the steps at Setting WiFi up via the command line but got an error after the first step:

$ sudo iwlist wlan0 scan
wlan0     Interface doesn't support scanning : Network is down

I decided to check an see what devices were available via ifconfig:

$ ifconfig
eth0 ...
lo ...

No wireless device is listed there. I checked for all devices using the -a switch:

$ ifconfig -a
eth0 ...
lo ...
wlan0 ...

It’s there, wlan0, but it wasn’t active. I tried to bring the device up:

$ sudo ifconfig wlan0 up
SIOCSIFFLAGS: Operation not possible due to RF-kill

That error message is completely indecipherable to me! Fortunately, someone else had this error message too:
“SIOCSIFFLAGS: Operation not possible due to RF-kill”?

From that post I was able to determine that the wireless device was soft blocked:

$ sudo rfkill list
0: phy0: Wireless LAN
        Soft blocked: yes
        Hard blocked: no
1: hci0: Bluetooth
        Soft blocked: yes
        Hard blocked: no

How to unblock it? The following, described at the aforementioned post, worked. Although I’m not sure why I’m unblocking wifi instead of phy0:

$ sudo rfkill unblock wifi
$ sudo rfkill list
0: phy0: Wireless LAN
        Soft blocked: no
        Hard blocked: no
1: hci0: Bluetooth
        Soft blocked: yes
        Hard blocked: no

$ sudo ifconfig wlan0 up
$ ifconfig
eth0 ...
lo ...
wlan0 ...

The device is now active, but we still need to connect. First, I generated an encrypted passphrase:

$ wpa_passphrase "my_network_id" "my_network_password"
network={
        ssid="my_network_id"
        psk=8a9b456b28ef0707987622421592d3cc2fd22544ac281bced0f2028f4f4fcb85
}

Next, I appended that block of text to /etc/wpa_supplicant/wpa_supplicant.conf

Based on what I’ve read, that should be sufficient. The system should periodically detect changes to the wpa_supplicant.conf file and load the new settings automatically. I was impatient and rebooted. Apparently the following command should work too:

$ sudo wpa_cli reconfigure

After the reboot, I checked ifconfig for the wlan0 interface:

$ ifconfig wlan0 | grep 'inet addr'
          inet addr:192.168.0.23  Bcast:192.168.0.255  Mask:255.255.255.0

It has an address — success!

Nagios check_disk returns DISK CRITICAL – /sys/kernel/config is not accessible: Permission denied

I enabled Nagios checks for free disk space on a group of servers today, and was hit with alerts containing the following error message:
DISK CRITICAL - /sys/kernel/config is not accessible: Permission denied

If you are looking for a solution, skip to the end. Some of my mistakes before finding the solution may be interesting though!

Continue reading Nagios check_disk returns DISK CRITICAL – /sys/kernel/config is not accessible: Permission denied

Ansible: [Errno 2] No such file or directory

I tried running a command on several remote servers at once via Ansible:

$ ansible -a 'rpcinfo -p' centos

Which returned a series of errors:

ariel.osric.net | FAILED | rc=2 >> [Errno 2] No such file or directory


caliban.osric.net | FAILED | rc=2 >>

[Errno 2] No such file or directory

trinculo.osric.net | FAILED | rc=2 >> [Errno 2] No such file or directory

I also received an error when I tried running it via ssh:

$ ssh ariel.osric.net 'rpcinfo -p' bash: rpcinfo: command not found

I can run it interactively on a specific host:

$ ssh ariel.osric.net $ rpcinfo -p program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper

The problem is that the user profile isn’t loaded when running via ansible or a non-interactive ssh session. rpcinfo isn’t found in the PATH. In the next step I identify the full path:

$ ssh ariel.osric.net $ whereis rpcinfo rpcinfo: /usr/sbin/rpcinfo /usr/share/man/man8/rpcinfo.8.gz

(/usr/sbin is added to the path via /etc/profile)

Once I specified the full path, it worked:

$ ansible -a '/usr/sbin/rpcinfo -p' centos

ariel.osric.net | SUCCESS | rc=0 >> program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper

Etc.

Using fail2ban with iptables instead of firewalld

In the previous post I wrote about the minor configuration changes needed to get fail2ban to actually do something.

I have been working primarily with CentOS 7 and have been using iptables instead of firewalld. Normally, fail2ban works with iptables by default. However, installing fail2ban on CentOS 7 also installs fail2ban-firewalld — which changes that default. Even with a properly configured fail2ban jail, you will not see the expected results. fail2ban will log events as expected, but no traffic will actually be banned.

The fail2ban-firewalld package places a file in /etc/fail2ban/jail.d/00-firewalld.conf. It overrides the default banaction (iptables) and sets it to firewallcmd-ipset.

The top of the 00-firewalld.conf file says:

You can remove this package (along with the empty fail2ban meta-package) if you do not use firewalld

When I tried removing fail2ban-firewalld, it removed fail2ban as a dependency. I have a feeling the referenced fail2ban meta-package may have something to so with that.

I have not yet investigated the meta-package and de-coupling fail2ban-firewalld from fail2ban (see Update below). My solution, for now, has been to move 00-firewalld.conf and restart fail2ban:

$ sudo mv /etc/fail2ban/jail.d/00-firewalld.conf /etc/fail2ban/jail.d/00-firewalld.disabled $ sudo systemctl restart fail2ban

The default banaction defined in jail.conf is no longer overridden and performs as expected:
banaction = iptables-multiport

Update
According to Fail2ban with FirewallD, The fail2ban package itself is a meta-package that contains several other packages, including fail2ban-firewalld and fail2ban-server. Removing the meta-package will not remove fail2ban-server.

If you’ve already moved 00-firewalld.conf to 00-firewalld.disabled, you’ll get a warning:
warning: file /etc/fail2ban/jail.d/00-firewalld.conf: remove failed: No such file or directory

You can ignore the warning, or remove 00-firewalld.disabled.