FreeIPA: Failed to start pki-tomcatd Service

After a recent CentOS update, FreeIPA 4.5 failed to start with the following error message:
Failed to start pki-tomcatd Service

What changed? The update included the following three packages:

  • httpd.x86_64
  • httpd-tools.x86_64
  • mod_session.x86_64

I successfully restarted FreeIPA without the pki-tomcatd service:
$ sudo ipactl start --ignore-service-failure

But it’s not ideal to run it without the PKI service. What is going on? According to the log at /var/log/pki/pki-tomcat/ca/debug:

java.lang.Exception: Certificate auditSigningCert cert-pki-ca is invalid: Invalid certificate: (-8101) Certificate type not approved for application.

Which cert is that? Where is it? How did it get created? Didn’t FreeIPA create it? Why isn’t it valid? Why doesn’t it give me any additional info?

Eventually I found the certificate location (although I don’t recall how, likely a post on the FreeIPA mailing list):
/var/lib/pki/pki-tomcat/alias -> /etc/pki/pki-tomcat/alias

I ran certutil to find out more about the certificate:
$ certutil -L -d /etc/pki/pki-tomcat/alias
certutil: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format.

That uninformative and misleading error message looked familiar to me. Indeed, I wrote a post about it 7 months ago:
certutil: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format

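As in that earlier post, the real problem was that my user lacked permission to read the database files. Running the command with sudo worked:
$ sudo certutil -L -d /etc/pki/pki-tomcat/alias -n 'auditSigningCert cert-pki-ca'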

The expiration date looked fine, which was the first thing I suspected.

I did note the following, which looked interesting:
Mozilla-CA-Policy: false (attribute missing)

But after reading about that at http://mozilla.github.io/ca-policy/ it looked like it shouldn’t be needed.

Fortunately, I have another working FreeIPA replica that I had not yet upgraded, so I compared the certificates on both systems:

On the IPA replica with errors:

$ sudo certutil -L -d /etc/pki/pki-tomcat/alias

Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,Cu,Cu
auditSigningCert cert-pki-ca                                 u,u,u
ocspSigningCert cert-pki-ca                                  u,u,u
Server-Cert cert-pki-ca                                      u,u,u
subsystemCert cert-pki-ca                                    u,u,u

On the working IPA replica:

$ sudo certutil -L -d /etc/pki/pki-tomcat/alias

Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,Cu,Cu
Server-Cert cert-pki-ca                                      u,u,u
auditSigningCert cert-pki-ca                                 u,u,Pu
ocspSigningCert cert-pki-ca                                  u,u,u
subsystemCert cert-pki-ca                                    u,u,u

Note the P trust attribute in the latter. What does it mean? From man certutil:

-t trustargs
           Specify the trust attributes to modify in an existing certificate
           or to apply to a certificate when creating it or adding it to a
           database. There are three available trust categories for each
           certificate, expressed in the order SSL, email, object signing for
           each trust setting. In each category position, use none, any, or
           all of the attribute codes:

           ·   p - Valid peer

           ·   P - Trusted peer (implies p)

           ·   c - Valid CA

           ·   C - Trusted CA (implies c)

           ·   T - trusted CA for client authentication (ssl server only)
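A one-letter difference between two listings is easy to miss by eye. The following is a minimal Python sketch, not part of the original troubleshooting, that parses two certutil -L listings and reports where the trust attributes differ. The function names (parse_listing, trust_differences) and the sample constants are illustrative assumptions, and the parser assumes only that the nickname and trust columns are separated by at least two spaces.

```python
import re

# Column order as printed in the certutil -L header.
CATEGORIES = ("SSL", "S/MIME", "JAR/XPI")

# Sample listings copied from the two replicas shown above.
BROKEN = """\
Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,Cu,Cu
auditSigningCert cert-pki-ca                                 u,u,u
ocspSigningCert cert-pki-ca                                  u,u,u
Server-Cert cert-pki-ca                                      u,u,u
subsystemCert cert-pki-ca                                    u,u,u
"""

WORKING = BROKEN.replace(
    "auditSigningCert cert-pki-ca                                 u,u,u",
    "auditSigningCert cert-pki-ca                                 u,u,Pu",
)

# A data row is a nickname, two or more spaces, then three comma-separated
# flag fields. Header and blank lines do not match this pattern.
ROW = re.compile(r"^(\S.*?)\s{2,}([A-Za-z]*),([A-Za-z]*),([A-Za-z]*)\s*$")


def parse_listing(text):
    """Map each certificate nickname to its trust flags per category."""
    certs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            certs[match.group(1)] = dict(zip(CATEGORIES, match.groups()[1:]))
    return certs


def trust_differences(first, second):
    """List (nickname, category, first_flags, second_flags) for each mismatch."""
    diffs = []
    for nickname in sorted(first.keys() & second.keys()):
        for category in CATEGORIES:
            a, b = first[nickname][category], second[nickname][category]
            if a != b:
                diffs.append((nickname, category, a, b))
    return diffs


if __name__ == "__main__":
    for diff in trust_differences(parse_listing(BROKEN), parse_listing(WORKING)):
        print(diff)
```

In practice the two strings would come from saving the output of sudo certutil -L on each replica to a file and reading the files in.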

I modified the trust attributes of the certificate accordingly:

$ sudo certutil -M -t ',,P' -d /etc/pki/pki-tomcat/alias -n 'auditSigningCert cert-pki-ca'

I tried restarting FreeIPA again:

$ sudo ipactl restart
Stopping pki-tomcatd Service
Restarting Directory Service
Restarting krb5kdc Service
Restarting kadmin Service
Restarting httpd Service
Restarting ipa-custodia Service
Restarting ntpd Service
Restarting pki-tomcatd Service
Restarting ipa-otpd Service
ipa: INFO: The ipactl command was successful

It worked!

But why? What does the trust attribute for JAR/XPI mean? Per the man page excerpt above, the third position is the object signing category, but I don’t really know how it is used here. I suppose it means that the Java code we’re running should trust the certificate. Since I didn’t have this problem when I upgraded the working replica, I’m guessing that I must have done something to change it (and break it) along the way. It likely had nothing to do with the CentOS updates I applied; I just happened to run into the problem after restarting FreeIPA post-updates.

FreeIPA 4.5.0 upgrade fails

I recently ran the usual sudo yum update, which included an upgrade of FreeIPA from version 4.4 to version 4.5. However, the upgrade reported that it failed, so I tried to run it manually afterwards:

$ sudo ipa-server-upgrade

The command above timed out, so I ran it again in verbose mode:

$ sudo ipa-server-upgrade --verbose

With the extra output, I saw it was getting stuck on the following before timing out:

ipa: DEBUG: wait_for_open_ports: localhost [8080, 8443] timeout 300

Neither 8080 nor 8443 is on the list of FreeIPA protocols/ports:

TCP 80, 443: HTTP/HTTPS
TCP 389, 636: LDAP/LDAPS
TCP 88, 464: kerberos
TCP 53: DNS
UDP 88, 464: kerberos
UDP 53: DNS
UDP 123: NTP

So what was going on?

I checked to see if those ports were listening:

$ sudo lsof -i TCP -P
...
TCP *:8080 (LISTEN)
TCP *:8443 (LISTEN)
...

$ sudo lsof -i TCP
...
TCP *:webcache (LISTEN)
TCP *:pcsync-https (LISTEN)
...

The ports were listening, so why the timeout? Fortunately I found someone else describing a similar problem on the FreeIPA mailing list, with this reply from Alexander Bokovoy, one of the FreeIPA developers:

I’m a bit tired to repeat this multiple times but FreeIPA does require IPv6 stack to be enabled in the kernel. We absolutely do. If you don’t use IPv6 stack, disable it on specific interfaces. However, there is a practical problem with the way how glibc DNS resolver works: in default configuration it always prefers IPv6 answers to IPv4 because this is actually a policy of RFC3484. As result, if you have ::1 in /etc/hosts, it will be returned first. If you don’t have ::1 on any of your interfaces (‘lo’ is a typical one), then apps cannot contact ::1 (localhost) even if those apps that use IPv6 bind to all interfaces.

FreeIPA uses modern APIs provided by glibc to listen on both IPv6 and IPv4. It simply means that FreeIPA servers bind to IPv6 addresses (on all interfaces or on a specific one, if needed) and treat IPv4 as mapped ones because IPv6 and IPv4 share the same port space on the same machine. This works transparently thanks to glibc and is a recommended way to write networking applications. See man ipv6(7) for details.

(Source)

Following that suggestion, I deleted the following line from /etc/hosts:

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

Sure enough, that solved the problem! This seems like a good idea for anyone who has disabled IPv6, even if you’re not running FreeIPA. Why include an entry in your hosts file that you know doesn’t work?
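To check other machines for the same condition, something like the following Python sketch could list which hosts-file names map to an IPv6 address. It is an illustration under stated assumptions, not part of the original fix: the function names are made up for this example, and the 127.0.0.1 line in the sample is an assumed typical IPv4 counterpart to the ::1 line quoted above.

```python
import ipaddress

# Sample hosts file. The ::1 line is the one quoted above; the 127.0.0.1 line
# is an assumption added for illustration.
SAMPLE_HOSTS = """\
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
"""


def addresses_for(hosts_text, name):
    """Return the addresses a hosts file assigns to `name`, in file order."""
    addresses = []
    for line in hosts_text.splitlines():
        # Drop comments, then split into the address and its names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and name in fields[1:]:
            addresses.append(fields[0])
    return addresses


def ipv6_addresses_for(hosts_text, name):
    """Return only the IPv6 addresses that a hosts file assigns to `name`."""
    return [
        address
        for address in addresses_for(hosts_text, name)
        if ipaddress.ip_address(address).version == 6
    ]


if __name__ == "__main__":
    with open("/etc/hosts") as hosts_file:
        found = ipv6_addresses_for(hosts_file.read(), "localhost")
    print("IPv6 entries for localhost:", found or "none")
```

If this reports ::1 on a host where the loopback interface has no IPv6 address, that host is a candidate for the same timeout.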

I had another FreeIPA server, this one running on an underpowered VM (1GB RAM, 1 CPU). Even after removing the IPv6 entry from /etc/hosts, the FreeIPA upgrade seemed to fail. The certmonger/dogtag processes would consume all of the system resources and it would freeze. All memory was in use and nearly all of the swap space as well. The CPU was running at 5000%. I gave up on it and ignored it for a while, and it turned out that letting it sit and process for a while helped. It seems to have worked itself out: FreeIPA is at the latest version and system resource utilization is very low.

Nagios check_disk returns DISK CRITICAL – /sys/kernel/config is not accessible: Permission denied

I enabled Nagios checks for free disk space on a group of servers today, and was hit with alerts containing the following error message:
DISK CRITICAL - /sys/kernel/config is not accessible: Permission denied

If you are looking for a solution, skip to the end. Some of my mistakes before finding the solution may be interesting though!


Let’s Encrypt: certbot error “No vhost exists with servername or alias of”

It’s about time–or rather, years past time–I enabled HTTPS for this site. I decided to try Let’s Encrypt. It wasn’t as turnkey as I expected, so I’ve included some notes here in case anyone else has similar issues.

The Let’s Encrypt site suggested installing Certbot and included specific instructions for using Certbot with Apache on CentOS 7. It suggested that a single command might do the trick:

$ sudo certbot --apache

Unfortunately, I received a couple of error messages. Certbot was ultimately able to create the certificate for me, but it was unable to update my Apache configuration. An excerpt of the output of the certbot command is below:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
No names were found in your configuration files. Please enter in your domain
name(s) (comma and/or space separated) (Enter 'c' to cancel):osric.com,www.osric.com
...
No vhost exists with servername or alias of: osric.com (or it's in a file with multiple vhosts, which Certbot can't parse yet). No vhost was selected. Please specify ServerName or ServerAlias in the Apache config, or split vhosts into separate files.
Falling back to default vhost *:443...
No vhost exists with servername or alias of: www.osric.com (or it's in a file with multiple vhosts, which Certbot can't parse yet). No vhost was selected. Please specify ServerName or ServerAlias in the Apache config, or split vhosts into separate files.
Falling back to default vhost *:443...
...
No vhost selected

IMPORTANT NOTES:
- Unable to install the certificate
...

I’m guessing it’s because my Apache virtual host configuration is in /etc/httpd/conf/vhosts/chris/osric.com instead of the expected location.

I looked at the certbot documentation hoping to find a way to pass the certbot command the path to my virtual host configuration file. I did not find an option to do that. The logs at /var/log/letsencrypt/letsencrypt.log are fairly verbose, but they still do not indicate which files or directories certbot looked at when trying to find my Apache configuration.

I noted that /etc/letsencrypt/options-ssl-apache.conf contains Apache directives. I thought maybe I could just include it in my config file using Apache’s Include directive, e.g.:

Include /etc/letsencrypt/options-ssl-apache.conf

I restarted Apache using systemctl (I know, I should be using apachectl restart instead):

$ sudo systemctl restart httpd
Job for httpd.service failed because the control process exited with error code. See "systemctl status httpd.service" and "journalctl -xe" for details.

Two problems there. First, options-ssl-apache.conf appears to be a generic file with no data specific to the host or cert. Second, I had just added it to a VirtualHost directive listening on port 80.

I duplicated the VirtualHost directive in my config file at /etc/httpd/conf/vhosts/chris/osric.com and made a few modifications and additions:

<IfModule mod_ssl.c>
<VirtualHost 216.154.220.53:443>
...all the directives from the port 80 VirtualHost...
SSLEngine on
SSLCertificateFile /etc/letsencrypt/live/osric.com/cert.pem
SSLCertificateKeyFile /etc/letsencrypt/live/osric.com/privkey.pem
SSLCertificateChainFile /etc/letsencrypt/live/osric.com/chain.pem
</VirtualHost>
</IfModule>

I restarted Apache:

$ sudo apachectl restart

The server restarted, but still did not respond to HTTPS requests. It didn’t appear to be listening on 443:

$ curl https://www.osric.com
curl: (7) Failed connect to www.osric.com:443; Connection refused

As a sanity check, I confirmed that mod_ssl was indeed installed:

$ yum list mod_ssl
Installed Packages
mod_ssl.x86_64 1:2.4.6-45.el7.centos @base

And I checked to confirm that Apache was loading mod_ssl:

$ cat /etc/httpd/conf.modules.d/00-ssl.conf
LoadModule ssl_module modules/mod_ssl.so

I looked at some other Apache configurations where I knew SSL was working and I noted the Listen directive:

Listen 443

I added that line to the top of my configuration file at /etc/httpd/conf/vhosts/chris/osric.com, above the VirtualHost directive. I restarted Apache and it worked!
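The missing Listen directive is the kind of thing a quick script can catch. Below is a rough Python sketch, under stated assumptions, that scans configuration text for VirtualHost ports that have no matching Listen line. The function name unlistened_ports is made up for this example, the Listen 80 line and the port 80 VirtualHost in the sample are assumed rather than copied from my real configuration, and the sketch does not follow Include directives, so it only sees the text it is given.

```python
import re

# Sample configuration. The port 443 block mirrors the one above; the
# "Listen 80" line and the port 80 block are assumptions for illustration.
SAMPLE_CONFIG = """\
Listen 80
<VirtualHost 216.154.220.53:80>
</VirtualHost>
<IfModule mod_ssl.c>
<VirtualHost 216.154.220.53:443>
SSLEngine on
</VirtualHost>
</IfModule>
"""


def unlistened_ports(config_text):
    """Return VirtualHost ports that have no Listen directive in the text."""
    listen_ports = set()
    vhost_ports = set()
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        listen = re.match(r"listen\s+(\S+)", line, re.IGNORECASE)
        if listen:
            # Handles both "Listen 443" and "Listen 192.0.2.1:443".
            listen_ports.add(listen.group(1).rsplit(":", 1)[-1])
            continue
        vhost = re.match(r"<virtualhost\s+([^>]+)>", line, re.IGNORECASE)
        if vhost:
            for address in vhost.group(1).split():
                port = address.rsplit(":", 1)[-1]
                if ":" in address and port.isdigit():
                    vhost_ports.add(port)
    return sorted(vhost_ports - listen_ports, key=int)


if __name__ == "__main__":
    print("VirtualHost ports without Listen:", unlistened_ports(SAMPLE_CONFIG))
```

Running it against the concatenated contents of the main configuration file and the virtual host files would give a more complete picture than checking a single file.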

FreeIPA: updating client hostname

I recently updated some CentOS 7 hostnames to better reflect their status as cattle, not pets. Part of renaming the hosts meant updating the hosts in my FreeIPA environment. Red Hat’s Identity Management Guide to Renaming Machines confirms there’s no easy way to update a hostname. You need to un-enroll and re-enroll the client.

Un-enroll:
# ipa-client-install --uninstall

Re-enroll:
# ipa-client-install --domain=osric.net --server=freeipa.osric.net --realm=FREEIPA.OSRIC.NET --principal=admin --password=T0ps3CR3T --mkhomedir -U --hostname=www-dev-01.osric.net

Error:
Kerberos authentication failed: kinit: Cannot read password while getting initial credentials

I searched for the error and found a blog post suggesting that the password had expired. Sure enough, when I checked the FreeIPA web interface, it showed that the password for the admin user had expired. I reset it via the web interface.

I tried again, using the new password:
# ipa-client-install --domain=osric.net --server=freeipa.osric.net --realm=FREEIPA.OSRIC.NET --principal=admin --password=M0r3s3CR3Ts! --mkhomedir -U --hostname=www-dev-01.osric.net

It failed with the same error message!

When I checked /var/log/ipaclient-install.log it indicated that the password was still expired. Resetting the password via the web interface forces the user to set a new password at the next login — the password expires immediately!

I ran kinit admin on the command line and used the temporary password to log in and set a new password. Then the command to re-enroll the server worked without any errors.

certutil: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format.

I was attempting to view the certificate for my FreeIPA server:

$ certutil -L -n 'IPA CA' -d /etc/dirsrv/slapd-FREEIPA-OSRIC-NET/
certutil: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format.

That had me worried. Was my cert/key database corrupt? Turns out, I just didn’t have permission to read the files. It worked when I tried it with sudo:

$ sudo certutil -L -n 'IPA CA' -d /etc/dirsrv/slapd-FREEIPA-OSRIC-NET/

That produced the expected output.

The old, unsupported format error is produced in a variety of cases and is often not helpful or informative. Permissions are just one reason why you might run into this message. Other reasons I’ve found include specifying a directory that does not contain the expected cert database files (i.e. cert8.db, key3.db, and secmod.db), or specifying a directory that does not exist.
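Since the error message does not say which of those causes applies, a small pre-check can narrow it down. The Python sketch below is an illustration only: diagnose_nss_dir is a hypothetical helper name, and it checks just the three legacy file names listed above. Newer SQLite-format NSS databases use different file names (cert9.db, key4.db, pkcs11.txt), so the tuple would need adjusting for those.

```python
import os

# The legacy database file names mentioned above.
LEGACY_DB_FILES = ("cert8.db", "key3.db", "secmod.db")


def diagnose_nss_dir(path):
    """Return a list of likely problems with an NSS database directory.

    An empty list means none of the three causes described above was found.
    """
    if not os.path.isdir(path):
        return [f"{path}: directory does not exist"]
    if not os.access(path, os.R_OK | os.X_OK):
        return [f"{path}: directory is not accessible by the current user"]
    problems = []
    for name in LEGACY_DB_FILES:
        full_path = os.path.join(path, name)
        if not os.path.exists(full_path):
            problems.append(f"{name}: missing")
        elif not os.access(full_path, os.R_OK):
            problems.append(f"{name}: not readable by the current user")
    return problems


if __name__ == "__main__":
    for problem in diagnose_nss_dir("/etc/dirsrv/slapd-FREEIPA-OSRIC-NET/"):
        print(problem)
```

Run as a regular user, this should point at unreadable files or an inaccessible directory in the case I hit; run with sudo, it should come back empty if the files are all present.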

VirtualBox static IP address on a host-only network

I have a number of CentOS 7 servers that comprise a FreeIPA domain on a VirtualBox host-only network. Whenever I start a server though, it is liable to pick up an IP address that I’ve already assigned to another server (which is currently powered off) in /etc/hosts.

How do I assign it a specific static IP address?

In CentOS 7, you can use the Network Manager Text User Interface (nmtui) to edit the network settings. Here’s the first thing I tried, which wasn’t quite right:

# nmtui

  • Edit a connection
  • Select a connection, e.g. enp0s3
  • IPv4 Configuration
  • Change from Automatic to Manual
  • Select Show
  • Enter 192.168.56.109/32 for addresses
  • Enter 192.168.56.255 for the gateway

When I used those settings, it didn’t work. No route to host, etc. I looked at the network interface settings via a different method:

# ip addr show

The brd (broadcast) address listed was the same as my IP address, 192.168.56.109, which was unexpected and probably why it wasn’t working!

I ran nmtui again and changed the address from 192.168.56.109/32 to 192.168.56.109/24 and it worked.

Because the /32 suffix is interpreted as the subnet mask, it created a subnet containing a single address, so the broadcast address was the same as the IP address. Specifying a subnet mask of /24 creates a subnet with 256 addresses and a broadcast address of 192.168.56.255 (the same as was listed for the other machines on the virtual network that were using DHCP).
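The arithmetic can be confirmed with Python’s standard ipaddress module. This is just a quick sketch to illustrate the difference between the two prefix lengths; the describe function is a name made up for this example.

```python
import ipaddress


def describe(cidr):
    """Summarize the subnet implied by an address/prefix such as 10.0.0.5/24."""
    network = ipaddress.ip_interface(cidr).network
    return {
        "network": str(network),
        "addresses": network.num_addresses,
        "broadcast": str(network.broadcast_address),
    }


if __name__ == "__main__":
    for cidr in ("192.168.56.109/32", "192.168.56.109/24"):
        print(cidr, describe(cidr))
```

With /32 the network, the host, and the broadcast address all collapse into 192.168.56.109, which matches what ip addr show reported.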

VMware vSphere CLI vmware-cmd and the cfg parameter

I have a VMware ESXi host. I can manage it through VMware Fusion, although the options seem limited (I’m used to using vCenter Server, but I don’t have the license for that in this environment). I thought I’d give the VMware vSphere Command Line Interface (CLI) a try. This was a mistake, but if you insist on following me down the same path, see the Drivers and Tools section on the VMware vSphere Downloads page to get started.

First I tried vmware-cmd.

C:\Program Files (x86)\VMware\VMware vSphere CLI>vmware-cmd
'vmware-cmd' is not recognized as an internal or external command,
operable program or batch file.

The actual file is vmware-cmd.pl (it’s in the bin folder).

I was able to run one command, to list the virtual machines on the host:
C:\Program Files (x86)\VMware\VMware vSphere CLI>vmware-cmd.pl -H esxi.osric.net -l
Enter username: chris
Enter password:

The documentation I was looking at was probably outdated, as the newer documentation gives better examples. But the version I was looking at indicated that most of the other commands require a <config_file_path> or <cfg> parameter. Unfortunately, it does not specify what those values consist of or what they might look like. There was a hint in the docs in vmware-cmd Overview:

vmware-cmd is a legacy tool and supports the usage of VMFS paths for virtual machine configuration files. As a rule, use datastore paths to access virtual machine configuration files.

It appears that <cfg> is the path to the VMX. There are several different ways to specify this:

Full path using GUID
C:\Program Files (x86)\VMware\VMware vSphere CLI>vmware-cmd.pl -H esxi.osric.net /vmfs/volumes/272c880d-a89548c1-a530-4bccbbad9507/benvolio/benvolio.vmx uptime
Enter username: chris
Enter password:
getuptime() = 7193

(The GUID is displayed in the output of the list of virtual machines from vmware-cmd.pl -l.)

Full path using Datastore Name
C:\Program Files (x86)\VMware\VMware vSphere CLI>vmware-cmd.pl -H esxi.osric.net "/vmfs/volumes/test vms/benvolio/benvolio.vmx" uptime
Enter username: chris
Enter password:
getuptime() = 7578

Datastore Name + relative path
C:\Program Files (x86)\VMware\VMware vSphere CLI>vmware-cmd.pl -H esxi.osric.net "[test vms] benvolio/benvolio.vmx" uptime
Enter username: chris
Enter password:
getuptime() = 7822

Entering my username and password every time is tedious though. According to the Connection Options for vmware-cmd:

The vmware-cmd vCLI command supports only a specific set of connection options. Other vCLI connection options are not supported, for example, you cannot use variables because the corresponding option is not supported.

In this case, I have the vSphere CLI installed on a password-protected Windows Server 2012 R2 virtual machine, so I didn’t feel it was too much of a risk to set a temporary environment variable to store some of the connection options:

C:\Program Files (x86)\VMware\VMware vSphere CLI>SET VMOPTIONS=-H esxi.osric.net -U chris -P t0u6hpa55w0rd
C:\Program Files (x86)\VMware\VMware vSphere CLI>vmware-cmd.pl %VMOPTIONS% -l

Remember how the documentation said that “vmware-cmd is a legacy tool”?

I’m not sure what the official replacement is–possibly the PowerShell-based VMware vSphere PowerCLI–but it turns out that the vSphere Client is free. Accessing your ESXi host via HTTPS should provide a link to download the installer. The vSphere Client does not appear to be something you can script against or automate, but for simple tasks it is much easier to use than vmware-cmd.pl.

FreeIPA: Could not chdir to home directory /home/bbilliards: no such file or directory

I recently installed a FreeIPA server and a FreeIPA client. I generated a Kerberos ticket for a test user, Bob Billiards, on the IPA server:

# kinit bbilliards
Password for bbilliards@IPA.OSRIC.NET:

Then I attempted to ssh into the IPA client as that user. The connection was successful, but it could not find the user’s home directory:

# ssh bbilliards@ariel.osric.net
bbilliards@ariel.osric.net's password:
Could not chdir to home directory /home/bbilliards: no such file or directory

The location of the home directory was set when I created the user, as can be seen here:

# ipa user-find bbilliards
--------------
1 user matched
--------------
  User login: bbilliards
  First name: Bob
  Last name: Billiards
  Home directory: /home/bbilliards
  Login shell: /bin/sh
  Principal name: bbilliards@IPA.OSRIC.NET
  Principal alias: bbilliards@IPA.OSRIC.NET
  Email address: bbilliards@ipa.osric.net
  UID: 1110200001
  GID: 1110200001
  SSH public key fingerprint: [redacted]
  Account disabled: False
----------------------------
Number of entries returned 1
----------------------------

Shouldn’t the system be able to create the home directory automatically? It turns out it can, if you specify the --mkhomedir switch when installing the IPA client:

# ipa-client-install --mkhomedir

Now when I ssh into the machine it creates a home directory:

# ssh bbilliards@ariel.osric.net
Creating home directory for bbilliards
-sh-4.2$ pwd
/home/bbilliards

You may prefer to mount a Network File System (NFS) directory as a home directory instead so that users have the same home directories across machines.

Error: Cannot contact any KDC for realm while getting initial credentials

I’ve been testing FreeIPA on a small network of CentOS 7 hosts (all virtual machines running in VirtualBox on a host-only network). After installing the IPA server on one host and creating the realm (IPA.OSRIC.NET), I installed the IPA client on one of the other hosts and tried running kinit:

# kinit admin
kinit: Cannot contact any KDC for realm 'IPA.OSRIC.NET' while getting initial credentials

Searching for that error brought me to Kinit won’t connect to a domain server. Although that did not describe the same issue, it did point me to the /etc/krb5.conf file. The realms section looked like it was missing something:

[realms]
  IPA.OSRIC.NET = {
    pkinit_anchors = FILE:/etc/ipa/ca.crt

  }

I added a kdc attribute:

[realms]
  IPA.OSRIC.NET = {
    kdc = prospero.osric.net:88
    pkinit_anchors = FILE:/etc/ipa/ca.crt
 
  }

No restart of any service was necessary. I ran kinit again and it worked:

# kinit admin
Password for admin@IPA.OSRIC.NET:

According to the krb5.conf documentation on realms:

kdc
The name or address of a host running a KDC for that realm. An optional port number, separated from the hostname by a colon, may be included.

I’m a Kerberos novice, but that seems like a necessary property. I’m not sure why the IPA client setup did not include it. I have a few more virtual machines to install the client on, so I’ll soon find out whether that behavior is consistent on subsequent installations.
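To check those other clients quickly, a small script could flag realms in krb5.conf that have no kdc entry. The Python sketch below is a simplified illustration, not a full krb5.conf parser: realms_without_kdc is a name made up for this example, and it only understands the brace layout shown above. A realm without a kdc line is not necessarily broken, because Kerberos clients can also locate KDCs through DNS SRV records when DNS lookup is enabled, so treat the result as a hint rather than a verdict.

```python
# Sample [realms] sections copied from the before and after snippets above.
BEFORE = """\
[realms]
  IPA.OSRIC.NET = {
    pkinit_anchors = FILE:/etc/ipa/ca.crt

  }
"""

AFTER = """\
[realms]
  IPA.OSRIC.NET = {
    kdc = prospero.osric.net:88
    pkinit_anchors = FILE:/etc/ipa/ca.crt

  }
"""


def realms_without_kdc(conf_text):
    """Return the names of realms in [realms] that define no kdc entry."""
    missing = []
    in_realms = False
    depth = 0          # brace nesting depth inside the [realms] section
    current = None     # realm currently being read
    has_kdc = False
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("["):
            in_realms = line == "[realms]"
            continue
        if not in_realms:
            continue
        if line.endswith("{"):
            if depth == 0:
                current = line.split("=", 1)[0].strip()
                has_kdc = False
            depth += 1
        elif line == "}":
            depth -= 1
            if depth == 0:
                if not has_kdc:
                    missing.append(current)
                current = None
        elif depth == 1 and line.split("=", 1)[0].strip() == "kdc":
            has_kdc = True
    return missing


if __name__ == "__main__":
    with open("/etc/krb5.conf") as conf_file:
        print("Realms without a kdc entry:", realms_without_kdc(conf_file.read()))
```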