Managing KernelCare with Puppet

KernelCare

If you haven’t felt it before, you did when Dirty Cow hit. The Linux kernel is rock solid and proven, but it also has security issues. In this case: root rights for everyone! And on top of that, this bug is so trivially easy to exploit (several proofs of concept are out there that can easily be converted into a live, working gun) that you had to update your kernels. On every server. And reboot.

The last part is especially evil because a reboot will be noticed by your customers if you are not employing some high-availability setup. And in the world of web hosting this is mostly not the case. So every reboot means downtime and costs time and money. Plus, you have to update your servers in due time and plan said downtime accordingly. But for all this to happen your distribution must build and provide you with updates first. You can’t install non-existent patches.

Enter KernelCare.

KernelCare is a product from the folks that bring you CloudLinux, and it solves all of the above problems. It consists of a kernel module that loads additional kernel patches for your kernel version and applies them in real time. It checks for available updates every 4 hours (via cron), and patches are made available blazingly fast. To pick up the Dirty Cow example again, their incident reaction chart sums it up: you are days ahead. In a situation where remote root exploits are a thing, days can kill you.

Let’s rather kill the bugs.


According to their official documentation the right way to install KernelCare is:

rpm -i https://downloads.kernelcare.com/kernelcare-latest.x86_64.rpm
/usr/bin/kcarectl --register KEY

With only two commands per server and some supplying of keys you can get up and running, no reboot required. Wait, what? Manual labor? Per server? I am thinking hundreds of servers to patch and maintain; manual “something” is not a thing.

Let’s rather kill the bugs with style.

Enter Puppet

Once your company (or hobby project) reaches a certain size, manually configuring servers is out of the question. You want fully automated configuration and package management. This is where puppet comes in: You describe how your servers should be and puppet makes it happen. Puppet is a very large piece of software that is ‘easy to learn, difficult to master.’ And for everything you want it to do, you need a puppet module.

And KernelCare just screams for one.

Caveat

At this point, you know what puppet is, and you are using it. Explaining those points is outside the scope of this post and would blow this out of proportion. So if you have a running puppet setup, read on!

The Goal

The puppet module for KernelCare, kcare for short, should handle the usual: import the GPG keys for the public KernelCare repository, create the repository file, install the package from the newly created repository (and verify it by GPG key) and register the instance with a supplied key. Oh, and we want monitoring, lots of it. Also: Uninstalling! And while uninstalling it must de-register the system. All that without any manual intervention.

Overview

Feel free to skip this section if you don’t care about the guts of the puppet module. It is not required for installation. Just skip ahead to “Running”. The class starts off trivially with

class al_kcare (

  # Disabled by default.
  $enabled = false,

  # The License Key should be retrieved via encrypted hiera.
  $license = false,

  # If monitoring should be enabled.
  $monitoring = true

) inherits al_kcare::params {

which defines variables that can be overridden via hiera.

  • enabled: if the class should install & register KernelCare or not.
  • license: the license key used to register your system
  • monitoring: whether monitoring should be enabled. It only takes effect if the class itself is enabled, tho.

Next, we check for virtualization:

  case ($::virtual) {
    'physical': {
      # Physical Servers (bare metal) are no issue.
      $real_enabled = $enabled
    }

    'kvm': {
      # KVM run their own kernels, too.
      $real_enabled = $enabled
    }

    'openvzve': {
      # VZ Container are dependent on their Host Kernel, can't act.
      $real_enabled = false
    }

    'xen0': {
      # XenServer should not be touched by KernelCare.
      $real_enabled = false
    }

    'xenu': {
      # Xen Clients: Own Kernels, rock on!
      $real_enabled = $enabled
    }

    'xenhvm': {
      # Xen Clients: Own Kernels, rock on!
      $real_enabled = $enabled
    }

    default: {
      notify {"Your virtualisation environment ${::virtual} is unknown to me :/":}
      $real_enabled = false
    }
  }
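
As an aside, Puppet lets several case values share one branch, so the same decision could be written more compactly. This is only a sketch and not what the module ships; it assumes a Puppet version with the `$facts` hash, and `$enabled` here is a stand-in for the class parameter of the same name:

```puppet
# Hypothetical stand-in for the class parameter of the same name.
$enabled = true

case $facts['virtual'] {
  # Platforms that boot their own kernel share one branch.
  'physical', 'kvm', 'xenu', 'xenhvm': {
    $real_enabled = $enabled
  }

  # Containers (host kernel) and the Xen host itself are left alone.
  'openvzve', 'xen0': {
    $real_enabled = false
  }

  default: {
    notify { "Your virtualisation environment ${facts['virtual']} is unknown to me :/": }
    $real_enabled = false
  }
}
```

The behavior is the same as the long form; it just makes it easier to see at a glance which platforms are allowed.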

The reason is rather simple: For KernelCare to work it needs to be able to modify the running kernel by loading modules. Plus, not all kernels are supported. So we need either a physical server or a virtualization that runs its own kernel inside the guest VM. We forcefully set the enabled flag to false if we are running inside an unsupported VM. With all that, we kick the class off:

  # Install.
  if ($::os['architecture'] == 'x86_64') {
    # Only run on 64 Bit machines.

    if ($::al_kcare::canrun) {
      # We only run on supported systems.
      include ::al_kcare::repo
      include ::al_kcare::install
      include ::al_kcare::register
      include ::al_kcare::monitoring
    }

  } else {

    notify {'You enabled KernelCare on a 32bit machine. This is not supported.':}

  }

Notice the check on the architecture; KernelCare only supports 64bit operating systems, see http://docs.kernelcare.com/. Skipping the repository and GPG setup as well as the installation of the package, registering a system is where it gets interesting again:

class al_kcare::register {

  if ($::al_kcare::real_enabled) {
    # We want kernelcare to run.

    if ($::al_kcare::license) {
      # and we have a license, register it.

We only attempt to register a system with the license servers if this module is marked as active and we have a license key. If we do, we run this exec():

      exec { 'register-kernelcare':
        user    => 'root',
        command => "${::al_kcare::kcare_bin} --register ${::al_kcare::license} && echo 'REGDONE' > ${::al_kcare::register_file}",
        creates => $::al_kcare::register_file,
        require => Class['al_kcare::install']
      }

This calls the kcarectl executable with --register and the (decrypted) key. It’s a little bit idiotic, tho: This actually works:

kcarectl --register thiskeyisnotvalid && date
Invalid Key
Sun Nov  6 13:41:11 CET 2016

Running kcarectl with an invalid key will naturally not register it. But the exit code of this command is 0, meaning ‘success’ *sigh*. So if you supply a wrong key, the module will mark the server as registered and never try again. The monitoring check will alert you that said server is running a trial license, tho. Just remove the ‘/etc/sysconfig/kcare/registered’ file and re-run puppet to try again (or disable the class and re-enable it).
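
One way to work around the misleading exit code would be to inspect the output as well. The following is a sketch only and not part of the module: it assumes that kcarectl prints the text ‘Invalid Key’ exactly as shown above (stdout or stderr), and the resource title and the three helper variables are made up for illustration. The binary path and the marker file are the ones mentioned in this post; the license key is a placeholder.

```puppet
# Values taken from this post; the license key is a placeholder.
$kcare_bin     = '/usr/bin/kcarectl'
$register_file = '/etc/sysconfig/kcare/registered'
$license       = 'YOUR-LICENSE-KEY'

# Run the registration, keep its exit code and output.
$run   = "out=\$(${kcare_bin} --register ${license} 2>&1); rc=\$?; echo \"\$out\""
# Fail on a non-zero exit code OR on the 'Invalid Key' message.
$check = "if [ \$rc -ne 0 ] || echo \"\$out\" | grep -q 'Invalid Key'; then exit 1; fi"
# Only reached when both checks passed: write the marker file.
$mark  = "echo 'REGDONE' > ${register_file}"

exec { 'register-kernelcare-checked':
  user      => 'root',
  provider  => shell,
  path      => ['/usr/bin', '/bin'],
  command   => "${run}; ${check}; ${mark}",
  creates   => $register_file,
  logoutput => on_failure,
}
```

With this variant a wrong key makes the exec fail on every puppet run until the key is fixed, instead of silently writing the marker file.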

Running

To get yourself up and running with the puppet module you need to do only a few things. First, you need to clone the repository into the modules directory of your puppet server. Since you are already running a puppet infrastructure, you should know how this works. The repository is on github: https://github.com/christianreiss/al_kcare, clone it with:

git clone https://github.com/christianreiss/al_kcare.git

This should create a new directory named al_kcare. Move this into your modules directory, which by default is located here:

/etc/puppetlabs/code/environments/production/modules

So you should have the directory /etc/puppetlabs/code/environments/production/modules/al_kcare. Got it? Good. Now add the module to your setup/servers. This is a step that differs with each setup. Some work with manifests, some with classes-include in hiera, others with roles. But again, this is like telling you to turn on the light in your room. I can’t tell you where the switch is; I have never been to your room. But chances are high you did this before. If you haven’t, you are running an empty puppet server that does nothing 🙂
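
For a setup that works with plain manifests, a minimal sketch could look like the following. The node name and the license key are placeholders; the three parameters are the ones the class defines:

```puppet
# site.pp (hypothetical node definition; node name and key are placeholders)
node 'web01.example.com' {
  class { 'al_kcare':
    enabled    => true,
    license    => 'YOUR-LICENSE-KEY',
    monitoring => true,
  }
}
```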

I will however explain by using hiera and classes-include. My hiera has this hierarchy:

:hierarchy:
  - pre
  - nodes/%{::fqdn}
  - domain/%{::domain}
  - os/%{::operatingsystem}
  - os/%{::operatingsystem}/%{::operatingsystemmajrelease}
  - common

As I want KernelCare to run on all CloudLinux and CentOS servers, I am adding this to the os/CentOS.yaml and os/CloudLinux.yaml files:

classes:
  - al_kcare
al_kcare::enabled: true

This loads the class itself and enables it. It is disabled by default to not blindly roll it out everywhere (hello license limit). There is also a common.yaml file that applies to all servers out there. In that file I supply my license key so I don’t have to keep this information separately in every other yaml file:

al_kcare::license: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIB...]

As you can see I am using encrypted yaml hiera (eyaml, I highly recommend this). It is however no requirement. You can supply your valuable license key in there unencrypted. This also means that you can use different license keys by host, host groups, operating systems, IPs, data center locations… The limit is your imagination.
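
For the classes array in the hiera files to have any effect, something has to include the classes listed there. How that is wired up depends on your setup; a common approach, sketched here under the assumption of a Puppet version that has the lookup() function, is a single line in site.pp:

```puppet
# site.pp (assumed, typical wiring for a hiera 'classes' array)
node default {
  # Merge the 'classes' arrays from all hierarchy levels and include each
  # class found; fall back to an empty list if the key is not set anywhere.
  lookup('classes', Array[String], 'unique', []).include
}
```

The al_kcare::enabled and al_kcare::license keys need no extra wiring; puppet looks up class parameters in hiera on its own.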

So what happens now?

On the next puppet run on the Servers you will see this output on every(!) Server that you enabled the module for:

Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for kcare-test.int.alpha-labs.net
Info: Applying configuration version '1478434291'
Notice: /Stage[main]/Al_kcare::Repo/Yumrepo[kernelcare]/ensure: created
Info: changing mode of /etc/yum.repos.d/kernelcare.repo from 600 to 644
Notice: /Stage[main]/Al_kcare::Install/Package[kernelcare]/ensure: created
Notice: /Stage[main]/Al_kcare::Register/Exec[register-kernelcare]/returns: executed successfully
Notice: /Stage[main]/Al_kcare::Monitoring/File[/usr/lib64/nagios/plugins/contrib.d/check_kcare.sh]/ensure: defined content as '{md5}467019b31498b548881e4e9d0984cbd0'
Notice: Applied catalog in 12.54 seconds

Wow. Let’s walk through it:

  1. The first 6 Info lines are the puppet agent connecting to the puppet server, fetching its catalog of things to do and preparing its run. This happens on every run of puppet.
  2. Al_kcare::Repo/Yumrepo[kernelcare]/ensure: created: The repository has been created. You now have a ‘/etc/yum.repos.d/kernelcare.repo’ file.
  3. The next Info line sets the permissions of the repository file to sane defaults.
  4. Al_kcare::Install/Package[kernelcare]/ensure: created: Installs the kernelcare package (binaries) from the repository.
  5. Al_kcare::Register/Exec[register-kernelcare]/returns: executed successfully: Your server has been registered with the KernelCare license servers. Note: This step is omitted if you did not supply a license key via hiera.
  6. Al_kcare::Monitoring/File[/usr/lib64/nagios/plugins/contrib.d/check_kcare.sh]/ensure: The monitoring script is installed.

Depending on the state of your servers the output might vary; more or fewer lines may appear and more or fewer actions may be performed. The GPG key for the KernelCare repository, for example, already existed on my test server, so that action was not done; it will, however, run on a server on your end.

Uninstalling is equally trivial:

Just set al_kcare::enabled: back to false (or remove/comment the line in your hiera) and run puppet again:

Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for kcare-test.int.alpha-labs.net
Info: Applying configuration version '1478434224'
Notice: /Stage[main]/Al_kcare::Repo/Yumrepo[kernelcare]/ensure: removed
Notice: /Stage[main]/Al_kcare::Register/Exec[unregister-kernelcare]/returns: executed successfully
Notice: /Stage[main]/Al_kcare::Install/Package[kernelcare]/ensure: removed
Notice: /Stage[main]/Al_kcare::Monitoring/File[/usr/lib64/nagios/plugins/contrib.d/check_kcare.sh]/ensure: removed
Notice: Applied catalog in 10.90 seconds

As you can see, the entire process is reversed. If you supplied a key before and the system was registered at some point, it will be unregistered first and the binary uninstalled after. Same goes for the repository and all files installed. Your system will be in the exact same state as it was before installing the module.
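
To give an idea of what the de-registration step could look like, here is a sketch. It is not copied from the module: the resource title is made up, and it assumes that kcarectl accepts --unregister and that the marker file written during registration is the right thing to test for.

```puppet
# Paths as mentioned in this post.
$kcare_bin     = '/usr/bin/kcarectl'
$register_file = '/etc/sysconfig/kcare/registered'

exec { 'unregister-kernelcare-sketch':
  user     => 'root',
  provider => shell,
  path     => ['/usr/bin', '/bin'],
  # De-register and drop the marker so a later re-enable registers again.
  command  => "${kcare_bin} --unregister && rm -f ${register_file}",
  # Only act on systems that were registered by puppet in the first place.
  onlyif   => "test -f ${register_file}",
}
```

In a real catalog this exec would also have to be ordered before the removal of the package, since the binary is gone afterwards.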

A word on Monitoring

This class creates monitoring for kernelcare (for each server). It also brings some files needed for monitoring. But this class does not do monitoring by itself. You need a monitoring module for that. But since we are using native nagios types in puppet, using any module should work great. You need to make some adjustments to your monitoring server, tho.

Add this to your nrpe.conf file:

command[check_kcare]=sudo /usr/local/sbin/check_kcare.sh
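
On the puppet side, a service definition built on the native nagios types could look roughly like this. Treat it as a sketch under assumptions, not as the module’s code: the resource title, the service description, the check_nrpe command and the generic-service template are all assumed names that have to match your own monitoring setup.

```puppet
# Hypothetical service definition using the native nagios_service type.
# On newer Puppet versions this type comes from a separate module.
nagios_service { "check_kcare_${facts['networking']['fqdn']}":
  ensure              => present,
  host_name           => $facts['networking']['fqdn'],
  service_description => 'KernelCare',
  # Assumes a check_nrpe command that takes the remote command name as argument.
  check_command       => 'check_nrpe!check_kcare',
  # Assumes a service template of this name exists on the monitoring server.
  use                 => 'generic-service',
}
```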

Summed up

If you cloned the repository and activated it, you can now manage 10, 100 or 1000s of servers with the same amount of work: none. You simply define a set of servers you want kernelcare to run on, and it will do just that. If your needs ever change (the need for kcare fades) you can easily remove it again, with the same amount of work: none 🙂

I hope this module helps you deploy kernelcare on many, many servers, and that the option to uninstall it will remove any mental hurdles to trying it in the first place.

-Christian.

This Article was featured at the official KernelCare blog at https://www.cloudlinux.com

Christian

Touched base with Linux back in 1995 and have been hooked on it ever since. I have been using Linux both privately and at the office for two decades. Working as a System Administrator at a medium-sized hosting company, I get in touch with all kinds of trouble, all of which can be solved with Linux. In my blog I am sharing solutions to problems that I had to search for myself, in the hope that someone else out there might find them useful.
