Steven Jewel Blog

08 Mar 2014

Puppet-Undo and Puppet-Audit

Configuration management tools, such as Puppet, Chef, and Ansible, are a great step forward for system administration. My main exposure is to Puppet, so I will write using its terms, but I believe the concepts I discuss apply to other systems.

Setting up a new system with Puppet is simple, but if a module is removed from a system, extra work is needed to undo all the changes that were made. For example, a module might say:

class webserver {
  # Install Apache from the distribution's package repository.
  package {
    'apache2':
      ensure => installed;
  }
}

To remove apache later, perhaps because you're switching to nginx, it isn't enough to delete the apache2 resource from the manifest. You'd need to explicitly declare it absent:

class webserver {
  package {
    'nginx':
      ensure => installed;
    # Apache has to be removed explicitly; dropping the resource would leave it installed.
    'apache2':
      ensure => absent;
  }
}

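At best, this is an annoyance and extra work. As modules get more complicated, a forgotten cleanup step can cause machines that should be configured identically to drift out of sync over time, causing unexpected problems. A machine that once applied the old manifest still has apache2 installed, while a freshly built one never did.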

My proposal is that Puppet should remember all actions it has previously taken. When a new manifest comes in that no longer asks for an action, it should reverse the action. This means each "type" in Puppet will need to define an undo action.
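The bookkeeping this would take can be sketched in a few lines of Python. This is only an illustration, not Puppet code: the Resource class, the UNDO table, and the plan_undo function are names invented for the example, and a real implementation would let each type define its own undo logic rather than use a single lookup table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A simplified stand-in for one applied Puppet resource (hypothetical)."""
    type: str    # e.g. "package"
    title: str   # e.g. "apache2"
    ensure: str  # e.g. "installed"


# Hypothetical per-type undo table: the "ensure" value that reverses an action.
UNDO = {
    ("package", "installed"): "absent",
    ("service", "running"): "stopped",
    ("file", "present"): "absent",
}


def plan_undo(previous, current):
    """Return reversal resources for everything that was applied in the
    previous run but is no longer mentioned in the current manifest."""
    still_managed = {(r.type, r.title) for r in current}
    plan = []
    for r in previous:
        if (r.type, r.title) in still_managed:
            continue  # still declared, nothing to undo
        reverse = UNDO.get((r.type, r.ensure))
        if reverse is None:
            continue  # no known undo; a real tool should report this instead
        plan.append(Resource(r.type, r.title, reverse))
    return plan


# The apache-to-nginx switch from the example above.
previous = [Resource("package", "apache2", "installed")]
current = [Resource("package", "nginx", "installed")]
undo_plan = plan_undo(previous, current)
print(undo_plan)
```

With this in place, the second manifest would not need to mention apache2 at all. Its disappearance from the manifest is what triggers the removal.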

Taking it to the next level

An even better system would pair Puppet with an audit tool, so that the system's manifest declares everything that should be present on a system. The tool wouldn't remove offenders, but would instead print out a report of them.

To make this work, it is important to look at where the files on a typical server come from.

The tool would look at all files on the system, and report anything that didn't ultimately come from Puppet. For example, if a package was installed but Puppet doesn't call for it, the package would be listed. The audit tool would be smart enough to not list all the files installed by that package as well. A whitelist can be added to each Puppet module as an array of globs of files that are expected to exist because of the module.
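As a rough sketch of how such a report could be computed, here is a small Python example. Everything in it is an assumption made for illustration: the audit function, its arguments, and the sample data are hypothetical, and a real tool would get the package-to-file mapping from the package manager and the managed lists from the compiled Puppet catalog.

```python
from fnmatch import fnmatchcase


def audit(installed, on_disk, wanted_packages, managed_files, whitelist):
    """Report packages and files that did not ultimately come from Puppet.

    installed:       dict mapping package name -> set of files it owns
    on_disk:         iterable of every file path found on the system
    wanted_packages: package names the manifest asks for
    managed_files:   file paths the manifest manages directly
    whitelist:       globs of files expected to exist because of a module
    """
    extra_packages = sorted(set(installed) - set(wanted_packages))

    # Files owned by any package are never listed individually, so an
    # unwanted package shows up once rather than as hundreds of files.
    owned = set().union(*installed.values())

    extra_files = sorted(
        path for path in on_disk
        if path not in owned
        and path not in managed_files
        and not any(fnmatchcase(path, glob) for glob in whitelist)
    )
    return extra_packages, extra_files


# Made-up sample data for a server that is supposed to run only nginx.
installed = {
    "nginx": {"/usr/sbin/nginx"},
    "apache2": {"/usr/sbin/apache2"},
}
on_disk = [
    "/usr/sbin/nginx",
    "/usr/sbin/apache2",
    "/etc/nginx/nginx.conf",
    "/var/log/nginx/access.log",
    "/root/notes.txt",
]
extra_packages, extra_files = audit(
    installed,
    on_disk,
    wanted_packages={"nginx"},
    managed_files={"/etc/nginx/nginx.conf"},
    whitelist=["/var/log/nginx/*"],
)
print("Unmanaged packages:", extra_packages)
print("Unmanaged files:", extra_files)
```

Here apache2 is reported as a package, its binary is not repeated in the file list, the nginx logs are covered by the module's whitelist, and only the stray file under /root is flagged.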

This would let you take a server that isn't managed by Puppet and add rules until it is completely managed by Puppet. It would also let you catch manual changes that are made to a server after it was supposedly being completely managed by Puppet.

Caveats

I realize this proposal addresses a problem that a lot of configuration management users don't have. When you are working with virtual machines on EC2, for example, it is common to deploy them from scratch with each reboot, so undo would be nice but not a necessity, and an audit is never needed.

In my experience with smaller businesses, there are often physical, on-premises servers that need to be managed and where periodic redeploys aren't practical. There may also be dozens of one-off virtual servers (i.e. snowflake servers) that aren't important enough to be standardized. For these, an audit tool would make it easier to create trustworthy Puppet rules so that they can be redeployed from scratch if something goes wrong.

Comments or corrections to blogcomment@ this domain.