I'll confess to being a bit late to the game on picking up puppet, but now that I've finally jumped in I'm completely hooked. Put simply, puppet is a piece of software, written in ruby, which allows machines to pull configuration information from a central "puppetmaster."
First, a little background, and an explanation of why I've fallen in love with the idea of such a system.
Why I use puppet
I currently manage a relatively small environment. I have about 15 physical servers and about a dozen xen guests. I'd long assumed that puppet - or its spiritual predecessor, cfengine - would be a poor fit in my situation. After all, I'm not managing seas of identical boxes - most of these machines have several unique aspects which they do not share with anything else.
I had assumed that all of the true commonalities would be taken care of at kickstart/jumpstart time, and that modifications to these commonalities would be few and far between. If they needed to change, I would change them manually. Not a big deal.
Except that's not how it works in practice. You just can't keep everything the same manually when you're dealing with more than one machine, and at some point you will want to change things everywhere and you will mess up. So, when I tweaked my system config to use kerberos for pam authentication instead of LDAP, I changed it on the kickstart, and I changed it everywhere I remembered - but I missed some boxes.
And you know what? I didn't even realize this until I moved this config into puppet.
It goes beyond this, though. It's not only about making sure the commonalities are preserved across machines and that changes are kept in sync. Even in situations where you really do have a unique configuration - something that only matters in one place - you very well might need to duplicate the setup later. There are so many little things that are easy to do without giving them much thought - all the countless permission changes, account creations and directory creations that you do now, and that you sure as hell won't remember in 5 years. This is especially true if you're no longer working there and some other guy needs to replicate your work.
Puppet gives you the chance to codify all of this, and combined with subversion or git you actually have a change control mechanism for server state. Need to add a mail alias? Who cares if you don't think you'll need it elsewhere - put it in puppet and check it in to svn. Now you have not only the recipe needed to recreate this configuration elsewhere, but also a record of the change and (if you comment in your svn commit) the reason why it was changed.
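As a rough sketch of what that mail alias might look like, here is a resource using puppet's "mailalias" type. The alias name and recipient address are made up for illustration, and depending on your puppet version this type may ship separately from the core.

```puppet
# Hypothetical alias and recipient, for illustration only.
mailalias { "backups":
  ensure    => present,
  recipient => "sysadmin@example.com",
}
```

Checked in alongside a commit message, a few lines like these serve as both the change and its documentation.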
The puppet language itself is so concise that it's easy to see what you've done, even if you failed to document it anywhere. In effect, the mere act of making a change now becomes documentation. That's incredibly powerful.
As well, puppet often forces you to do things the right way. Puppet is really good at managing things - as long as you do the right things. A prime example here is in package management - puppet can easily ensure that you have the appropriate RPM (or sun pkg, or debian apt, or gentoo emerge, etc) packages installed as defined in your puppet configuration. Simply add the definition to puppet, and the package will be installed.
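A minimal sketch of such a definition, using stunnel purely as an example package name:

```puppet
# Puppet picks a package provider (yum, apt, pkg and so on) that suits the
# platform it is running on, so the same definition can be shared across
# machines, as long as the package name itself is the same on each.
package { "stunnel":
  ensure => installed,
}
```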
Now, this is great, until you run into a piece of software that hasn't been packaged - say, a perl module. In the past, it would be really tempting to just fire up CPAN and let it do whatever the hell it is that CPAN does, installing the perl module wherever it sees fit. But puppet knows nothing of CPAN - if you use CPAN, you work against puppet. The "right way" is (and always has been) to build RPMs (or whatever your native package is) and maintain your own repository, but puppet practically forces you to do this. Once you start trusting puppet for everything, you start doing everything in a way that's more maintainable and predictable as a side effect - and that makes you better at your job.
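To illustrate, assuming the perl module has been built as an RPM and published in a local yum repository, something along these lines would let puppet manage it. The repository name, URL and package name here are all hypothetical.

```puppet
# Hypothetical local repository. The baseurl is single-quoted so that
# $releasever and $basearch are left for yum to expand, not puppet.
yumrepo { "local":
  descr   => "Locally built packages",
  baseurl => 'http://repo.example.com/centos/$releasever/$basearch/',
  enabled => "1",
}

# Hypothetical RPM of a perl module, installed only once the repo is defined.
package { "perl-Net-SNMP":
  ensure  => installed,
  require => Yumrepo["local"],
}
```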
That, in a nutshell, is why I use puppet. Now, onto how I use it.
Puppet guts
My initial assumptions about puppet were that it was basically a dumb configuration file repository - that you throw confs in the puppet master and they get slurped down by the clients, potentially modified by some templating mechanism where a config needs to vary slightly across multiple environments. Indeed, this is a supported (and sometimes necessary) way of distributing configuration information to puppet clients, but after digging in a bit more I realized that there's usually a better way.
Puppet goes beyond the simple "fileserver with templates" paradigm to, effectively, provide an abstraction layer that can describe aspects of a UNIX system in its own dialect. Configuration information is primarily written using the "puppet language," utilizing special "types" which are ruby classes capable of mapping the puppet language into raw configuration details needed by systems. Where these types are inadequate, one can do other lower-level tricks, like directly executing UNIX commands or inserting raw data directly into files.
This is a bit cumbersome to describe, but the following example should help make this more apparent:
service { [ "stunnel" ]:
  enable    => true,
  ensure    => running,
  subscribe => File[stunnelconf],
}
The "service" type comes with puppet, and it's an abstraction of - surprise - services. It takes many potential arguments, but in my case I'm calling it on a service named "stunnel" and defining "enable" as "true", "ensure" as "running", and "subscribe" as "File[stunnelconf]". In this context, that means that I want the service enabled on boot, that the service should be running (or made to run if it's not) when puppet runs, and that when the "File" resource named "stunnelconf" changes the daemon should be restarted (thus if the configuration changes you need not do a manual restart).
The magic in this is that "service" is smart enough to handle a wide array of different mechanisms for launching and monitoring states of services. On CentOS machines, the puppet "service" type will manage the service with a combination of calling init scripts and running the redhat-specific "chkconfig" mechanism. On a Solaris 10 box, however, this same type would manage stunnel through the SMF system, calling the svcadm utility (or possibly hooking directly into the API - I'm not sure). The beauty here is that the puppet "service" type knows all of this, and the wildly different systems are presented to you as exactly the same construct in the puppet language. I no longer need to care about the differing underlying mechanisms - I tell puppet I want the service turned on, and it does all of the actual work for me.
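The "File[stunnelconf]" reference in the stunnel example has to point at a file resource declared somewhere else. A sketch of what that resource could look like follows. The path and template name are assumptions, not my actual config.

```puppet
# Assumed path and template name, for illustration only. The alias is what
# lets the service refer to this resource as File[stunnelconf].
file { "/etc/stunnel/stunnel.conf":
  content => template("stunnel/stunnel.conf.erb"),
  owner   => "root",
  group   => "root",
  mode    => "0644",
  alias   => "stunnelconf",
}
```

When puppet rewrites this file, the subscribed service gets restarted on the same run.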
Things that can be managed with the included puppet "types" include user accounts, groups, yum repositories, packages, file permissions, cron jobs, mail aliases... well, there are quite a few of them, and the puppet type reference goes into great detail on their capabilities.
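To give a feel for a few of these types, here is a small sketch that manages a group, a user and a cron job. Every name, ID and path in it is invented for the example.

```puppet
# All names, IDs and paths below are hypothetical.
group { "deploy":
  ensure => present,
  gid    => "2001",
}

user { "deploy":
  ensure     => present,
  uid        => "2001",
  gid        => "deploy",
  home       => "/home/deploy",
  shell      => "/bin/bash",
  managehome => true,
  require    => Group["deploy"],
}

# Runs the (hypothetical) script daily at 02:15 as the deploy user.
cron { "rotate-app-logs":
  command => "/usr/local/bin/rotate-app-logs",
  user    => "deploy",
  hour    => "2",
  minute  => "15",
}
```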
Now, it would be nice to have native types for every resource, but understandably there are many occasions where no type is available. You could create your own puppet type in ruby to handle such a situation, but this would take a chunk of time and it might not be worth the extra effort.
Luckily, the puppet language itself gives you enough tools to abstract configuration elements through the use of templates and the included "file" type. It's not quite as powerful as writing your own full-fledged type, but it's also much more straightforward and much easier to implement.
As an example, here's a snippet of how I pull in my snmp config:
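file { "/etc/snmp/snmpd.conf":
  # template() concatenates the output of each template it is given
  content => template("snmp/snmpd.conf.erb","snmp/$snmpextra.erb"),
  # quoted so that newer puppet versions, which require a string, accept it
  mode    => "0644",
  alias   => snmpconf,
}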
That "$snmpextra" thing is a puppet variable. In my case here, I have a base snmpd.conf.erb file, which is an ERB template that contains my most basic snmp config. However, I also have an optional additional template which is appended if the $snmpextra variable is defined. In this way, I can keep one "master" configuration, but I can add additional local configurations as needed. Note that the ERB templates themselves can contain ruby code that inserts text based on puppet variables or facts, but they need not do so - they could be a simple configuration file copied directly from a working config.
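One way the "only if defined" part could be wired up is sketched below. This is not my exact config: the "snmp" class name, the node names and the "database" value are assumptions, and the sketch relies on a variable set in a node being visible to the classes that node includes.

```puppet
# Hypothetical class wrapping the snmpd.conf file resource.
class snmp {
  # An unset variable evaluates as false here. On newer puppet versions
  # with strict variable checking this test may need adjusting.
  if $snmpextra {
    $snmpcontent = template("snmp/snmpd.conf.erb", "snmp/${snmpextra}.erb")
  } else {
    $snmpcontent = template("snmp/snmpd.conf.erb")
  }

  file { "/etc/snmp/snmpd.conf":
    content => $snmpcontent,
    mode    => "0644",
    alias   => "snmpconf",
  }
}

# Hypothetical nodes: only the first one gets the extra template appended.
node "db1.example.com" {
  $snmpextra = "database"
  include snmp
}

node "web1.example.com" {
  include snmp
}
```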
In case you're wondering what a "fact" is, it's a snippet of system information provided by puppet's "facter" helper utility. Just as puppet types can abstract configuration directives, facter is a standalone utility that's used to abstract the gathering of system metadata. Whenever puppet is run, facter collects a series of "facts" about a system, and these facts can be used to make decisions in the puppet language or within ERB templates.
So, for example, here I check for the $operatingsystem fact and include a different class based on that fact:
class legato::client {
  # pick the OS-specific subclass based on the $operatingsystem fact
  case $operatingsystem {
    centos:  { include legato::client::centos }
    solaris: { include legato::client::solaris }
  }
}
Note that, in the puppet language, a "class" is not like a "class" in object oriented programming - rather, it describes a bundle of configuration directives, and you can apply them with the "include" statement. In this snippet, I pull in the legato::client::centos class for centos machines, and the legato::client::solaris class for Solaris machines. In cases where there is no native puppet type, you can manage operating system specific details in this way.
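For completeness, here is a guess at what one of those operating system specific classes might contain. The package and service names are assumptions for illustration, so check what your Legato client actually installs before borrowing them.

```puppet
# Hypothetical contents for the CentOS-specific class.
class legato::client::centos {
  # Assumed RPM name for the backup client.
  package { "lgtoclnt":
    ensure => installed,
  }

  # Assumed init script name; only started once the package is in place.
  service { "networker":
    ensure  => running,
    enable  => true,
    require => Package["lgtoclnt"],
  }
}
```

The Solaris class would follow the same shape with its own package and service names.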
Conclusion
That's the basic gist of what puppet can do and how I use it, but there are many details that are documented on its excellent wiki, which you really should read if you're interested in the software. I highly recommend it, even if you're only dealing with a handful of systems - I've come to rely on puppet, not only to help me to get things done, but also to make sure I do them the right way.