Git is a powerful tool, and one that I feel Ops folks could use more extensively. Unfortunately, although git has good documentation and excellent tutorials, it mostly assumes that you're working on a software project; the needs of operations are subtly (but vitally) different.
I suppose this is what "devops" is about, but I'm reluctant to use that buzzword. What I can tell you is this: if you take the time to integrate git into your processes, you will be rewarded for your patience.
There is tons of information about how git works and how to accomplish specific tasks in git. This isn't about that; this is a real-world example with every single git command included. I only barely hint at what git is capable of, but I hope that I give you enough information to get started with using git in your puppet environment. If you have any specific questions, don't worry, because help is out there.
A trivial workflow with git for accountability and rollbacks
I've long used git for puppet (and, indeed, any other text files) in a very basic way. My workflow has been, essentially:
1) navigate to a directory that contains text files that might change
2) turn it into a git repo with:
git init; git add *; git commit -a -m "initial import"
3) whenever I make changes, I do:
git add *; git commit -a -m "insert change reason here"
This simple procedure solves several problems at once: you get accountability for changes, the ability to roll back, and the ability to review differences between versions.
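For example, reviewing and rolling back look something like this (a minimal sketch; the commit hash and file path are hypothetical):
git log --oneline                           # who changed what, and when
git diff HEAD~1                             # review the most recent change
git checkout a1b2c3d -- manifests/site.pp   # restore one file from an older commit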
One very attractive feature of git (as compared to svn or cvs) is that one can create a git repo anywhere, and one needs no backing remote repository or infrastructure to do so; the overhead of running git is so minimal that there's no reason not to use it.
This workflow, though, does not scale. It's fine and dandy when you have one guy working on puppet at a time, but multiple people can easily step on each other's changes. Furthermore, although you have the ability to roll back, you're left with a very dangerous way to test changes: that is, you have to make them on the live puppet instance. You can roll your config back, but by the time you do so it might already be too late.
A simple workflow with git branches and puppet environments
It's time to move beyond the "yes, I use version control" stage into the "yes, I use version control, and I actually test changes before pushing them to production" stage.
Enter the puppet "environment" facility - and a git workflow that utilizes it. Puppet environments allow you to specify an alternate configuration location for a subset of your nodes, which provides an ideal way for us to verify our changes; instead of just tossing stuff into /etc/puppet and praying, we can create an independent directory structure for testing and couple that with a dedicated git branch. Once satisfied with the behavior in our test environment, we can then apply those changes to the production environment.
The general workflow
This workflow utilizes an authoritative git repository for the puppet config, with clones used for staging, production, and ad-hoc development. This git repository will contain multiple branches; of particular import will be a "production" branch (which will contain your honest-to-goodness production puppet configuration) and a "staging" branch (which is used to verify changes before they go live). Puppet will be configured to use two or more locations on the filesystem (say, /etc/puppet and /etc/puppet-staging) which will be cloned from the central repository and will correspond to branches therein. All changes to puppet should be verified by testing a subset of nodes against the configuration in /etc/puppet-staging (on the "staging" branch); once you're satisfied with the results, those changes are merged into the "production" branch and ultimately pulled into /etc/puppet.
Here's what it looks like:
/opt/puppet-git: authoritative git repository. I will refer to it by this location but in your deployment it could be anywhere (remote https, ssh, whatever). Contains at a minimum a "production" branch and a "staging" branch, but optionally may contain many additional feature branches. Filesystem permissions must allow read/write access for anybody who will contribute to your puppet configuration
/etc/puppet-staging: git repository cloned from /opt/puppet-git that always has the "staging" branch checked out. Filesystem permissions must allow read/write access for anybody who will push changes to staging (consider limiting this to team leads or senior SAs)
/etc/puppet: git repository cloned from /opt/puppet-git that always has the "production" branch checked out. Filesystem permissions must allow read/write access for anybody who will push changes from staging to production (again, consider limiting this to team leads or senior SAs)
The key element here is the authoritative repository (/opt/puppet-git). Changes to the production repository (in /etc/puppet) should never be made directly; rather, you will 'git pull' the production branch from the authoritative repository. The staging repository (and the "staging" branch) is where changes must occur first; when QA is satisfied, the changes from the staging branch will be merged into the production branch, the production branch will be pushed to the authoritative repository, and the production repository will pull those changes into it.
Why do I have three git repositories?
You might be saying to yourself: self, why do I have all of these repositories? Can't I just use the repository in /etc/puppet or /etc/puppet-staging as my authoritative repository? Why have the intermediary step?
There are a couple of reasons for this:
One, you can use filesystem permissions to prevent accidental (or intentional) modification directly to the /etc/puppet or /etc/puppet-staging directories. For example, the /etc/puppet repository may be writeable only by root, but the /etc/puppet-staging repository may be writeable by anybody in the puppet group. With this configuration anybody in the puppet group can mess with staging, but only somebody with root can promote those changes to production.
Two, some git operations (e.g. merging and rebasing) require that you do some branch switcheroo voodoo, and (at least in production) we can't have our repository in an inconsistent state while we're doing so. Furthermore, git warns against pushing to a branch that is currently checked out in a non-bare repository; by using a central repository, we don't have to deal with this issue.
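As a rough sketch of the permission setup described in the first point (the "puppet" group name here is just an example):
chgrp -R puppet /etc/puppet-staging
chmod -R g+w /etc/puppet-staging
cd /etc/puppet-staging; git config core.sharedRepository group
The core.sharedRepository setting tells git to keep newly created objects group-writable, so team members don't trip over each other's file ownership.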
Of course, one advantage of git is its sheer flexibility. You might decide that the staging repository would make a good authoritative source for your configuration, and that's totally fine. I only present my workflow as an option that you can use; it's up to you to determine which workflow fits best in your environment.
Initial prep work
Step 0: determine your filesystem layout and configure puppet for multiple environments
RPM versions of puppet default to using /etc/puppet/puppet.conf as their configuration file. If you've already been using puppet, you likely use /etc/puppet/manifests/site.pp and /etc/puppet/modules/ as the locations of your configuration. You may continue to use this as the default location if you wish.
In addition to the "production" configuration, we must specify additional puppet environments. Modify puppet.conf to include a section named for each "environment" you wish to use. For example, my puppet.conf is as follows:
[main]
... snip ...
manifest = /etc/puppet/manifests/site.pp
modulepath = /etc/puppet/modules
[staging]
manifest = /etc/puppet-staging/manifests/site.pp
modulepath = /etc/puppet-staging/modules
You may configure multiple, arbitrary environments in this manner. For example, you may have per-user environments in home directories:
[jeremy]
manifest = /home/jeremy/puppet/manifests/site.pp
modulepath = /home/jeremy/puppet/modules
It is also possible to use the $environment variable itself to allow for arbitrary puppet environments. If you have a great many puppet administrators, that may be preferable to specifying a repository for each administrator individually.
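For example, a sketch of that approach (the environments directory and its layout are an assumption, and each environment name would need its own checkout under it, including production and staging if you go this route):
[main]
manifest = /etc/puppet/environments/$environment/manifests/site.pp
modulepath = /etc/puppet/environments/$environment/modules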
Step 1: create your authoritative repository
If you're coming from the trivial usage of git that I outlined at the start of this post, you already have a repository in /etc/puppet.
If you're already using git for your puppet directory, just do the following:
cp -rp /etc/puppet /opt/puppet-git
If you aren't already using git, that's no problem; do the cp just the same, and then:
cd /opt/puppet-git; git init .; git add *; git commit -a -m "initial import"
Step 2: set up branches in your authoritative repository
Now we have our new central repository, but we need to add the branches we need:
cd /opt/puppet-git; git branch production; git branch staging
Do note that I didn't "check out" either of those branches here; I'm just leaving puppet-git on "master" (which in truth we'll never use).
NB: you might consider making this a "bare" repository, as it's never meant to be modified directly
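If you'd rather go that route, something like this would work in place of the cp and branch commands above (a sketch, assuming /etc/puppet is already a git repository):
git clone --bare /etc/puppet /opt/puppet-git
cd /opt/puppet-git; git branch production; git branch staging
A bare repository has no working copy to get out of sync, which also makes the "never push to a checked-out branch" concern moot.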
Step 3: set up your "staging" git repository
As configured in step 0, we have an environment where we can test changes in puppet, but right now there's no configuration there (in such a case, nodes in the "staging" environment will use the default puppet configuration). We need to populate this environment with our existing configuration; let's create a clone of our git repo:
git clone /opt/puppet-git /etc/puppet-staging
We now have our copy of the repository, including both of its branches. Let's switch to the "staging" branch:
cd /etc/puppet-staging; git checkout staging
Step 4: set up your "production" git repository
This is essentially the same as step 3, with one twist - we already have something in /etc/puppet. While it's possible to turn /etc/puppet into a git repository with the proper remote relationship to our authoritative repository, I find it's easiest to just mv it out of the way and do a new checkout. Be sure to stop your puppet master while you do this!
service puppetmaster stop; mv /etc/puppet /etc/puppet.orig; git clone /opt/puppet-git /etc/puppet;
cd /etc/puppet; git checkout production; service puppetmaster start
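To double-check that nothing was lost in the move, a quick (and admittedly blunt) comparison of the old and new trees doesn't hurt (a sketch; GNU diff assumed):
diff -r --exclude=.git /etc/puppet.orig /etc/puppet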
Workflow walkthrough
In this configuration, it is assumed that all changes must pass through the "staging" branch and be tested in the "staging" puppet environment. People must never directly edit files in /etc/puppet, or they will cause merge headaches. They should also never do any complex git operations from within /etc/puppet; instead, these things must be done either through per-user clones or through the staging clone, and then pushed up to the authoritative repository once complete.
This may sound confusing, but hopefully the step-by-step will make it clear.
Step 0: set up your own (user) git repository and branch
While optional for trivial changes, this step is highly recommended for complex changes, and almost required if you have multiple puppet administrators working at the same time. This gives you a clone of the puppet repository on which you are free to work without impacting anybody else.
First, create your own clone of the /opt/puppet-git repository:
git clone /opt/puppet-git /home/jeremy/puppet
Next, create and switch to your own branch:
cd ~/puppet/; git checkout -b jeremy
In the sample puppet.conf lines above, I've already enabled an environment that refers to this directory, so we can start testing nodes against our changes by setting their environment to "jeremy".
Step 1: update your local repository
This is not needed after a fresh clone, but it's a good idea to frequently track changes on your local puppet configuration to ensure a clean merge later on. To apply changes from the staging branch to your own branch, simply do the following:
cd ~/puppet/; git checkout staging; git pull; git checkout jeremy; git rebase staging
This will ensure that all changes made to the "staging" branch in the authoritative repository are reflected in your local repository.
NB: I like to use "rebase" on local-only branches, but you may prefer "merge." In any case that I mention "rebase", "merge" should work too
EDIT: I originally had a 'git pull --all' in this example; use 'git fetch --all' instead
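If you'd rather merge, the Step 1 update above looks much the same with the rebase swapped out:
cd ~/puppet/; git checkout staging; git pull; git checkout jeremy; git merge staging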
Step 2: make changes in your working copy and test
Now, you can make changes locally to your heart's content, being sure to use 'git add' and 'git commit' frequently (since this is a local branch, none of these changes will impact anybody else). Once you have made changes and you're ready to test, you can selectively point some of your development systems at your own puppet configuration; the easiest way to do so is to simply set the "environment" variable to (e.g.) "jeremy" in the node classifier. If you aren't using a node classifier, first, shame on you! Second, you can also make this change in puppet.conf.
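If you go the puppet.conf route, a minimal sketch on the test node would be (assuming a puppet version recent enough to use the [agent] section):
[agent]
environment = jeremy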
Step 3: ensure (again) that you're up to date, and commit your changes
If you're happy with the changes you've made, commit them to your own local branch:
cd ~/puppet; git add *; git commit -a -m "my local branch is ready"
Then, just like before, we want to make sure that we're current before we try to merge our changes to staging:
cd ~/puppet; git fetch --all; git checkout staging; git pull; git checkout jeremy; git rebase staging
BIG FRIGGIN WARNING: This is where stuff might go wrong if other people were changing the same things you changed, and committed them to staging before you. You need to be very certain that this all succeeds at this point.
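If the rebase does stop on a conflict, stay calm; roughly speaking (the conflicted file here is hypothetical):
git status                  # see which files are conflicted
# edit the conflicted file(s) to resolve the markers, then:
git add manifests/site.pp
git rebase --continue
# or, if you'd rather back out and rethink:
git rebase --abort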
Step 4: merge your branch into "staging"
Now that you have a valid puppet configuration in your local repository, you must apply this configuration to staging. In an environment with well defined processes, this step may require authorization by a project manager or team lead who will be the "gatekeeper" to the staging environment.
At the very least, let your team members know that staging is frozen while you test your changes. Changes to staging (and by extension production) must be done serially. The last thing you want is multiple people making multiple changes at the same time.
When you're ready, first merge your branch to the staging branch in your local repository:
cd ~/puppet; git checkout staging; git merge jeremy
Assuming that the merge is clean, push it back up to the central repository:
git push
Step 5: update puppet's staging configuration
Now the central repository has been updated with our latest changes and we're ready to test on all "staging" nodes. On the puppet server, go to your staging directory and update from the authoritative repo:
cd /etc/puppet-staging; git pull
Step 6: test changes on staging puppet clients
If you live in the world of cheap virtual machines, free clones, and a fully staffed IT department, you'll have a beautiful staging environment where your puppet configuration can be fully validated by your QA team. In the real world, you'll probably add one or two nodes to the "staging" environment and, if they don't explode within an hour or two, it's time to saddle up.
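A quick way to exercise those nodes is to run the agent by hand against the staging environment (a sketch; on older puppet versions the command is puppetd --test instead):
puppet agent --test --environment staging --noop   # show what would change without applying it
puppet agent --test --environment staging          # apply it for real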
If you have to make a minor change at this point, you may directly edit the files in /etc/puppet-staging, commit them with 'git commit -a', and then perform a 'git push' to put them in the authoritative repo; if you have a rigid change control procedure in place, you may need to roll back staging and go all the way back to step 1.
As a general rule: try to keep staging as close to production as possible and if possible only test one change in staging at a time. Don't let multiple changes pile up in staging; push to staging only when you're really ready. If a lot of people are waiting to make changes, they should confine them to their own branches until such a time as staging has "caught back up" to production.
Step 7: apply changes to production
Once staging has been verified, you need to merge into production. Again, this step may require authorization from a project manager or team lead who will sign off on the final changes.
Although it is possible to do this directly from the git repo in /etc/puppet-staging, I recommend that you use your own clone so as to leave staging in a consistent state throughout the process.
Start by again updating from the authoritative repo:
cd ~/puppet; git fetch --all; git checkout staging; git pull; git checkout production; git pull
At this point you should manually verify that there aren't any surprises lurking in staging that you don't expect to see:
git diff staging
If everything looks good, apply your changes and send them up to the repo:
git merge staging && git push
Step 8: final pull to production
If you're dead certain that nobody will ever monkey around directly in /etc/puppet, you can just pull down the changes and you're done:
cd /etc/puppet; git pull
That's fine and dandy if everybody follows process, but it may cause trouble if anybody has mucked around directly in /etc/puppet. To be sure that nothing unexpected is going on, you may want to use a different procedure to verify the diff first:
cd /etc/puppet; git fetch; git diff origin/production
Once satisfied, apply the changes to the live configuration:
git merge origin/production
That's great, but what's the catch?
Oh, there's always a catch.
I think the process outlined here is sound, but when you add the human element there will be problems. Somebody will eventually edit something in /etc/puppet and un-sync the repos. Somebody will check something crazy into staging when you're getting ready to push your change to production. Somebody will rebase when they should've merged and merge conflicts will blow up in your face. A million other tiny problems lurk around the corner.
The number one thing to remember with git: stay calm, because you can always go back in time. Even if something goes to hell, every step of the way has a checkpoint.
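When that day comes, the reflog is the usual way back (a sketch; the hash is hypothetical):
git reflog                  # find the last known-good state
git reset --hard a1b2c3d    # put the current branch back there (avoid on branches others have already pulled)
# or, to undo a bad commit without rewriting history:
git revert a1b2c3d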
If you follow the procedure (and I don't mean "this" procedure; I just mean "a" procedure), your chance of pain is greatly reduced. Git is version control without training wheels; it will let you shoot yourself in the foot. It's up to you to follow the procedure, because git will not save you from yourself.