
Git is a powerful tool, and one that I feel Ops folks could use more extensively. Unfortunately, although git has good documentation and excellent tutorials, it mostly assumes that you're working on a software project; the needs of operations are subtly (but vitally) different.

I suppose this is what "devops" is about, but I'm reluctant to use that buzzword. What I can tell you is this: if you take the time to integrate git into your processes, you will be rewarded for your patience.

There is a ton of information out there about how git works and how to accomplish specific tasks in git. This isn't about that; this is a real-world example with every single git command included. I only barely hint at what git is capable of, but I hope I give you enough information to get started with using git in your puppet environment. If you have any specific questions, don't worry, because help is out there.

A trivial workflow with git for accountability and rollbacks

I've long used git for puppet (and, indeed, any other text files) in a very basic way. My workflow has been, essentially:

1) navigate to a directory that contains text files that might change

2) turn it into a git repo with:

git init; git add *; git commit -a -m "initial import"

3) whenever I make changes, I do:

git add *; git commit -a -m "insert change reason here"

This simple procedure manages to solve several problems: you get accountability for changes, the ability to roll back changes, and the ability to review differences between versions.
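Here is what those three abilities look like in practice, sketched in a throwaway repository (the paths, file contents, and commit messages are all made up for the demo):

```shell
#!/bin/sh
# Illustrative demo of accountability, diff review, and rollback in a
# throwaway repo; nothing here touches a real puppet tree.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.email demo@example.com
git config user.name demo
echo "ServerName one" > httpd.conf
git add httpd.conf
git commit -q -m "initial import"
echo "ServerName two" > httpd.conf
git commit -q -a -m "change server name"
# Accountability: who changed what, and why
git log --oneline
# Review the differences between the two versions
git diff HEAD~1 HEAD -- httpd.conf
# Rollback: restore the file as it was in the previous commit
git checkout HEAD~1 -- httpd.conf
cat httpd.conf
```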

One very attractive feature of git (as compared to svn or cvs) is that one can create a git repo anywhere, and one needs no backing remote repository or infrastructure to do so; the overhead of running git is so minimal that there's no reason not to use it.

This workflow, though, does not scale. It's fine and dandy when you have one guy working on puppet at a time, but multiple people can easily step on each other's changes. Furthermore, although you have the ability to roll back, you're left with a very dangerous way to test changes: that is, you have to make them on the live puppet instance. You can roll your config back, but by the time you do so it might already be too late.

A simple workflow with git branches and puppet environments


It's time to move beyond the "yes, I use version control" stage into the "yes, I use version control, and I actually test changes before pushing them to production" stage.

Enter the puppet "environment" facility - and a git workflow that utilizes it. Puppet environments allow you to specify an alternate configuration location for a subset of your nodes, which provides an ideal way for us to verify our changes; instead of just tossing stuff into /etc/puppet and praying, we can create an independent directory structure for testing and couple that with a dedicated git branch. Once satisfied with the behavior in our test environment, we can then apply those changes to the production environment.

The general workflow


This workflow utilizes an authoritative git repository for the puppet config, with clones used for staging, production, and ad-hoc development. This git repository will contain multiple branches; of particular import will be a "production" branch (that will contain your honest-to-goodness production puppet configuration) and a "staging" branch (which will contain a branch designed to verify changes). Puppet will be configured to use two or more locations on the filesystem (say, /etc/puppet and /etc/puppet-staging) which will be clones from the central repository and will correspond to branches therein. All changes to puppet should be verified by testing a subset of nodes against the configuration in /etc/puppet-staging (on the "staging" branch), and once satisfied with the results they are merged into the "production" branch, and ultimately pulled into /etc/puppet.

Here's what it looks like:

/opt/puppet-git: authoritative git repository. I will refer to it by this location but in your deployment it could be anywhere (remote https, ssh, whatever). Contains at a minimum a "production" branch and a "staging" branch, but optionally may contain many additional feature branches. Filesystem permissions must be read/write by anybody who will contribute to your puppet configuration

/etc/puppet-staging: git repository that is cloned from /opt/puppet-git that always has the "staging" branch checked out. Filesystem permissions must be read/write by anybody who will push changes to staging (consider limiting this to team leads or senior SAs)

/etc/puppet: git repository that is cloned from /opt/puppet-git that always has the "production" branch checked out. Filesystem permissions must be read/write by anybody who will push changes from staging to production (again, consider limiting this to team leads or senior SAs)

The key element here is the authoritative repository (/opt/puppet-git). Changes to the production repository (in /etc/puppet) should never be made directly; rather, you will 'git pull' the production branch from the authoritative repository. The staging repository (and the "staging" branch) is where changes must occur first; when QA is satisfied, the changes from the staging branch will be merged into the production branch, the production branch will be pushed to the authoritative repository, and the production repository will pull those changes into it.
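The whole round trip can be sketched end-to-end with temp directories standing in for /opt/puppet-git, /etc/puppet-staging, and /etc/puppet. This is compressed relative to the full walkthrough below (the merge happens directly in the production clone rather than via a separate working clone), and every path is illustrative:

```shell
#!/bin/sh
# Miniature run of the three-repo flow using temp dirs (illustrative paths).
set -e
base=$(mktemp -d)

# Authoritative repository with "production" and "staging" branches:
git init -q "$base/puppet-git"
cd "$base/puppet-git"
git config user.email demo@example.com
git config user.name demo
echo "node default {}" > site.pp
git add site.pp
git commit -q -m "initial import"
git branch production
git branch staging

# Clones standing in for /etc/puppet-staging and /etc/puppet:
git clone -q "$base/puppet-git" "$base/puppet-staging"
git clone -q "$base/puppet-git" "$base/puppet"
(cd "$base/puppet-staging" && git checkout -q staging)
(cd "$base/puppet" && git checkout -q production)

# A change is made and verified on the staging branch, then pushed up:
cd "$base/puppet-staging"
git config user.email demo@example.com
git config user.name demo
echo "# verified change" >> site.pp
git commit -q -a -m "add verified change"
git push -q origin staging

# Staging is merged into production and pushed back to the central repo:
cd "$base/puppet"
git fetch -q origin
git merge -q origin/staging
git push -q origin production
grep "verified change" site.pp
```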

Why do I have three git repositories?


You might be saying to yourself: self, why do I have all of these repositories? Can't I just use the repository in /etc/puppet or /etc/puppet-staging as my authoritative repository? Why have the intermediary step?

There are a couple of reasons for this:

One, you can use filesystem permissions to prevent accidental (or intentional) modification directly to the /etc/puppet or /etc/puppet-staging directories. For example, the /etc/puppet repository may be writeable only by root, but the /etc/puppet-staging repository may be writeable by anybody in the puppet group. With this configuration anybody in the puppet group can mess with staging, but only somebody with root can promote those changes to production.
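A sketch of that permission split, using a temp directory in place of /etc and skipping the chown to root and the puppet group (which you would do on a real system):

```shell
#!/bin/sh
# Illustrative permission layout; the "puppet" group and real chown calls
# are omitted since they depend on your site.
set -e
base=$(mktemp -d)
mkdir "$base/puppet" "$base/puppet-staging"
# Production checkout: writable by its owner only (root, in real life):
chmod 755 "$base/puppet"
# Staging checkout: group-writable, with setgid so new files keep the group:
chmod 2775 "$base/puppet-staging"
ls -ld "$base/puppet" "$base/puppet-staging"
```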

Two, some git operations (e.g. merging and rebasing) require that you do some branch switcheroo voodoo, and (at least in production) we can't have our repository in an inconsistent state while we're doing so. Furthermore, git documentation recommends in general that you never 'push' changes to branches over an active checkout of the same branch; by using a central repository, we don't have to deal with this issue.

Of course, one advantage of git is its sheer flexibility. You might decide that the staging repository would make a good authoritative source for your configuration, and that's totally fine. I only present my workflow as an option that you can use; it's up to you to determine which workflow fits best in your environment.

Initial prep work


Step 0: determine your filesystem layout and configure puppet for multiple environments

RPM versions of puppet default to using /etc/puppet/puppet.conf as their configuration file. If you've already been using puppet, you likely use /etc/puppet/manifests/site.pp and /etc/puppet/modules/ as the locations of your configuration. You may continue to use this as the default location if you wish.

In addition to the "production" configuration, we must specify additional puppet environments. Modify puppet.conf to include sections by the names of each "environment" you wish to use. For example, my puppet.conf is as follows:

[main]
... snip ...
manifest = /etc/puppet/manifests/site.pp
modulepath = /etc/puppet/modules

[staging]
manifest = /etc/puppet-staging/manifests/site.pp
modulepath = /etc/puppet-staging/modules

You may configure multiple, arbitrary environments in this manner. For example, you may have per-user environments in home directories:
[jeremy]
manifest = /home/jeremy/puppet/manifests/site.pp
modulepath = /home/jeremy/puppet/modules
It is also possible to use the $environment variable itself to allow for arbitrary puppet environments. If you have a great many puppet administrators, that may be preferable to specifying a repository for each administrator individually.
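A dynamic stanza along these lines is one way to do that. Note this is a sketch: the /etc/puppet/environments layout is an assumption you'd adapt to your site, and this interpolation style belongs to the older config-file environments, not the directory environments of later puppet versions:

```ini
[main]
# One stanza covers every environment; puppet interpolates $environment
# per node. The environments/ path here is an assumption; adapt as needed.
manifest = /etc/puppet/environments/$environment/manifests/site.pp
modulepath = /etc/puppet/environments/$environment/modules
```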
Step 1: create your authoritative repository

If you're coming from the trivial usage of git that I outlined at the start of this post, you already have a repository in /etc/puppet.

If you're already using git for your puppet directory, just do the following:

cp -rp /etc/puppet /opt/puppet-git
If you aren't already using git, that's no problem; do the cp just the same, and then:
cd /opt/puppet-git; git init .; git add *; git commit -a -m "initial import"
Step 2: set up branches in your authoritative repository

Now we have our new central repository, but we need to add the branches we need:
cd /opt/puppet-git; git branch production; git branch staging
Do note that I didn't "check out" either of those branches here; I'm just leaving puppet-git on "master" (which in truth we'll never use). NB: you might consider making this a "bare" repository, as it's never meant to be modified directly.
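If you do go the bare route, it's a one-liner; a sketch with temp paths standing in for /opt (a bare repo carries all the branches but has no working tree for anybody to tamper with):

```shell
#!/bin/sh
# Sketch of creating a bare authoritative repo; temp paths stand in for
# /opt/puppet-git (illustrative).
set -e
base=$(mktemp -d)
git init -q "$base/work"
cd "$base/work"
git config user.email demo@example.com
git config user.name demo
echo "node default {}" > site.pp
git add site.pp
git commit -q -m "initial import"
git branch production
git branch staging
# The bare clone has HEAD, refs, objects... but no checkout of site.pp:
git clone -q --bare "$base/work" "$base/puppet-git"
ls "$base/puppet-git"
```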
Step 3: set up your "staging" git repository

As configured in step 0, we have an environment where we can test changes in puppet, but right now there's no configuration there (in such a case, nodes in the "staging" environment will use the default puppet configuration). We need to populate this environment with our existing configuration; let's create a clone of our git repo:
git clone /opt/puppet-git /etc/puppet-staging
We now have our copy of the repository, including both of its branches. Let's switch to the "staging" branch:
cd /etc/puppet-staging; git checkout staging

Step 4: set up your "production" git repository

This is essentially the same as step 3, with one twist - we already have something in /etc/puppet. While it's possible to turn /etc/puppet into a git repository with the proper remote relationship to our authoritative repository, I find it's easiest to just mv it out of the way and do a new checkout. Be sure to stop your puppet master while you do this!
service puppetmaster stop; mv /etc/puppet /etc/puppet.orig; git clone /opt/puppet-git /etc/puppet;

cd /etc/puppet; git checkout production; service puppetmaster start

Workflow walkthrough


In this configuration, it is assumed that all changes must pass through the "staging" branch and be tested in the "staging" puppet environment. People must never directly edit files in /etc/puppet, or they will cause merge headaches. They should also never do any complex git operations from within /etc/puppet; instead, these things must be done either through per-user clones or through the staging clone, and then pushed up to the authoritative repository once complete.

This may sound confusing, but hopefully the step-by-step will make it clear.

Step 0: set up your own (user) git repository and branch

While optional for trivial changes, this step is highly recommended for complex changes, and almost required if you have multiple puppet administrators working at the same time. This gives you a clone of the puppet repository on which you are free to work without impacting anybody else.

First, create your own clone of the /opt/puppet-git repository:

git clone /opt/puppet-git /home/jeremy/puppet
Next, create and switch to your own branch:
cd ~/puppet/; git checkout -b jeremy
In the sample puppet.conf lines above, I've already enabled an environment that refers to this directory, so we can start testing nodes against our changes by setting their environments to "jeremy".
Step 1: update your local repository

This is not needed after a fresh clone, but it's a good idea to frequently track changes on your local puppet configuration to ensure a clean merge later on. To apply changes from the staging branch to your own branch, simply do the following:
cd ~/puppet/; git checkout staging; git pull; git checkout jeremy; git rebase staging
This will ensure that all changes made to the "staging" branch in the authoritative repository are reflected in your local repository.
NB: I like to use "rebase" on local-only branches, but you may prefer "merge." In any case that I mention "rebase", "merge" should work too
EDIT: I originally had a 'git pull --all' in this example; use 'git fetch --all' instead
Step 2: make changes in your working copy and test

Now, you can make changes locally to your heart's content, being sure to use 'git commit' and 'git add' frequently (since this is a local branch, none of these changes will impact anybody else). Once you have made changes and you're ready to test, you can selectively point some of your development systems at your own puppet configuration; the easiest way to do so is to simply set the "environment" variable to (e.g.) "jeremy" in the node classifier. If you aren't using a node classifier, first, shame on you! Second, you can also make this change in puppet.conf.
Step 3: ensure (again) that you're up to date, and commit your changes

If you're happy with the changes you've made, commit them to your own local branch:

cd ~/puppet; git add *; git commit -a -m "my local branch is ready"

Then, just like before, we want to make sure that we're current before we try to merge our changes to staging:

cd ~/puppet; git fetch --all; git checkout staging; git pull; git checkout jeremy; git rebase staging

BIG FRIGGIN WARNING: This is where stuff might go wrong if other people were changing the same things you changed, and committed them to staging before you. You need to be very certain that this all succeeds at this point.
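If that worry comes true, remember that a stopped rebase can always be abandoned with nothing lost. A throwaway-repo sketch of a conflict and the escape hatch (branch names mirror the walkthrough; all content is illustrative):

```shell
#!/bin/sh
# Demonstrate that a conflicted rebase can be aborted cleanly, leaving the
# branch exactly as it was (throwaway repo).
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.email demo@example.com
git config user.name demo
echo "base" > site.pp
git add site.pp
git commit -q -m "initial"
git branch staging
git checkout -q -b jeremy
echo "my change" > site.pp
git commit -q -a -m "jeremy change"
git checkout -q staging
echo "their change" > site.pp
git commit -q -a -m "staging change"
git checkout -q jeremy
# Both branches touched site.pp, so this rebase stops on a conflict;
# --abort puts the jeremy branch back exactly as it was:
git rebase staging || git rebase --abort
grep "my change" site.pp
```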

Step 4: merge your branch into "staging"

Now that you have a valid puppet configuration in your local repository, you must apply this configuration to staging. In an environment with well defined processes, this step may require authorization by a project manager or team lead who will be the "gatekeeper" to the staging environment. At the very least, let your team members know that staging is frozen while you test your changes. Changes to staging (and by extension production) must be done serially. The last thing you want is multiple people making multiple changes at the same time.

When you're ready, first merge your branch to the staging branch in your local repository:

cd ~/puppet; git checkout staging; git merge jeremy
Assuming that the merge is clean, push it back up to the central repository:
git push

Step 5: update puppet's staging configuration
Now the central repository has been updated with our latest changes and we're ready to test on all "staging" nodes. On the puppet server, go to your staging directory and update from the authoritative repo:
cd /etc/puppet-staging; git pull

Step 6: test changes on staging puppet clients

If you live in the world of cheap virtual machines, free clones, and a fully staffed IT department, you'll have a beautiful staging environment where your puppet configuration can be fully validated by your QA team. In the real world, you'll probably add one or two nodes to the "staging" environment and, if they don't explode within an hour or two, it's time to saddle up.

If you have to make a minor change at this point, you may directly edit the files in /etc/puppet-staging, commit them with 'git commit -a', and then perform a 'git push' to put them in the authoritative repo; if you have a rigid change control procedure in place, you may need to roll back staging and go all the way back to step 1.

As a general rule: try to keep staging as close to production as possible and if possible only test one change in staging at a time. Don't let multiple changes pile up in staging; push to staging only when you're really ready. If a lot of people are waiting to make changes, they should confine them to their own branches until such a time as staging has "caught back up" to production.

Step 7: apply changes to production

Once staging has been verified, you need to merge into production. Again, this step may require authorization from a project manager or team lead who will sign off on the final changes.

Although it is possible to do this directly from the git repo in /etc/puppet-staging, I recommend that you use your own clone so as to leave staging in a consistent state throughout the process.

Start by again updating from the authoritative repo:

cd ~/puppet; git fetch --all; git checkout staging; git pull; git checkout production; git pull
At this point you should manually verify that there aren't any surprises lurking in staging that you don't expect to see:
git diff staging
If everything looks good, apply your changes and send them up to the repo:
git merge staging && git push

Step 8: final pull to production

If you're dead certain that nobody will ever monkey around directly in /etc/puppet, you can just pull down the changes and you're done:
cd /etc/puppet; git pull
That's fine and dandy if everybody follows process, but it may cause trouble if anybody has mucked around directly in /etc/puppet. To be sure that nothing unexpected is going on, you may want to use a different procedure to verify the diff first:
cd /etc/puppet; git fetch; git diff origin/production
Once satisfied, apply the changes to the live configuration:
git merge origin/production

That's great, but what's the catch?


Oh, there's always a catch.

I think the process outlined here is sound, but when you add the human element there will be problems. Somebody will eventually edit something in /etc/puppet and un-sync the repos. Somebody will check something crazy into staging when you're getting ready to push your change to production. Somebody will rebase when they should've merged and merge conflicts will blow up in your face. A million other tiny problems lurk around the corner.

The number one thing to remember with git: stay calm, because you can always go back in time. Even if something goes to hell, every step of the way has a checkpoint.

If you follow the procedure (and I don't mean "this" procedure; I just mean "a" procedure) your chance of pain is greatly reduced. Git is version control without training wheels; it will let you shoot yourself in the foot. It's up to you to follow the procedure, because git will not save you from yourself.

Comments

Mehdi @ Fri Nov 04 10:25:49 -0400 2011

Very good article.
I have an error with the last command:
git merge
what to merge?
Thanks.

Jeremy @ Wed Nov 23 10:31:09 -0500 2011

Mehdi:

Sorry, I made an error there. You have to specify the branch; specifically it should be:

git merge origin/production

I will update the original article.

Jon @ Thu Mar 01 16:04:52 -0500 2012

Jeremy,

Thanks for your excellent work documenting this process. It is exactly what I was looking for. One thing I am unsure about is file ownership. I guess everything should be owned by puppet so the puppetmaster can run it? Or do you just make sure that the group is set so puppetmaster can run it and the user can be different on each branch? Working on understanding git a little better...

Thanks.

Jeremy @ Thu Apr 12 10:46:09 -0400 2012

Jon,

You are correct, file permissions need to allow whatever user the puppetmaster runs as to read all of the files in all of its environments.

There are several approaches to doing this; probably the easiest is to make sure your master and your users are all in a shared group, and that all of your checkouts are readable by that group.

It might be desirable to do this outside of your home directory so as to avoid having to tinker with permissions there, and simply set the setgid bit on that location to a group in which the puppetmaster belongs. So, instead of $username/puppet-git, try /puppet-git/$username, and chown :puppetmaster /puppet-git; chmod g+ws /puppet-git before doing any checkouts.

Keep in mind that if you're using puppet's dumb fileserver mode, permissions will reflect the actual filesystem permissions on the master by default, which is probably not what you want. You will need to explicitly state file ownership within the DSL (which is probably a good idea, anyway).

alan @ Tue Jul 24 17:00:21 -0400 2012

Hi Jeremy,

Based on this method, wouldn't the puppet configuration files within manifests also be pushed into the branches? For example, if I'm a developer and I pull from the main branch, I will have a local copy of /etc/puppet. I then have to change the configuration to work with my own agent nodes and environments for testing. After I've tested, I'm pushing my changes back into the staging branch, ALONG with my own configuration files. But that's not desirable, because then soon we'll be dealing with multiple merge conflicts as the number of contributors grows.

So, I am assuming that you mean we are only pushing the changes we've made to the modules, correct?

Jeremy @ Mon Jul 30 12:25:56 -0400 2012

@Alan:

You are correct - in this example it is assumed there is Only One puppet.conf which is expected to work unmodified on all of your production and staging puppet masters. If you need a different puppet.conf file for your testing environments, this will become a problem very quickly! You have many options! Here are a few ideas:

1) Put your git-managed puppet manifests and modules somewhere that's not under /etc/puppet (e.g.: /etc/puppet-environments/production) and manage the stuff that really has to go in /etc/puppet in a different git repo (this seems good to me!)

2) You could just .gitignore puppet.conf and always manage it locally (though, most likely you do want it in git!)

3) Rather than check in an actual puppet.conf file, you could check in a symlink to a puppet.conf in a different location, and maintain that file in a different git repository.

4) Hope that you remember to never commit the environment-specific changes to puppet.conf! (that's a great idea until somebody forgets...)

Pere Hospital @ Wed Sep 19 05:45:20 -0400 2012

Hi there,

Great post.

One small (I guess) issue. For the workflow walkthrough step 1, when I run git pull --all I get:

Fetching origin
You asked to pull from the remote '--all', but did not specify
a branch. Because this is not the default configured remote
for your current branch, you must specify a branch on the command line.

Haven't seen any one commenting it so wondering what I am doing wrong.

Thx!

Pere

Jeremy @ Wed Jan 09 15:06:25 -0500 2013

Pere:

You'll get that error if you're currently working in a branch that doesn't exist in the remote repository; which, if you're following this tutorial is what you're doing! It can be safely ignored, but it's also pointless to do it in the first place.

I've since edited the post to reflect that change. Thanks for the pointer!
