Martin Maisey

Time for one of my rare blog posts - which I’ll follow up with another on a related topic …

I’ve been meaning to get to grips with an IT automation tool such as Puppet, Chef, or SaltStack for ages now, but one of the perennial issues of having a “technical architect” consulting job is that I rarely have the excuse to sit down and do hands-on stuff with new technology on client time. This isn’t a complaint about my work - it’s great, and I love it - but inevitably the working day ends up being much more about talking to people, explaining things via slide decks, Visios, etc., and reviewing the “real work” that other people do. A lot of what I learn has to take place in experiments in private time, and it’s amazing how little of that there is when you’ve got a very active one-year-old…

At last, need and opportunity have converged in the last couple of months. First, I needed to rebuild the cloudy infrastructure for <www.closethedoor.org.uk> as a result of my own rookie error - I’d forgotten to put in place log rotation for Apache, and sure enough managed to fill up the root disk. I know, I know… but at least the automated Route 53-based failover to my Dreamhost hosting account worked seamlessly (thank heavens).

The manual build notes I’d taken were a start, but - as these things always end up - incomplete and prone to error. Rather than hack it again, I decided to do a proper job this time around. I started learning Puppet (it’s the one we use most at work), with the aim of fully automating the server build. However, while it could clearly do the job, it just didn’t feel quite right to me. The abstractions didn’t seem very intuitive or clean, which meant a not particularly enjoyable experience as a newbie trying to get things done.

It wasn’t immediately obvious how to structure modules well, or reuse other people’s in a good way - there is documentation on this, but it’s somewhat obscure to a beginner and doesn’t point you to obviously useful third-party tools such as librarian-puppet. It felt needlessly verbose - do I really have to be explicit about dependencies by default, when 95% of the time I’d be happy with a linear flow, implicit dependencies and a deterministic running order? There are also quite a lot of up-front installation dependencies and decisions to wade through - master vs. masterless, and so on. These are quite a turnoff when you’re just getting started.

I’m sure a lot of this frustration just melts away when you’re used to it. But in my experience the best tools offer a gradual learning curve and slowly unfurl their subtlety and complexity to you over time. In contrast, Puppet seems to blast you right in the face with it. This is particularly unfortunate in the DevOps space, since there are few people (currently) who are experts in both development and operations and do IT automation as their primary job. For most users it will be a secondary tool rather than their primary one. Developers will need to fit using it into their busy functional delivery sprints, and the Ops people will be using the odd few minutes between firefighting production issues. I think the steep learning curve will ultimately limit its take-up - so I decided that before investing lots of effort, I’d check out the alternatives.

While reading, I came across an option I’d not heard of before - Ansible. It’s no doubt a great name if you’re into geek sci-fi, but if you’re not, it can sound a little unmemorable (IMHO). This is a bit of a shame - but as soon as I looked at the tutorials, I instantly clicked with it.

The documentation is fantastic, and it makes learning by example really easy. The YAML/Jinja2-based configuration syntax is very intuitive, even if you haven’t used either before (I hadn’t). The tools help out by giving you friendly error messages almost all of the time, often proffering genuinely helpful hints on how to fix syntax errors. All it needs on the target machine is working SSH and Python, and the dependencies list for Ansible itself is short enough that I could use my laptop as an ad-hoc ‘master’ to get going. It also works well in a completely masterless mode.
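To give a flavour, here’s a minimal sketch of the sort of playbook I mean - not taken from a real project; the group name, file paths and template are made up for illustration:

```yaml
# site.yml - hypothetical minimal playbook: install, configure and start
# Apache on every host in the 'webservers' inventory group.
---
- hosts: webservers
  become: yes                        # escalate to root via sudo
  tasks:
    - name: Install Apache
      apt:
        name: apache2
        state: present
        update_cache: yes

    - name: Drop in the virtual host config from a Jinja2 template
      template:
        src: templates/vhost.conf.j2   # illustrative path
        dest: /etc/apache2/sites-available/000-default.conf
      notify: restart apache           # only fires if the file changed

    - name: Ensure Apache is running and starts on boot
      service:
        name: apache2
        state: started
        enabled: yes

  handlers:
    - name: restart apache
      service:
        name: apache2
        state: restarted
```

Run it with `ansible-playbook -i hosts site.yml` from your laptop ‘master’; tasks execute top to bottom, in order, which is exactly the linear flow I was missing in Puppet.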

Batteries are included in the form of a large number of high-quality core modules (written in Python, though you can write them in anything you want), so you don’t have to go trying to piece together a load of GitHub repositories. The default ‘playbook’ layouts make a lot of sense. There’s a great example repo with a number of well-written playbooks that show exactly how to piece the modules together to do useful things at the scale of ‘create a high-availability LAMP stack’, ‘deploy Hadoop’ or ‘deploy OpenShift’.
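As an illustration of the core modules in action, my root-disk incident above could have been prevented with a single task - a hypothetical sketch, with paths assuming a Debian/Ubuntu Apache install:

```yaml
# Use the core 'copy' module to put a logrotate policy in place for
# Apache, so the access/error logs can never fill the root disk again.
- name: Rotate Apache logs daily, keep two weeks, compress old ones
  copy:
    dest: /etc/logrotate.d/apache2
    content: |
      /var/log/apache2/*.log {
          daily
          rotate 14
          compress
          delaycompress
          missingok
          notifempty
          sharedscripts
          postrotate
              /usr/sbin/apache2ctl graceful > /dev/null
          endscript
      }
```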

I should point out that Ansible didn’t have a good reuse infrastructure above the module level either. It does now (that will be the subject of my follow-up post). But I decided the built-in modules looked productive enough that I just didn’t care.

Second, having very successfully completed (bar some tidying up) the closethedoor project, I found myself doing an open source ESB evaluation for a client. I needed to create clustered HA and performance test environments for two ESBs as part of a proof of concept development. As the team size was precisely one - me - I had an excuse to get properly hands-on myself. Because the work was a) throwaway and b) very time-pressured, I could pick the most productive tool for me rather than having to conform to the client’s existing standard (Puppet, natch). Needless to say, I picked Ansible.

To cut a long story short, it’s been everything I could have hoped and more. Every time you need to do something a little more complex, exploring the docs shows that it’s been thought about and there’s an elegant solution waiting just around the corner. I have not hit a single bug with Ansible itself during this time - it feels rock solid. If you feel the need to scale to hundreds or thousands of machines, it looks like there’s the headroom there to do that (caveat: I’ve not actually tried); it’s just hidden from you until you need it. The command line tools are fully open source and completely usable for big projects. The pay-for AWX console that supports the development is sugar on the top, but will make it much more palatable for enterprises.

For me, Ansible has joined that small list of truly great tools, like Git and many of the original Unix command line family, that do the simple cases entirely right - and are based on robust and powerful enough abstractions to make the complex ones work too. There’s no ‘work-in-progress’ feel about them; there’s a good chance these tools will be the same ones we’re using in a decade.

Thank you, AnsibleWorks, for making my last few months nicer and more productive.
