X-47 Industries

Exploring tools and infrastructure for systems administration.

Sysadmins Play Defense

On my drive in this morning, a couple of recent conversations merged into one startling, succinct fact: sysadmins play defense. Given our desire for high availability and redundancy, we have to. How else are we going to save the users from themselves? But we also play defense with management, and this, I feel, is a mistake.

Let me note two particular cases: money and reputation.

A good friend recently said to me, “Jess, saving the company money is a limited proposition. There’s a bound to how much you can save.” He went on to point out that owners, CEOs, and presidents want to hear ideas that generate growth; done right, growth potential is unlimited. I’ve heard more than one gripe from fellow sysadmins about how money is being wasted, and given the cost of our equipment, it’s right to invest capital prudently. However, to make management salivate, let’s show them how our tech choices can actually grow the business and offer new capabilities.

Reputation is a precious commodity. As sysadmins we are custodians of the email and of the network. Every outage tarnishes it; minor outages may just dull the gleam, but major outages will blacken it with ash. Good reputation, meanwhile, is earned around the water cooler or over lunch, in those little bits of shared humanity that allow our colleagues to see us and to enjoy our company. A defensive attitude focuses on minimizing outages so that good reputation is never completely depleted. The alternate strategy, playing offense, is to check on colleagues and make sure they’re doing OK; ask if they have issues and find easy solutions. Likewise, engage with other departments, listen to their pain, and work on addressing their needs.

It’s easy to get bogged down in the day-to-day cares of machines. As we do, we cede initiative and let the conversation slip out of our hands. When we do talk, we come across as a curmudgeon in rags, standing at a busy city street corner, holding a sign that reads, “The end is nigh.” Let’s not be that guy. Let’s choose to be the friendly neighbor who has tools to borrow, who has advice on how to get a greener lawn, who pitches in when you need help replacing the roof. Let’s play offense, rather than defense.

A Look at Server Provisioning

At work, I’ve had the opportunity to look at Cobbler. It’s magical software that makes repository management and server deployment easy. It offers the latter by providing a webUI for managing a kickstart server. The result? Your server install experience should be: rack hardware, run cables, work with the network team for appropriate access, and then boot the machine. You’ll then be greeted with a kickstart menu to pick the appropriate OS to install. Your kickstart file will run, leaving you wherever it finishes. At some point, we’ll then want to tie the newly built system into configuration management.
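
To give a flavor of the workflow, here’s a rough sketch of seeding Cobbler from the command line; the distro name, path, MAC address, and hostname are all made up for illustration.

# Import an install tree so Cobbler can serve it for PXE/kickstart installs.
cobbler import --name=CentOS6-x86_64 --path=/mnt/centos6-iso
# Register a machine against the profile created by the import.
cobbler system add --name=web01 --profile=CentOS6-x86_64 \
    --mac=AA:BB:CC:DD:EE:FF --hostname=web01.example.com
# Regenerate the boot configs Cobbler manages.
cobbler sync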

For me, hostnames and MAC addresses aren’t automatically generated or harvested. I see these as convenient unique identifiers that, if I had them, I could use to automate OS selection and the configuration management install. Cobbler works with neither of these. With effort, kickstart could be made to interpret them (see the sketch below), but then you’re off building your own infrastructure, nurturing careful pieces of kickstart to work with an external database to appropriately build systems. Automation is our goal, and so building this kickstart-level magic would be a noble effort if ready-built systems were not already in place.
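
To make that concrete, the sort of glue I mean might be a %pre snippet that asks a made-up inventory service for a hostname keyed on the MAC address, then feeds the answer back in via %include. Everything here is illustrative, and it assumes wget is available in the installer environment.

# In the kickstart command section, pull in the file %pre generates:
%include /tmp/network.ks

%pre
# Illustrative only: look up this box's hostname from a hypothetical
# inventory service, keyed on the first NIC's MAC address.
MAC=$(cat /sys/class/net/eth0/address)
NAME=$(wget -q -O - "http://inventory.example.com/hostname?mac=${MAC}")
echo "network --bootproto=dhcp --hostname=${NAME}" > /tmp/network.ks
%end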

TheForeman is one such system. It is a webUI for provisioning and managing systems, but it does so with one big assumption: Puppet. The software is really focused on complete lifecycle management, and it does it well. If you can accept this integration, then our chain is complete: power on a box, give it some time, and then log into a fully configured system. (Though with everything installed and configured, why log in?)

Also available is Razor. This also comes with strings into Puppet; literally, as it’s a joint project from Puppet Labs and EMC. It gives your running puppetmaster a way to drive server creation. It’s all command line, and meant more as an API to automate resource provisioning. This is all well and good; I like tools that do one thing and do it well. That said, I’m going to let this lie for a little bit. Ultimately my end goal for automation is a set of buttons I can turn over to non-sysadmins so they can provision new resources without me. Quite a bit of work is needed to leverage Razor for this.

Both of these tools are from the Puppet camp. I’m ignorant of tools for other configuration management platforms.

The challenge of automated provisioning is keenly felt in the cloud arena. CloudStack breaks this problem in twain: orchestration and provisioning. My understanding comes from the slides of Alex Huang’s presentation. Orchestration seems to cover all the coordinating events that allow a machine to work (storage assignment, network assignments, etc.). Provisioning the actual machine seems to be pushed down to the hypervisor. There are accommodations for bare metal machines, so using this as the base of a deploy process is feasible. However, you’re left with your configuration management platform reaching out and using the CloudStack API.
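
For flavor, that hook from a configuration management run into CloudStack could be as small as a single call through the cloudmonkey CLI. The IDs below are placeholders, and this assumes cloudmonkey is already configured with API keys against your management server.

# Placeholder IDs; fill these from "list zones", "list templates",
# and "list serviceofferings". This maps to the deployVirtualMachine API call.
ZONE_ID=...
TEMPLATE_ID=...
OFFERING_ID=...
cloudmonkey deploy virtualmachine zoneid=$ZONE_ID \
    templateid=$TEMPLATE_ID serviceofferingid=$OFFERING_ID name=web01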

There’s more research to be done here, obviously; but I wanted to get my understanding down. As well, there doesn’t seem to be much discussion about specific tools for this need. I’m left to believe each cloud platform and each configuration management product is doing it all for itself. That’s a shame, as it doesn’t allow for interoperability or code reuse.

Build Your Own Repos

Repos are good. They elevate civilized operating systems above the neolithic systems that expect you to hunt down software and its dependencies, club them on their heads, and then drag them back to your server for the final install. Truly, we’re fortunate to be beyond such primitive practices.

But what happens if you’re behind a firewall, isolated to what you can dig out of the earth? What happens if you’ve limited bandwidth and wish to feed a large population of servers, each hungry for the latest and greatest software updates?

Rsync is the traditional tool, and it does a grand job. It performs intelligently, analyzing the remote source and the local copy and picking out only the differences; thus it is bandwidth efficient. Also because of this, it can resume if interrupted. Finally, passed the right options, it’ll cache all its updates until done, so that the repo remains consistent until everything is in place (--delay-updates).
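
As a sketch, a pull with those safety nets switched on might look like this; the upstream module and local path are stand-ins.

# -a preserves metadata, -H preserves hard links, --delete drops packages
# removed upstream, --delay-updates stages new files until the run finishes.
rsync -avH --delete --delay-updates \
    rsync://mirror.example.com/centos/6/updates/x86_64/ \
    /var/www/html/centos/6/updates/x86_64/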

Unfortunately, there are times when the remote repo you wish to mirror doesn’t offer rsync access. The yum-utils rpm package already has a solution to this: reposync. It’s a straightforward little utility that’s easy to use. It’s smart about its work, detecting when local copies already exist. But coming from yum-utils, it carries some assumptions with it. These assumptions manifest as quirks we’ll have to learn to live with.

Assumption one: it takes a repoid as its source. As such, the local system you’re operating from will need to have the repo installed. This assumption bothers me for two reasons. First, hygiene: what if you want to mirror a repo on a system that doesn’t need the software? Yes, the repo can be set to disabled, but still, I like to keep things clean; I’d much rather not install it in the first place. Second, portability: what if you’re constructing a centralized repo server in a heterogeneous environment? What if I wanted to use Ubuntu as the base OS? Yeah, I could probably mess around with alien to get the software installed, but that’s a lot of effort for an operation that should be relatively simple.

This second point is probably more my assumption than the utility’s. With a name like reposync, it seems that, given a repo, it should output a repo. It does not. It dumps to a directory, and so we’ve got a little bit of prep work to do. As such: yum install createrepo.

So what did I do with all of the above? I generated a local copy of VMware’s OS Specific Packages.

Cheat Sheet:

rpm -ivh http://packages.vmware.com/tools/esx/latest/repos/vmware-tools-repo-RHEL6-9.0.5-1.el6.x86_64.rpm
reposync -r vmware-tools-collection -p /var/www/html/vmware/
createrepo /var/www/html/vmware/vmware-tools-collection/
cp /etc/yum.repos.d/vmware-osps.repo /var/www/html/yum.repos.d/vmware-osps-local.repo
vi vmware-osps-local.repo
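
That last edit just points baseurl at your own web server. The section name, hostname, and key location below are illustrative, so check them against the file you copied.

[vmware-tools-collection-local]
name=VMware Tools OSPs (local mirror)
baseurl=http://repo.example.com/vmware/vmware-tools-collection/
enabled=1
gpgcheck=1
gpgkey=http://packages.vmware.com/tools/keys/VMWARE-PACKAGING-GPG-RSA-KEY.pub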

As a last step, create a cron job to pull in updates.
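
Something along these lines in /etc/cron.d would do, reusing the paths from the cheat sheet.

# /etc/cron.d/vmware-repo-sync: refresh the local VMware mirror nightly.
30 2 * * * root reposync -r vmware-tools-collection -p /var/www/html/vmware/ && createrepo --update /var/www/html/vmware/vmware-tools-collection/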

Can You Ever Go Home Again?

Terminal sessions are easy enough to generate if you’re in a GUI, but pretend for a moment you’re on a server; no X. You’ve SSH’ed a couple servers away from your starting point before you realize that you need to attend to something on that first server.

SSH gives you two base options. They’re accessible through an escape character; by default this is ~, but you can customize it should you find it interferes with some other key combination. The escape is only recognized at the start of a line, right after a newline.

Our first option is to suspend the ssh session entirely. Hit ~^Z. That’s two keystrokes: ‘~’ followed by ‘<ctrl>-Z’. This suspends the session; at this point, you can control it as you would any other job: fg, bg, jobs -l. Thus, when done with whatever commands needed attending, simply return to the ssh session with fg.
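
In practice it looks like any other stopped job; the job number, PID, and hostname below are made up for illustration.

$ jobs -l
[1]+  24153 Stopped                 ssh firstserver.example.com
$ fg %1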

The second option relies on the command mode of SSH. Hit ~C. You’re presented with an ssh command line prompt. From here, you can invoke commands on the local machine; simply precede the command with !. For example:

ssh> !hostname -f
startingpoint.example.com

If you can’t get your business done in a one-liner, try: !bash. You’re now in a complete subshell on the original host. Do what needs doing and when you’ve completed your business, exit as normal and you’re back inside the ssh process and at the remote host.

The premise of this exercise suggested we were a couple servers away. The first SSH session will hold on to the ~, making communicating with subsequent sessions arduous. You can hit ~~, which will cause the first session to pass a literal ~ on through to the next machine. To pass a tilde on to the third session in line, you’d have to hit ~~~~. This quickly becomes impractical. Consider using screen or tmux.

And that’s it, you can go home again; if just to visit.