Andrew Gaffney
IRC: agaffney (on Freenode and OFTC)

Andrew M. Gaffney
Broomfield, CO 80020


Operating Systems: Linux (Slackware, Gentoo, Ubuntu, Debian, Red Hat Enterprise Linux, CentOS, and Scientific Linux), Windows (3.x/9x/ME/NT/2000/XP/2003), DOS, Mac OS (8/9/10)

Programming Languages: Perl, C, C++/Visual C++, Java, BASIC/Visual Basic, Python, PHP, Bash, Ruby, Go

Software and services:

Network services: DHCP (ISC dhcpd and dnsmasq), DNS (BIND and dnsmasq), SMTP (Postfix and qmail), IMAP/POP3 (Courier and Dovecot), HTTP (Apache w/ mod_php, mod_perl, mod_ssl; lighttpd; Squid; nginx), HAProxy, Tomcat, memcache, NFS, CIFS, iSCSI, LDAP (AD and OpenLDAP), RADIUS (FreeRADIUS), SNMP, CVS, SVN, Git

Monitoring/Security: Nagios, Cacti, Zabbix, OSSEC, and StackStorm

Server management: central configuration management with Puppet and Ansible, central authentication using pam_ldap/nss_ldap/nslcd with Active Directory and OpenLDAP

Clustering: Hadoop (HDFS and MapReduce), Zookeeper, etcd, Consul, Mesos/Marathon

Continuous integration: Jenkins

Cloud services: Amazon Web Services (EC2 and S3), Rackspace

Networking: 10/100/1000 Ethernet switching, Wi-Fi (802.11a/b/g/n), VLANs (802.1Q), VPN (IPsec, PPTP, OpenVPN, Juniper SSL)

Databases: MySQL with a focus on master-slave and multi-master replication models

Virtualization/Emulation: Xen, KVM, VMware ESXi, VirtualBox, Wine, OpenStack (Havana and Icehouse), Docker

Web development: HTML 4.01, XHTML 1.0, CSS, JavaScript (jQuery), PHP frameworks (CakePHP, Zend, and CodeIgniter)


Senior Operations Architect, Turn, Inc. (August 2014 to present)

  • Designed and implemented multi-datacenter Consul cluster with 3500+ nodes
  • Designed and implemented Docker-based replacement for BSD jail-based SFTP system, using Jenkins/Ansible for deployment
  • Deployed Ansible across management nodes with custom dynamic inventory script that queries Foreman
  • Created review/deploy pipeline for DNS changes using Jenkins and Ansible
  • Designed and built OpenStack clusters using Ceph storage backend
  • Created HA private Docker registry system with access control
  • Created automated build system for Docker images using Jenkins
  • Designed and built HA Mesos/Marathon cluster with Consul and automatic HAProxy configuration
  • Migrated from Nagios to Icinga and scaled from single instances to master/slave architecture
  • Revamped internal DNS infrastructure, including git workflows and change review
  • Implemented StackStorm with custom hybrid auth and monitoring auto-remediation

Principal Systems Engineer, Box (March 2011 to August 2014)

  • Primary responsibility was fixing the things I considered "broken" in our environment (supervisor's words)
  • Rearchitected various services for automated deployment and HA
  • Played a major role in taking an "organic" environment to a highly scalable enterprise environment
  • Rebuilt existing nagios system to integrate tightly with Puppet for automatic monitoring configuration
  • Collaborated closely with engineers to track down various issues causing segfaults and performance degradations in our environment
  • Redesigned DNS system to use a series of masters and slaves with distribution via git
  • Scaled a single-master Puppet system to 20 masters across multiple environments using nginx/unicorn
  • Instituted change management process using git and Puppet, added validation and automated problem reporting
  • Scaled OpenLDAP across multiple datacenters with limited two-way replication using syncrepl
  • Led a large-scale migration from CentOS 5.x to Scientific Linux 6.2
  • Created patch/package management procedures and distributed yum repositories across DCs with change reporting
  • Architected and implemented automated server provisioning using kickstart/cobbler and Puppet
  • Extended Puppet for non-standard functions using custom facts, parser functions, and providers
  • Rearchitected internal git system to be distributed and fully redundant using gitosis
  • Designed and implemented a "bastion" server (SSH and squid) to tightly control access to production environment based on LDAP role group membership
  • Created system to allow external servers (in EC2 and small remote POPs) to be managed securely by internal infrastructure (Puppet, Splunk, OpenTSDB, nagios, etc.)

Senior System Administrator, Announce Media (May 2010 to March 2011)

  • Primary responsibility was general system administration with a focus on MySQL database administration
  • Implemented Puppet to manage all of our existing servers and automate the build/configuration of new servers. All server types were fully defined in Puppet, so that a newly built server could be up and ready for production within 30 minutes of OS installation
  • Implemented LDAP authentication against an existing Active Directory setup
  • Implemented an automatic sync between AD and OpenLDAP and transitioned to OpenLDAP for auth
  • Managed MySQL replication topology, backup, and administration
  • Created an automatic LDAP to MySQL password synchronization process
  • Implemented nagios and integrated with Puppet for automatic monitoring of servers known to Puppet
  • Built and maintained multiple Hadoop clusters configured for MapReduce and HBase

Support Engineer II, Announce Media (August 2009 to May 2010)

  • Tracked down and fixed bugs discovered in production code
  • Performed on-call duties, which involved responding to alerts from the monitoring system and resolving the underlying issues

Release Engineering Lead, Gentoo Linux (March 2008 to 2012)

  • Coordinated the building of Gentoo release media, including stage tarballs, minimal CDs, and Live CD/DVDs
  • Implemented the autobuild system (a fresh set of install media for multiple architectures is automatically built, signed, and uploaded to the mirrors on a weekly basis)

Developer, Gentoo Linux (November 2004 to March 2008)

  • Developed the Gentoo Linux Installer
  • Created numerous enhancements and bug fixes for Gentoo's catalyst and genkernel tools
  • Developed the Quickstart utility for doing automated Gentoo installations on multiple architectures (x86, x86_64, hppa, and sparc)

Senior Linux System Administrator, Broadstripe (December 2008 to August 2009)

  • General system administration on CentOS, Debian, and Ubuntu servers
  • Implemented Puppet for configuration management
  • Implemented Active Directory authentication for centralized login and access control
  • Created DNS cluster for ISP customers and hosting of company-owned domains
  • Eliminated wasteful server usage by consolidating functionality and using virtualization
  • Set up Nagios and Cacti to monitor servers and services
  • Used perl, python, and shell scripting to automate various system administration tasks
  • Maintained an existing Active Directory setup, including performing multiple recoveries of failed servers and a massive cleanup of existing infrastructure
  • Designed, implemented, tested, and debugged customer-facing websites with transactional abilities and scripts to process data

Senior System and Network Administrator, Creative Communications (February 2005 to October 2008)

  • Designed, implemented, tested, and debugged web-based applications for product ordering/payment, internal inventory management (purchase orders, invoicing, etc.), and internal accounting
  • Responsible for Linux server administration including hardware setup, OS installation, server setup/configuration, firewall/router/gateway design/implementation, software upgrades, security updates, and user management

Linux Administrator, Primary Care Computing, LLC (January 2004 to January 2005)

  • Responsible for troubleshooting and repairing hardware, network, and software/operating system related problems for x86-based machines running DOS, Windows, and Linux
  • General Linux server administration including hardware setup, OS installation, server setup/configuration, firewall/router/gateway design/implementation, software upgrades, security updates, and user management
  • Designed web-based database-driven applications

Senior Systems Administrator, Skyline Aeronautics, LLC (October 2001 to August 2005)

  • Responsible for server administration including hardware setup, OS installation, server setup/configuration, firewall/network design/implementation including DHCP and DNS (bind and dnsmasq), software upgrades, and security updates
  • Implemented an e-mail system consisting of SMTP (qmail), POP3, IMAP, and web-based mail for employees
  • Designed, built, and maintained a web portal for customers to schedule aircraft and instructors for flight training
  • Designed and implemented a Windows NT style domain utilizing Samba 3.0 to take advantage of centralized network logons, roaming profiles, and policies for the publicly accessible computers
  • Responsible for troubleshooting hardware, software, and network problems