Andrew Gaffney
Email: andrew@agaffney.org
IRC: agaffney (on Freenode and OFTC)

Andrew M. Gaffney
St. Charles, MO 63304
andrew@agaffney.org

SKILLS

Operating Systems: Linux (Slackware, Gentoo, Ubuntu/Debian, RHEL/CentOS/Scientific, CoreOS), FreeBSD

Programming Languages: Perl, C, C++/Visual C++, Java, BASIC/Visual Basic, Python, PHP, Shell/Bash, Ruby, Go, Lua

Monitoring, Metrics, and Security: Nagios/Icinga 1.x, Sensu, Cacti, Zabbix, OSSEC, Graphite

Cloud services: Amazon AWS (EC2, VPC, S3, RDS, ElastiCache, IAM, KMS, DynamoDB), Rackspace

Databases: MySQL (with a focus on master-slave and multi-master replication models), PostgreSQL, Redis, MongoDB

Virtualization, Emulation, and Containerization: Xen, KVM, VMware ESXi, VirtualBox, Wine, OpenStack (Havana, Icehouse, Ocata), Docker

EXPERIENCE

Senior Infrastructure Architect, Perspica, Inc. (July 2017 to present)

  • Revamped existing image build automation using Packer to properly support AMI and OVF targets
  • Designed and implemented automation using Ansible to create an "on-premise" application cluster in AWS that uses centralized management and monitoring
  • Revamped existing Puppet master, Sensu, Graphite, and other infrastructure services

Contract DevOps Engineer, BitPusher, LLC (February 2017 to August 2017)

  • Designed and implemented customized OpenStack Ocata deployment using openstack-ansible project and custom playbooks for host network configuration and post-build OpenStack configuration
  • Designed and implemented single git repo Terraform setup to manage multiple AWS accounts, multiple environments/stacks, and multiple pieces of stacks, utilizing S3/DynamoDB remote Terraform state

Senior Operations Architect, Turn, Inc. (August 2014 to present)

  • Designed and implemented multi-datacenter Consul cluster with 3500+ nodes
  • Created auto-remediation system with Slack integration using StackStorm
  • Designed and implemented Docker-based replacement for BSD jail-based SFTP system, using Jenkins/Ansible for deployment
  • Deployed Ansible across management nodes with custom dynamic inventory script that queries Foreman
  • Created review/deploy pipeline for DNS changes using Jenkins and Ansible
  • Designed and built OpenStack clusters using Ceph storage backend
  • Created HA private Docker registry system with access control
  • Created automated build system for Docker images using Jenkins
  • Designed and built HA Mesos/Marathon cluster with Consul and automatic HAProxy configuration
  • Migrated from Nagios to Icinga and scaled from single instances to master/slave architecture
  • Revamped internal DNS infrastructure, including git workflows and change review
  • Implemented StackStorm with custom hybrid auth and monitoring auto-remediation
  • Rebuilt existing infrastructure with automated deployment, backups, and proper monitoring/metrics

Principal Systems Engineer, Box (March 2011 to August 2014)

  • Primary responsibility was fixing the things I considered "broken" in our environment (supervisor's words)
  • Rearchitected various services for automated deployment and HA
  • Played a major role in taking an "organic" environment to a highly scalable enterprise environment
  • Rebuilt existing Nagios system to integrate tightly with Puppet for automatic monitoring configuration
  • Collaborated closely with engineers to track down various issues causing segfaults and performance degradations in our environment
  • Redesigned DNS system to use a series of masters and slaves with distribution via git
  • Scaled a single-master Puppet system to 20 masters across multiple environments using nginx/unicorn
  • Instituted change management process using git and Puppet, added validation and automated problem reporting
  • Scaled OpenLDAP across multiple datacenters with two-way replication (limited) using syncrepl
  • Led a large-scale migration from CentOS 5.x to Scientific Linux 6.2
  • Created patch/package management procedures and distributed yum repositories across DCs with change reporting
  • Architected and implemented automated server provisioning using kickstart/cobbler and Puppet
  • Extended Puppet for non-standard functions using custom facts, parser functions, and providers
  • Rearchitected internal git system to be distributed and fully redundant using gitosis
  • Designed and implemented a "bastion" server (SSH and squid) to tightly control access to production environment based on LDAP role group membership
  • Created system to allow external servers (in EC2 and small remote POPs) to be managed securely by internal infrastructure (Puppet, Splunk, OpenTSDB, Nagios, etc.)

Senior System Administrator, Announce Media (May 2010 to March 2011)

  • Primary responsibility was general system administration with a focus on MySQL database administration
  • Implemented Puppet to manage all of our existing servers and automate the build/configuration of new servers. All server types were fully defined in Puppet, so that a newly built server could be up and ready for production within 30 minutes of OS installation
  • Implemented LDAP authentication against an existing Active Directory setup
  • Implemented an automatic sync between AD and OpenLDAP and transitioned to OpenLDAP for auth
  • Managed MySQL replication topology, backup, and administration
  • Created an automatic LDAP to MySQL password synchronization process
  • Implemented Nagios and integrated with Puppet for automatic monitoring of servers known to Puppet
  • Built and maintained multiple Hadoop clusters configured for MapReduce and HBase

Support Engineer II, Announce Media (August 2009 to May 2010)

  • Tracked down and fixed bugs discovered in production code

Senior Linux System Administrator, Broadstripe (December 2008 to August 2009)

  • Implemented Puppet for configuration management
  • Implemented AD authentication on Linux for centralized authentication and access control
  • Created DNS cluster for ISP customers and hosting of company-owned domains
  • Eliminated wasteful server usage by consolidating functionality and using virtualization
  • Set up Nagios and Cacti to monitor servers and services
  • Used Perl, Python, and shell scripting to automate various system administration tasks
  • Maintained an existing Active Directory setup, including performing multiple failed server recoveries and massive cleanup of existing infrastructure
  • Designed, implemented, tested, and debugged customer-facing websites with transactional abilities and scripts to process data

Senior System and Network Administrator, Creative Communications (February 2005 to October 2008)

  • Designed, implemented, tested, and debugged web-based applications for product ordering/payment, internal inventory management (purchase orders, invoicing, etc.), and internal accounting
  • Led migration from hardware terminals to Linux thin clients utilizing a custom telnet-to-LAT gateway to interact with existing MicroVAX systems
  • Responsible for Linux server administration including hardware setup, OS installation, server setup/configuration, firewall/router/gateway design/implementation, software upgrades, security updates, and user management

Senior Systems Administrator, Skyline Aeronautics, LLC (October 2001 to August 2005)

  • Responsible for server administration including hardware setup, OS installation, server setup/configuration, firewall/network design/implementation including DHCP and DNS (BIND and dnsmasq), software upgrades, and security updates
  • Implemented an e-mail system consisting of SMTP (qmail), POP3, IMAP, and web-based mail for employees
  • Designed, built, and maintained a web portal for customers to schedule aircraft and instructors for flight training
  • Designed and implemented a Windows NT-style domain utilizing Samba 3.0 to take advantage of centralized network logons, roaming profiles, and policies for the publicly accessible computers

OPEN SOURCE

Ansible

  • Code contributor ("actionable" stdout callback plugin, loop item labels, "piped" SSH transfer method, "openwrt_init" module, various docs fixes)
  • Active community member (answering questions in the #ansible IRC channel on Freenode)

Gentoo Linux

Release Engineering Lead (March 2008 to 2012)

  • Coordinated the building of Gentoo release media, including stage tarballs, minimal CDs, and Live CD/DVDs
  • Implemented the autobuild system (a fresh set of install media for multiple architectures is automatically built, signed, and uploaded to the mirrors on a weekly basis)

Developer (November 2004 to March 2008)

  • Developed the Gentoo Linux Installer
  • Created numerous enhancements and bug fixes for Gentoo's catalyst and genkernel tools
  • Developed the Quickstart utility for doing automated Gentoo installations on multiple architectures (x86, x86_64, hppa, and sparc)