Blog • Insights
Local Development in the Age of Containers

September 5, 2019

Managing an organization’s technical development is a long and winding road. Depending on the size and complexity of what is required over time, managing local development can go from straightforward to overwhelming overnight. However, as containers become a smart solution for technical growth, managing development becomes far more feasible and secure, and far less of a source of anxiety.

First, let’s start with a brief history of local development at Forum One. Our local development journey mirrors that of most organizations: as our tech team grew, so too did the number of possible configurations of local computers. The end result was that we needed to find better, smarter, more secure ways to manage and deploy our work locally.

Many organizations may not have a clear strategy for how they manage their local development, and to be fair, it’s easy not to have a process for local environments at all. In our case, our sites typically use PHP-based content management systems, and the platforms they require are extremely common. They are so common, in fact, that they have their own acronyms:

LAMP (Linux, Apache, MySQL, and PHP)
MAMP (ditto, but for macOS)
WAMP (ditto, but for Windows)

On the surface, these stacks offer a seductively easy solution for managing local development. The web server can be installed with a single click from a package; tools like MAMP PRO offer graphical management interfaces; and running a site on a laptop is as easy as cloning from GitHub and pointing a server to it.

But this quickly creates real problems. If there is only one system-wide installation of a web server and database, developers must manually edit the server’s configuration for each new site, restart the server, and ensure that the database is configured so that two sites don’t accidentally begin sharing data.

Worse than that is “configuration drift.” As each developer maintains their own environment, configuration and extensions can vary between two local setups, or between local setups and production servers. This drift can cause subtle problems that are hard to debug and may not even be detected until a push to dev. Configuration changes that aren’t replicated on the production servers, whether small ones like increasing the memory limit or large ones like installing a new PHP extension, can cause issues that only occur sporadically, since not all requests will require the changed resource.
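To make configuration drift concrete, here is a minimal Python sketch that compares two sets of settings and reports every mismatch. The setting names and values are hypothetical and chosen only for illustration; in practice they might be collected from each environment’s PHP configuration.

```python
from typing import Dict, Optional, Tuple

Settings = Dict[str, str]


def find_drift(
    reference: Settings, candidate: Settings
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {setting: (reference_value, candidate_value)} for every mismatch.

    A value of None means the setting is absent from that environment.
    """
    drift = {}
    for key in sorted(set(reference) | set(candidate)):
        ref_value = reference.get(key)
        cand_value = candidate.get(key)
        if ref_value != cand_value:
            drift[key] = (ref_value, cand_value)
    return drift


# Hypothetical settings, for illustration only.
production = {"php_version": "7.2", "memory_limit": "128M"}
laptop = {
    "php_version": "7.2",
    "memory_limit": "512M",            # raised locally, never replicated
    "extension.imagick": "enabled",    # installed locally only
}

for setting, (expected, actual) in find_drift(production, laptop).items():
    print(f"{setting}: production={expected!r} local={actual!r}")
```

A check like this only catches the settings someone thought to record, which is part of why drift is so hard to eliminate by hand.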

Installing virtual machines

Vagrant is a command-line tool for creating and managing virtual machines using simple configuration files. Virtual machines also carry extremely useful isolation guarantees, because each project gets its own virtual machine. Each virtual machine contains only the code and data for a particular project, with zero risk of “contaminating” other projects with old data or misapplied configurations. This eases the first of our headaches: we can now run multiple sites and servers on the same machine.

Vagrant’s configuration files enable a process called provisioning, which is essentially a fancy word for “install and configure.” Through provisioning, we accomplish two things very easily:

We are able to ship updates to the base system and simply re-provision
We have a reproducible way to recreate virtual machines in the event the system state becomes corrupted

Originally, our Vagrant provisioner of choice was Puppet. Puppet has a declarative configuration language that enabled us to roll out virtual machine configuration in a fairly predictable way. However, our server provisioning was eventually migrated to a system called SaltStack. We replaced the Puppet provisioner with a SaltStack one that mimicked the production server environment. The only major difference between the two was that our virtual machines installed a local MySQL server rather than using a hosted solution (such as AWS RDS).

Under the hood, both Puppet and SaltStack use the operating system’s package manager. This gave us access to CentOS’ RPM packages and allowed us to quickly roll out configuration to install the necessary tools for local development: PHP, MySQL, Solr, Memcached, and Node.js.
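The idea behind declarative provisioning can be sketched in a few lines of Python. This is not how Puppet or SaltStack are implemented; it is a simplified model, with illustrative package names rather than real RPM names, showing why describing a desired state makes re-provisioning repeatable.

```python
from typing import List, Set


def plan_changes(desired: Set[str], installed: Set[str]) -> List[str]:
    """Return the install actions needed to bring a machine to the desired state."""
    return [f"install {package}" for package in sorted(desired - installed)]


# Illustrative names only; real package names depend on the repository.
desired_state = {"php", "mysql", "solr", "memcached", "nodejs"}

fresh_vm: Set[str] = set()
partly_provisioned_vm = {"php", "mysql"}
fully_provisioned_vm = set(desired_state)

print(plan_changes(desired_state, fresh_vm))               # everything is missing
print(plan_changes(desired_state, partly_provisioned_vm))  # only the gaps
print(plan_changes(desired_state, fully_provisioned_vm))   # nothing left to do
```

Because the plan is computed from the difference between the desired and actual state, running it again on an already provisioned machine is a no-op.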

Running into new problems

However, as we continued to use Vagrant, a number of other issues crept up. Provisioning an entire virtual machine is extremely slow. In some cases, downloading, installing, and configuring a single system could take up to 20 minutes. This process had to be repeated by each developer for each new project. That much lost time quickly adds up.

As projects aged, we also ran into packaging problems that were extremely difficult to fix. Since we tracked the main RPM repositories, we ran into issues when older projects could no longer install outdated PHP versions. This could happen to projects that didn’t yet have an update scheduled, or had recently come back into our patching and support system. Fixing this issue is not as easy as simply changing the version, because of the number of incompatible changes introduced in PHP 7. If we provisioned a site with a new version of PHP, we had no guarantee that the site would run. Addressing that often meant updating modules, which brought on its own set of changes.

This was of particular concern for long-running projects. If a Vagrant VM hadn’t been recently provisioned by a tech lead, a stale package might be running. For the person with the existing VM, the local site would still work perfectly; however, new developers joining the project would be unable to create their own VMs.

Finally, we had some issues with SaltStack itself. Salt requires what it calls formulas (that is, packages of Salt templates) to provision. This meant that, even if a developer was familiar with the underlying package, they had to learn how to use Salt to interact with it. If we needed to customize Salt to support a package we didn’t normally use, such as Python to support Django-based sites, we quickly ran into problems where people applied hacky, one-off solutions that made each project’s setup unique.

Containers to the rescue

As we continued to grow as a tech team, it became clear that our Vagrant usage was becoming unsustainable. We needed a solution where we could use templates to support common project needs, while giving us room to support less-common configurations (e.g., Python) and grow into more sophisticated architectures (e.g., decoupled sites). Due to these requirements, we decided to standardize on containers.

Docker is a container solution based on two central pieces: images and containers. An image is a self-contained package containing the resources (executables, libraries, default configuration) needed to run a service. When we can, we use off-the-shelf images from the Docker Hub, a community repository of trusted images. A key feature of Docker, though, is that new images can be built from other images: for example, we can build a customized Node.js server by starting from the Docker Hub’s official Node image. Once an image is built, Docker uses it to create a container, which acts as a running instance of the image.


In many ways, containers act like virtual machines. They are isolated from the rest of the system and can only see the resources granted to them by the supervising system (the Docker engine or the VM hypervisor). Containers communicate with other services over network protocols, which are easier to monitor, secure, and load balance.

One crucial difference between the two concepts is that a container’s file system is temporary: it lasts only as long as the container does. This may sound like a drawback, but it has a number of benefits. Changes to a running system are effectively reverted once a container exits, which requires developers and users to track the desired configuration elsewhere, whether in code as part of the build or in an external database with which other containers can synchronize.

Navigating container challenges

But as we have seen in previous system changes, there is a downside here as well. Where we once had a single VM containing an entire project’s worth of dependencies, we now have multiple individual pieces. Starting a container project therefore involves:

Acquiring all of the images necessary for the web server, database server, search server, and cache server
Building any custom images
Starting each service with the appropriate configuration and runtime settings
Granting each service a way to communicate with others while restricting arbitrary access

Luckily, there is a tool to automate this task: Docker Compose. Docker Compose is a command-line tool to build and manage a cluster of services (“cluster” in this case is just a fancy word for “group”). With this tool, cluster management is reduced to two frequently used commands: an “up” command to start the cluster and generate any needed resources, and a “down” command to stop services and destroy containers.

Docker Compose manages its configuration in a single file. This file gives us a way to specify the cluster state, and we have found that it is easy even for newcomers to make changes to the configuration. Because Docker Compose knows it’s managing a cluster, it creates an isolated virtual network over which all services communicate. It then becomes very easy for systems to communicate in the same manner they would in a production system.

To ease local development, our Docker Compose files usually include Docker Volumes, a mechanism that creates encapsulated storage elements. Volumes are attached to containers in order to provide persistent storage. We use this mainly to preserve database state, because without it, developers would have to install a new CMS every time they started a project. Unlike VMs, however, these volumes are separate from the service itself; we can destroy the volume to revert the system to a clean slate, and the container will perform a self-initialization process. This allows us to observe systems in various states without having to wait for a time-consuming re-provisioning process.
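As a rough illustration of the kind of cluster described above, the following Python sketch builds a small Compose configuration as a dictionary and prints it as JSON. It is not our actual project template. The service names, the `DB_HOST` and `CACHE_HOST` variables, the image tags, and the port mapping are all assumptions made for the example, and it assumes a Compose release that does not require a top-level `version` key. It relies on the fact that JSON is valid YAML, so Compose can read a JSON file when it is passed with the `-f` flag.

```python
import json


def build_compose_config(database_name: str) -> dict:
    """Build a minimal, hypothetical Compose configuration for a PHP-style site."""
    return {
        "services": {
            "web": {
                "build": ".",  # custom image built from a Dockerfile in this directory
                "ports": ["8080:80"],
                "depends_on": ["db", "cache"],
                # Services reach each other by service name on the Compose network.
                "environment": {"DB_HOST": "db", "CACHE_HOST": "cache"},
            },
            "db": {
                "image": "mysql:5.7",
                "environment": {
                    "MYSQL_DATABASE": database_name,
                    "MYSQL_ALLOW_EMPTY_PASSWORD": "yes",  # local development only
                },
                # The named volume keeps database state across container restarts.
                "volumes": ["db_data:/var/lib/mysql"],
            },
            "cache": {"image": "memcached:1.6"},
        },
        "volumes": {"db_data": {}},
    }


config = build_compose_config("example_site")
print(json.dumps(config, indent=2))
```

If the output were saved as `docker-compose.json`, running `docker-compose -f docker-compose.json up` would start the cluster, and `docker-compose -f docker-compose.json down --volumes` would stop it and also remove the named volume, returning the database to a clean slate.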

Where do we go from here?

Inevitably, complexities will arise, and local development will go through further iterations. In my next post on Web Starter (coming soon!), I will delve into how we’re leveraging Web Starter to provision Vagrant VMs. In any event, given where we (and many organizations) are today in the ongoing quest to manage local development easily, efficiently, and securely, exploring and testing new ways to operate is key. Patching and hacking ‘the way it’s always been done’ can sometimes feel like the safest path, but with the solutions that containers provide, local development has already moved to a new, better level.
