In November 2015, my three-person team was given a rather high-level “make our development process better” mission, and we were struggling to figure out how to start. The CTO decided to stop by and share his vision of how we would develop software going forward. His goal was Continuous Deployment: deploying to our production servers every 30 minutes. To put this into perspective, at that point we deployed hotfixes weekly and major feature releases twice a year, and now we were supposed to push code to production every 30 minutes, every day. While we haven’t gotten to Continuous Deployment yet (it’s in the works!), we have managed to get to Continuous Integration and to deploying every day. There is no shortage of articles online about the pros and cons of Continuous Integration/Delivery/Deployment, so instead of reiterating those, I want to go over how we moved from our former, manual-testing-heavy process to Continuous Integration. There were two major things we had to do: change the process and change the culture.
We provided a straightforward path for getting code from a developer’s machine into production. In place of our former process, which required a formal review meeting with a gatekeeper, we introduced a Pre-Merge Pipeline. It started with just unit tests and linting, and with the infrastructure in place it has been easy to add more tests. The idea is that any test that can run in under five minutes can be added to the Pre-Merge. Now anyone can merge their changes to master, as long as those changes pass the tests.
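To make the five-minute rule concrete, here is a minimal sketch in Python. It is illustrative only and not the actual pipeline configuration: the `Suite` class, the function names, and the example runtimes are all hypothetical, and it assumes the budget applies to each suite individually rather than to the Pre-Merge as a whole.

```python
from dataclasses import dataclass

# Time budget from the article: tests that run in under five minutes
# are eligible for the Pre-Merge Pipeline.
PRE_MERGE_BUDGET_SECONDS = 5 * 60


@dataclass(frozen=True)
class Suite:
    """A hypothetical test suite and its typical runtime."""
    name: str
    runtime_seconds: float


def split_by_budget(suites, budget=PRE_MERGE_BUDGET_SECONDS):
    """Return (pre_merge, post_merge) lists based on each suite's runtime.

    Assumption: the budget is applied per suite, not to the total.
    """
    pre_merge = [s for s in suites if s.runtime_seconds < budget]
    post_merge = [s for s in suites if s.runtime_seconds >= budget]
    return pre_merge, post_merge


def can_merge(results):
    """A change may merge only if every Pre-Merge check passed.

    `results` maps a suite name to True (passed) or False (failed).
    An empty result set is treated as "not allowed" (an assumption).
    """
    return bool(results) and all(results.values())
```

With made-up runtimes of 90 seconds for unit tests, 20 seconds for linting, and 45 minutes for web automation, the first two land in the Pre-Merge and the last one is left for a later stage.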
Unfortunately, not all of our tests can run in under five minutes, so we needed another step. We added the (unoriginally named) Post-Merge Pipeline. This deploys our code to a server, runs our longer-running tests, and, on success, tags the revision as stable. It started out running only web automation, but since then we’ve added API tests as well as environment checks. Since a run currently takes between 45 minutes and an hour, it usually picks up more than one changeset, but that’s good enough for us right now. First thing every morning, a job grabs the most recent stable tag and saves it as the deploy tag for that day. With those jobs configured and running, there was a clear path for any developer to get their code from their machine to production. As with the Pre-Merge, it’s been much easier to introduce more tests with the infrastructure in place.
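The morning job described above can be sketched roughly as follows. This is a simplified illustration, not the real job: the `stable-` and `deploy-` tag prefixes, the function names, and the in-memory list of tags are assumptions made for the example, and a real job would read tags from the version control system instead.

```python
from datetime import date, datetime

# Hypothetical naming scheme; the article does not say how tags are named.
STABLE_PREFIX = "stable-"
DEPLOY_PREFIX = "deploy-"


def latest_stable(tags):
    """Return the name of the newest stable tag, or None if there is none.

    `tags` is an iterable of (name, created_at) pairs.
    """
    stable = [
        (created_at, name)
        for name, created_at in tags
        if name.startswith(STABLE_PREFIX)
    ]
    if not stable:
        return None
    # max() compares by creation time first, so the newest tag wins.
    return max(stable)[1]


def deploy_tag_for(day, tags):
    """Return (deploy_tag_name, source_stable_tag) for `day`, or None.

    None means no revision has been marked stable yet, so there is
    nothing safe to deploy.
    """
    source = latest_stable(tags)
    if source is None:
        return None
    return f"{DEPLOY_PREFIX}{day.isoformat()}", source


if __name__ == "__main__":
    example_tags = [
        ("stable-101", datetime(2016, 2, 1, 18, 0)),
        ("stable-102", datetime(2016, 2, 2, 7, 30)),
        ("feature-x", datetime(2016, 2, 2, 8, 0)),
    ]
    print(deploy_tag_for(date(2016, 2, 3), example_tags))
```

Tying the deploy tag to the last revision that passed the Post-Merge means a broken build later in the day never becomes the next morning's deployment.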
Introducing the new process without changing the culture and how we write code would have led to a higher bug count, as we had become reliant on our Software Test team to catch any issues during our extended manual testing periods. With those integration times eliminated and automated test coverage still incomplete, developers needed to understand the full impact of their changes, because we couldn’t rely on blanket test coverage. We took our Development team into a room and told them three things: they had the power to merge their changes, they had the responsibility to make sure those changes had been tested before they did, and they would be responsible for any bugs they created.
Although we encourage our developers to work with the test team to test their changes, we also made sure they knew that the preferred solution would always be automated tests. The only way we would be able to add more functionality to our platform while lowering the amount of manual testing required would be to add more automated tests. We got buy-in from our Product team that before changing any code that didn’t have unit tests, the unit tests would need to be added, even if that meant a significant refactoring period. Adding automated tests for your changes, whether unit tests, API tests, or something else, would be the best way to get your changes out quickly.
The most important step we took was eliminating the old processes. This was to make sure that if something happened to our pipeline, or to any of the tools that are part of it, we wouldn’t slide back into our old habits. Because of our refusal to move backwards, we’ve deployed almost every day since February 2016, even though we’ve had issues with Jenkins and some of our other tools. Our processes continue to evolve: tests are being added to the pipeline, and monitoring is being added both to the pipeline and to production. But the culture change has had the biggest impact. Our team has embraced Continuous Integration and daily deployment and has adapted its workflow to reflect that.
The Journey Ahead
Our journey towards Continuous Deployment is far from over, but we have come a long way. Back in November, I couldn’t have imagined how far we would come. We’ve revolutionized both how we develop software and how we deploy it. I learned two main lessons from this adventure. The first is that changing the culture was much more important than changing the process. Developers are problem solvers at heart and will find ways around processes that they don’t find necessary or worthwhile, so having the team’s buy-in was huge. The second lesson is not to be afraid of big problems. When we first discussed moving to Continuous Deployment, I started off trying to think of how we could get there all at once. But after breaking the problem down into smaller and smaller problems and then solving those, getting to the end goal became much less intimidating.
If you liked this article, feel free to click the heart to recommend it to others or follow me on Twitter @ScottyGolds.