Last year’s Gartner Hype Cycle for DevOps placed DevOps Toolchain Orchestration as moving from the Peak of Inflated Expectations into the Trough of Disillusionment. In other words, the market is moving quickly toward actual productivity and scalability.
We understand that we need to build a comprehensive strategy to address DevOps at scale, where continuous integration and continuous delivery (CI/CD) is core to effectiveness.
In this model, everyone gets what they want: security and control for operators, and freedom and speed for developers. As we move toward the DevOps Slope of Enlightenment and Plateau of Productivity, we have to choose among a few approaches to DevOps and make some critical decisions to ensure effectiveness.
Below are some foundational questions to consider, along with recommendations based on our experiences with customers and the current state of DevOps.
SD Times: Should I approach CI and CD separately?
Rob Zuber, CTO of CircleCI: When it comes to solving CI and CD challenges, they’re tightly coupled and should be approached together. The ability to leverage a tool that understands both is valuable, especially as it relates to change validation. If all you understand is the current state of something versus how you got to that state, it’s much more difficult to get useful feedback back into the process, or to respond effectively when something goes wrong.
The tooling that allows us to do that, such as CI/CD with comprehensive test coverage, gives us the confidence to move quickly, because we know we will not deploy anything to a production environment until it’s been tested and validated. By thoroughly testing code before it ever reaches production, we’re able to keep the benefits of lightweight planning cycles and shorter feedback loops with real-time user feedback, all with higher confidence in our code and reduced risk. This has been a good thing.
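As a rough sketch of that gate, the flow looks something like the following; the stage commands and the deploy script are illustrative placeholders, not CircleCI’s API:

```python
import subprocess
import sys

def run_stage(name: str, command: list[str]) -> bool:
    """Run one pipeline stage; a non-zero exit code fails the stage."""
    print(f"--- {name} ---")
    return subprocess.run(command).returncode == 0

def pipeline() -> int:
    # Validation stages run on every change; the commands are illustrative.
    stages = [
        ("lint", ["python", "-m", "flake8", "."]),
        ("unit tests", ["python", "-m", "pytest", "tests/"]),
    ]
    for name, command in stages:
        if not run_stage(name, command):
            print(f"{name} failed: this change never reaches production.")
            return 1
    # Only a change that passed every validation stage is handed to deploy.
    return 0 if run_stage("deploy", ["./deploy.sh", "production"]) else 1

if __name__ == "__main__":
    sys.exit(pipeline())
```

The point is the ordering: the deploy step is unreachable until every validation stage has gone green.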
Maya Ber Lerner, CTO of Quali: This also holds true from an automation perspective. The most important thing in automation is reusability, which led the Quali team to approach CI and CD together from an infrastructure perspective. When you think about all of the effort that goes into automating processes for testing or development, for example, why shouldn’t we use the same automation for production?
Should I go for a one-size-fits-all solution, or build a solution myself using open source?
Ber Lerner: From my experience, you still need someone to own the overall platform. So it’s really about finding the tools and components that give you the capabilities you need across the value stream. It’s more about finding these layers and deciding: How are you going to do CI/CD throughout the value stream? How are you going to do secret management throughout the value stream? How are you going to do artifact management throughout the value stream? Then it becomes more horizontal rather than chaining tools together.
Zuber: There’s an interesting balance in the freedom of choice versus the consistency of standardization. I’m an engineer by background, so I love to tinker and deeply understand how things work. I’m also an engineering leader and at the end of the day I have to think about what delivers the most value to my customers. Being able to use a small set of tools, or have someone manage those tools for me in a way that’s going to enable me to do what is core to my business, is always what I’m striving for.
How should I approach application deployment and infrastructure provisioning throughout CI/CD?
Ber Lerner: For many companies, it’s no longer the case that applications are just artifacts moving downstream in the CI/CD pipeline. Today, we usually know where these artifacts are going to be deployed; each has a place. That’s different from 10 years ago, when we needed to figure out all the different permutations where our artifacts could be installed.
Now, if it’s going into our production environment, we understand that the production environment is our business. And the production environment is not just the artifacts; it’s also the infrastructure, which needs to be handled in the same way. That makes it easier to look at these bundles or packages of applications hosted on infrastructure, together with the data they’re going to need, and to look at that entity and know who is in charge of it.
We view the production environment as a whole, and we can track changes to it. It’s not true for everyone, but in general this is where we see things going, with Infrastructure as Code and immutable infrastructure making it more feasible.
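One way to picture treating the environment as a whole is to model it as a single versioned bundle of artifacts, infrastructure, and data with a named owner, and to fingerprint the bundle so any change is detectable. A minimal sketch, with illustrative field names and values rather than any vendor’s schema:

```python
from dataclasses import dataclass
import hashlib
import json

@dataclass(frozen=True)
class Environment:
    """A production environment treated as one trackable entity."""
    owner: str                       # who is in charge of this entity
    artifacts: tuple[str, ...]       # pinned application images
    infrastructure: tuple[str, ...]  # pinned IaC module versions
    data_sources: tuple[str, ...]    # data the applications need

    def fingerprint(self) -> str:
        """Hash the whole bundle so a change to any part is detectable."""
        payload = json.dumps(
            [self.artifacts, self.infrastructure, self.data_sources],
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()

prod = Environment(
    owner="payments-team",
    artifacts=("registry.example.com/api@sha256:ab12cd34",),
    infrastructure=("vpc-module v3.2.1", "k8s-cluster v1.29"),
    data_sources=("orders-db snapshot 2024-01-15",),
)
print(prod.fingerprint())  # changes whenever any part of the bundle changes
```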
Zuber: That point about immutability is important. Despite all of the chaos around vendors in the evolution of containerization, the approach really was a game-changer in how we think about operating environments. Now I know exactly the environment in which a piece of code is going to operate, and that’s the environment I run in production. Having the same libraries installed in the same locations on the same versions is a very big change and a huge improvement in the software delivery workflow.
Validating the entire container image for your application via CI and then deploying that on top of infrastructure that has also been through a similar validation cycle minimizes any sources of unexpected change in your production environment.
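A minimal sketch of that idea: the CI job records the exact digest of each image that passed validation, and the deploy step refuses anything else. The record file, function names, and digests here are hypothetical:

```python
import json
import pathlib

VALIDATED = pathlib.Path("validated-images.json")  # written by the CI job

def record_validated_image(image_ref: str, digest: str) -> None:
    """CI calls this after the image passes the full test suite."""
    entries = json.loads(VALIDATED.read_text()) if VALIDATED.exists() else {}
    entries[image_ref] = digest
    VALIDATED.write_text(json.dumps(entries, indent=2))

def assert_deployable(image_ref: str, digest: str) -> None:
    """Deploy step: the digest must match what CI validated, byte for byte."""
    entries = json.loads(VALIDATED.read_text()) if VALIDATED.exists() else {}
    if entries.get(image_ref) != digest:
        raise RuntimeError(f"{image_ref} was not validated by CI; refusing to deploy")

record_validated_image("api", "sha256:1f2e3d4c")  # illustrative digest
assert_deployable("api", "sha256:1f2e3d4c")       # same immutable image: allowed
```

Because the image is immutable, a matching digest means production runs exactly the bits that were tested.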
How do we include security and quality in DevOps?
Ber Lerner: From an automation perspective, one of the things security and testing have in common is their level of complexity. Everyone looks at the production environment and says they need it earlier in the process. When you’re trying to break silos, you’re trying to include security and quality teams that may not have the same coding abilities as others in the process, and it’s easy to forget about their agendas.
Allowing everyone to be included in the process and have access to some of these capabilities, even if they are not infrastructure as code magicians, or don’t really understand how some of it works, is essential to streamlining security and quality control.
Zuber: Like most other areas of software validation, there is great value in moving security testing earlier in our pipelines. Ideally, we design with security in mind, so it starts with the developer. Then we can use automation in the delivery pipeline like vulnerability scanning, static analysis, dynamic analysis, and fuzzing to catch issues early. The cost is always lower if you can identify and fix those issues earlier in the process.
One interesting facet of security, though, is that with all the third-party dependencies included in software deployments these days, it’s quite possible for a vulnerability to be discovered in a library you are using even though you’re not making changes to your own software. So it’s important to have a comprehensive scanning or tracking program to catch these, and the automation to quickly update and redeploy with the necessary fixes.
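One way to automate that tracking is to check every pinned dependency against a public vulnerability database on each pipeline run. Here is a minimal sketch against the OSV API (api.osv.dev); the pinned packages are illustrative, and in practice the pins would come from your real lockfile:

```python
import json
import urllib.request

def known_vulns(name: str, version: str, ecosystem: str = "PyPI") -> list[str]:
    """Ask the OSV database for advisories affecting one pinned dependency."""
    body = json.dumps(
        {"version": version, "package": {"name": name, "ecosystem": ecosystem}}
    ).encode()
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return [v["id"] for v in json.loads(resp.read()).get("vulns", [])]

# Illustrative pins; a real pipeline would iterate over the lockfile
# and fail the build (or open a ticket) when advisories come back.
for name, version in [("jinja2", "2.4.1"), ("requests", "2.31.0")]:
    ids = known_vulns(name, version)
    print(f"{name}=={version}:", ", ".join(ids) if ids else "no known advisories")
```

Run on a schedule as well as on every change, a check like this catches the case Zuber describes: a vulnerability disclosed in a library you depend on while your own code sits untouched.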
What are the right measurable goals for this process?
Zuber: To me, it ties back to confidence. I need reasonable confidence in what I’m shipping, and the added confidence that if I missed something I can recover quickly. That’s a huge part of what we’ve achieved with the DevOps mentality and CI/CD in particular.
There has been a lot of focus on the “Accelerate” metrics lately: lead time, deployment frequency, mean time to recovery (MTTR), and change fail percentage. And they make a lot of sense. Much of what these measure maps to what we see in CI/CD every day:
- Lead time for changes → workflow duration
- Deployment frequency → how often you kick off a workflow
- MTTR → the time it takes to get from red to green
- Change fail percentage → workflow failure rate
Optimizing these four key metrics delivers tremendous advantages and measurably improves a team’s performance.
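To make that mapping concrete, here is a minimal sketch that derives the four metrics from a history of workflow runs; the WorkflowRun shape is a hypothetical stand-in for whatever your CI tool actually reports:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean

@dataclass
class WorkflowRun:
    started: datetime
    finished: datetime
    succeeded: bool

def dora_metrics(runs: list[WorkflowRun]) -> dict[str, float]:
    """Derive the four 'Accelerate' metrics from CI workflow history."""
    runs = sorted(runs, key=lambda r: r.started)
    days = max((runs[-1].started - runs[0].started).days, 1)

    # MTTR: time from a failing run (red) until the next passing one (green).
    recoveries, failed_at = [], None
    for r in runs:
        if not r.succeeded and failed_at is None:
            failed_at = r.finished
        elif r.succeeded and failed_at is not None:
            recoveries.append((r.finished - failed_at).total_seconds())
            failed_at = None

    return {
        "lead_time_s": mean((r.finished - r.started).total_seconds() for r in runs),
        "deploys_per_day": len(runs) / days,
        "mttr_s": mean(recoveries) if recoveries else 0.0,
        "change_fail_rate": sum(not r.succeeded for r in runs) / len(runs),
    }

now = datetime(2024, 1, 1)
print(dora_metrics([
    WorkflowRun(now, now + timedelta(minutes=12), True),
    WorkflowRun(now + timedelta(hours=4), now + timedelta(hours=4, minutes=15), False),
    WorkflowRun(now + timedelta(hours=6), now + timedelta(hours=6, minutes=11), True),
]))
```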
Ber Lerner: DevOps is largely about balancing speed and risk. In the early years it was all about releasing fast, so speed was the first thing you measured, for example deployment frequency. But as DevOps matures, you need to make sure you aren’t creating extra risk, so you start looking at operational measures like mean time to recovery and change failure rate.
One of the challenges we see with many enterprises on this journey is measuring the effectiveness of their DevOps strategy, including resource savings, mean lead time for changes, and application quality. That involves creating a baseline and tracking progress after the platform has been rolled out.
Something that interests a lot of our clients is how much the infrastructure costs across the value stream, and whether it’s possible to optimize it. So it’s not just about being very fast; it’s also about doing things in a way that is very secure and very cost-effective and, at the end of the day, makes you more competitive.