Software innovation doesn’t happen without taking risks along the way. But risks can be scary for businesses afraid of making mistakes.
There is another way, according to Jon Noronha, senior vice president of product at Optimizely, a progressive delivery and experimentation platform provider. Feature experimentation, he said, allows businesses to go to market quicker while improving product quality and minimizing the fear of failure.
“I like to think of feature experimentation as a safety net. It’s something that gives people the confidence to do something bold or risky,” he said. “Imagine you are jumping on a trapeze with no net. You’re going to be really scared to take even the smallest step because if you fall, you’re going to really hurt yourself. When there is a net, you know the worst thing that can happen is you land on the net and bounce a little bit.”
Feature experimentation is the net that lets you leap but catches you if you fall, Noronha explained. It enables businesses to take small risks, roll a change out to a few users, and measure its impact before releasing it to 100% of the user base.
Christopher Condo, a principal analyst at the research firm Forrester, said, “In order to be innovative, you need to really understand what your customers want and be willing to try new experiences. Using feature experimentation allows businesses to be more Agile, more willing to put out smaller pieces of functionality, test it with users and continue to iterate and grow.”
However, there are still some steps businesses need to take before they can squeeze out the benefits of feature experimentation. They need to learn to walk before they can run.
Progressive Delivery: Walk
Progressive delivery is the walk that comes before the run (feature experimentation), according to Dave Karow, continuous delivery evangelist at Split, a feature flag, experimentation and CD solution provider. Progressive delivery assumes you have the “crawl” part already in place, which is continuous delivery and continuous integration. For instance, teams need to have a centralized source of information in place where developers can check in code and have it automatically tested for basic sanity with no human intervention, Karow explained.
Without that, you won’t see the true promise of progressive delivery, John Kodumal, CTO and co-founder of LaunchDarkly, a feature flag and toggle management company, added.
“Imagine a developer is going to work on a feature, take a copy of the source code and take a copy of their plan and work on it for some time. When they are done, they have to merge their code back into the source code that is going to go out into production,” Karow explained. “In the meantime, other developers have been making other changes. What happens is literally referred to in the community as ‘merge hell.’ You get to a point where you think you finished your work and you have to merge back in and then you discover all these conflicts. That’s the crawl stuff. It’s about making changes to the software faster and synchronizing with coworkers to find problems in near real-time.”
Once you have the crawl part situated, the progressive delivery part leverages feature flags (also known as feature toggles, bits or flippers) to get features into production faster without breaking the application. According to Optimizely’s Noronha, feature flags are one layer of the safety net that feature experimentation offers. They allow development teams to try things at lower risk and to roll out slowly, gradually exposing key functionality with the goal of catching bugs or errors before they become widespread. “It’s making it easier to roll things out faster, but be able to stop rollouts without a lot of drama,” Karow said.
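As a rough illustration of how a flag can expose a feature to a growing share of users, here is a minimal Python sketch. It is not any vendor's API: the names `bucket`, `is_enabled`, `checkout` and the flag name `new-checkout` are hypothetical, and hashing the user ID into 100 buckets is just one common way to keep each user's experience stable as the rollout percentage grows.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the user falls inside the current rollout percentage."""
    return bucket(flag_name, user_id) < rollout_percent


def checkout(user_id: str, rollout_percent: int) -> str:
    """Serve the new code path only to users inside the rollout."""
    if is_enabled("new-checkout", user_id, rollout_percent):
        return "new checkout flow"
    return "old checkout flow"
```

Because the bucket depends only on the flag name and the user ID, raising `rollout_percent` from 10 to 50 keeps the first group of users on the new path and adds more, while setting it to 0 stops the rollout without a redeploy.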
Some examples of feature flags
Feature flags come in several different flavors. Among them are:
- Release flags that enable trunk-based development. “Release Toggles allow incomplete and un-tested codepaths to be shipped to production as latent code which may never be turned on,” Pete Hodgson, an independent software delivery consultant, wrote in a post on MartinFowler.com.
- Experiment flags that leverage A/B testing to make data-driven optimizations. “By their nature Experiment Toggles are highly dynamic – each incoming request is likely on behalf of a different user and thus might be routed differently than the last,” Hodgson wrote.
- Ops flags, which enable teams to control operational aspects of their solution’s behavior. Hodgson explained “We might introduce an Ops Toggle when rolling out a new feature which has unclear performance implications so that system operators can disable or degrade that feature quickly in production if needed.”
- Permission flags that can change the features or experience for certain users. “For example we may have a set of ‘premium’ features which we only toggle on for our paying customers. Or perhaps we have a set of “alpha” features which are only available to internal users and another set of “beta” features which are only available to internal users plus beta users,” Hodgson wrote.
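To make two of these flavors concrete, the following Python sketch shows a permission flag and an ops flag side by side. Everything in it is assumed for illustration: the `User` type, the flag names, the group names and the in-memory dictionaries stand in for whatever flag store a real team would use.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    groups: set = field(default_factory=set)


# Hypothetical configuration: which user groups may see each feature.
PERMISSION_FLAGS = {
    "premium-reports": {"paying"},
    "alpha-dashboard": {"internal"},
    "beta-search": {"internal", "beta"},
}

# Hypothetical ops flags: operators flip these to disable or degrade a feature.
OPS_FLAGS = {"recommendations-enabled": True}


def has_permission(flag: str, user: User) -> bool:
    """Permission flag: on only for users in an allowed group."""
    allowed = PERMISSION_FLAGS.get(flag, set())
    return bool(allowed & user.groups)


def ops_enabled(flag: str) -> bool:
    """Ops flag: a global switch, off by default if the flag is unknown."""
    return OPS_FLAGS.get(flag, False)


def render_home(user: User) -> list:
    """Build the list of page sections this user should see."""
    sections = ["catalog"]
    if ops_enabled("recommendations-enabled"):
        sections.append("recommendations")
    if has_permission("beta-search", user):
        sections.append("beta search")
    return sections
```

In this sketch, setting `OPS_FLAGS["recommendations-enabled"]` to `False` removes the recommendations section for everyone at once, while the beta search section stays limited to internal and beta users.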
One way to look at it is through the concept of canary releases, according to Kodumal: releasing a change while limiting its exposure to a smaller audience, then validating the change before rolling it out more broadly.
These flags help minimize the blast radius of possible messy situations, according to Forrester’s Condo. “You’re slowly gauging the success of your application based on: Is it working as planned? Do customers find it useful? Are they complaining? Has the call value gone up or stayed steady? Are the error logs growing?” As developers implement progressive delivery, they will become better at detecting when things are broken, Condo explained.
“The first thing is to get the hygiene right so you can build software more often with less drama. Implement progressive delivery so you can get that all the way to production. Then dip your toes into experimentation by making sure you have that data automated,” said Split’s Karow.
Feature experimentation: Run
Feature experimentation is similar to progressive delivery, but with better data, according to Karow.
“Feature experimentation takes progressive delivery further by looking at the data and not just learning whether or not something blew up, but why it did,” he said.
Being able to consume the data and understand why things happen enables businesses to make better data-driven decisions. The whole reason you do smaller releases is to confirm they are having the impact you were looking for, that there are no bugs, and that you are meeting users’ expectations, according to Optimizely’s Noronha.
It does that through A/B testing, multi-armed bandits, and chaos experiments, according to LaunchDarkly’s Kodumal. A/B testing tests multiple versions of a feature to see how each is received. A multi-armed bandit is a variation of an A/B test, but instead of waiting for the test to complete, it uses algorithms to shift traffic allocations toward the better-performing versions while the test is still running. And chaos experiments refer to finding out what doesn’t work rather than looking for what does work.
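The traffic-shifting idea behind a multi-armed bandit can be sketched in a few lines of Python. This example uses epsilon-greedy, one of the simplest bandit strategies, chosen here for illustration only; the article does not say which algorithm any vendor uses. The class name, the `simulate` helper and the conversion rates passed to it are all hypothetical.

```python
import random


class EpsilonGreedyBandit:
    """Mostly serve the best-looking variant, but keep exploring the others."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.variants = list(variants)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {v: 0 for v in self.variants}
        self.successes = {v: 0 for v in self.variants}

    def rate(self, variant):
        """Observed success rate so far (0.0 before any traffic)."""
        shows = self.shows[variant]
        return self.successes[variant] / shows if shows else 0.0

    def choose(self):
        """Pick a variant for the next user."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.variants)  # explore
        return max(self.variants, key=self.rate)  # exploit

    def record(self, variant, success):
        """Store the outcome of showing a variant to one user."""
        self.shows[variant] += 1
        if success:
            self.successes[variant] += 1


def simulate(true_rates, rounds=5000, epsilon=0.1, seed=42):
    """Run a simulated test; true_rates are made-up conversion rates."""
    bandit = EpsilonGreedyBandit(true_rates.keys(), epsilon=epsilon, seed=seed)
    outcome_rng = random.Random(seed + 1)
    for _ in range(rounds):
        variant = bandit.choose()
        converted = outcome_rng.random() < true_rates[variant]
        bandit.record(variant, converted)
    return bandit.shows
```

Calling `simulate({"A": 0.05, "B": 0.15})` returns how many simulated users saw each variant. Over enough rounds, the variant with the higher underlying rate tends to receive most of the traffic, which is the behavior Kodumal describes; a classic A/B test would instead hold the split fixed until the test ends.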
“You might drive a feature experiment that is intended to do something like improve engagement around a specific feature you are building,” said Kodumal. “You define the metric, build the experiment, and validate whether or not the change being made is being received positively.”
Feature experimentation is becoming so popular because it enables development teams to deploy code without actually turning it on right away. You can deploy it into production, test it in production, without the general user base seeing it, and either release it or keep it hidden until it’s ready, Forrester’s Condo explained.
In some cases, a business may decide to release the feature or new solution to its users, but give them the ability to turn it on or off themselves and see how many people like the enhanced experience. “Feature experimentation makes that feature a system of record. It becomes part of how you deliver experiences to your customers in a varied experience,” said Condo. “It’s like the idea of Google. How many times on Google or Gmail has it said ‘here is a brand new experience, do you want to use it?’ And you said ‘no I’m not ready.’ It is allowing companies to modernize in smaller pieces rather than all at once.”
Feature experimentation focuses on the measurement side, while progressive delivery focuses on just releasing smaller pieces. “Now you are comparing the 10% release against the other 90% to see what the difference is, measuring that, understanding the impact, quantifying it, and learning what’s actually working,” said Optimizely’s Noronha.
While it does reduce risks for businesses, it doesn’t eliminate the chance for failure. Karow explained businesses have to be willing to accept failure or they are not going to get very far. “At the end of the day, what really matters is whether a feature is going to help a user or make them want to use it or not. What a lot of these techniques are about is how do I get hard data to prove what actually works,” Karow explained.
To get started, Noronha recommends looking for parts of the user experience that drive traffic and making simple changes to experiment with. Once teams prove the approach out and get it entrenched in one area, it can spread to other areas more easily.
“It’s sort of addictive. Once people get used to working in this way, they don’t want to go back to just launching things. They start to resent not knowing what the adoption of their product is,” he said.
Noronha expects progressive delivery and feature experimentation will eventually merge. “Everyone’s going to roll out into small pieces, and everyone’s going to measure how those things are doing against the control,” he said.
“What both progressive delivery and feature experimentation do is provide the ability to de-risk your investment in new software and R&D. They give you the tooling you need to think about decomposing those big risky things into smaller, achievable things where you have faster feedback loops from customers,” LaunchDarkly’s Kodumal added.
Experimenting with A/B testing
A/B testing is one of the most common types of experiments, according to John Kodumal, CTO and co-founder of LaunchDarkly, a feature flag and toggle management company.
It is the method of comparing two versions of an application or functionality. Previously, it was more commonly used for front-end or visual aesthetic changes done to a website rather than a product. For instance, one could take a button that was blue and make it red, and see if that drives more clicks, Jon Noronha, senior vice president of product at Optimizely, a progressive delivery and experimentation platform provider, explained. “In the past several years, we’ve really transitioned to focusing more on what I would call feature experimentation, which is really building technology that helps people test the core logic of how their product is actually built,” he said.
A/B testing is used in feature experimentation to test out two competing theories and see which one achieves the result the team is looking for. Christopher Condo, a principal analyst at the research firm Forrester, explained that “It requires someone to know and say ‘I think if we alter this experience to the end user, we can improve the value.’ You as a developer want to get a deeper understanding of what kind of changes can improve the UX and so A/B testing comes into play now to show different experiences from different people and how they are being used.”
According to Dave Karow, continuous delivery evangelist at Split, a feature flag, experimentation and CD solution provider, this is especially useful in environments where a “very important person” within the business has an opinion or the “highest paid person” on the team wants you to do something and a majority of the team members don’t agree. He explained that what someone thinks is going to work normally doesn’t work 8 or 9 times out of 10. But with A/B testing, developers can still test out that theory, and if it fails they can provide metrics and data on why it didn’t work without having to release it to all their customers.
A good A/B test statistical engine should be able to tell you within a few days which experience or feature is better. Once you know which version is performing better, you can slowly replace it and continue to iterate to see if you can make it work even better, Condo explained.
Kodumal explained A/B testing works better with feature experimentation because in progressive delivery the customer base you are gradually delivering to is too small to run full experiments on and achieve the statistical significance of a fully rigorous experiment.
“We often find that teams get value out of some of the simpler use cases in progressive delivery before moving onto full experimentation,” he said.
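Kodumal’s point about audience size and statistical significance can be illustrated with a standard two-proportion z-test, sketched below in Python. This is a textbook calculation, not a description of how any vendor’s statistical engine works, and the function name and the conversion counts in the usage note are invented for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and variant (B).

    Returns (z, p_value), where p_value is two-sided under a normal
    approximation with a pooled standard error.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if pooled <= 0.0 or pooled >= 1.0:
        raise ValueError("need at least one conversion and one non-conversion")
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With made-up numbers, `two_proportion_z_test(10, 100, 12, 100)` compares a 10% and a 12% conversion rate on 100 users each, and the result is far from the usual 0.05 significance threshold. The same rates on 10,000 users each, `two_proportion_z_test(1000, 10000, 1200, 10000)`, fall well below it. The observed difference is identical; only the audience size changed.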
Feature experimentation is for any company with user-facing technology
Feature experimentation has already been used among industry leaders like eBay, LinkedIn and Netflix for years.
“Major redesigns…improve your service by allowing members to find the content they want to watch faster. However, they are too risky to roll out without extensive A/B testing, which enables us to prove that the new experience is preferred over the old,” Netflix wrote in a 2016 blog post explaining its experimentation platform.
Up until recently it was only available to those large companies because it was expensive. The alternative was to build your own product, with the time and costs associated with that. “Now there is a growing marketplace of solutions that allow anyone to do the same amount of rigor without having to spend years and millions of dollars building it in-house,” said Dave Karow, continuous delivery evangelist at Split, a feature flag, experimentation and CD solution provider.
Additionally, feature experimentation used to be a hard process to get started with, with no real guidelines to follow. What has started to happen is that large companies are sharing how their engineering teams operate and providing more information on what goes on behind the scenes, according to Christopher Condo, a principal analyst at the research firm Forrester. “In the past, you never gave away the recipe or what you were doing. It was always considered intellectual property. But today, sharing information, people realize that it’s really helping the whole industry for everybody to get better education about how these things work,” Condo said.
Today, the practice has expanded into something that every major company with some kind of user-facing technology can and should take advantage of, according to Jon Noronha, senior vice president of product at Optimizely, a progressive delivery and experimentation platform provider.
Noronha predicts feature experimentation “will eventually grow to be adopted the same way we see things like source control and branching. It’s going to go from something that just big technology companies do to something that every business has to have to keep up.”
“Companies that are able to provide that innovation faster and bring that functionality that consumers are demanding, they are the ones that are succeeding, and the ones that aren’t are the ones that are left behind and that consumers are starting to move away from,” John Kodumal, CTO and co-founder of LaunchDarkly, a feature flag and toggle management company, added.