The way world-class athletes train for the Olympic Games and the way you engineer the performance of your applications for a production release share some interesting traits. In this blog, I want to share a few guiding principles that will help your applications finish strong.
Efficiency, Not Just Speed
Not every Olympic sport is a speed race. In plenty of events, winning is about strength or agility. Not all events place the same physical demands on the body, nor do they all test the same kind of capability. Efficiency is the common thread. Whether you are skiing down a slope or lifting a heavy weight, the more efficiently the task is carried out, the better the result generally is. The same is true for applications.
Performance engineering is often treated as an exercise in measuring how fast something is, or how fast it can be tuned to be. However, speed for the sake of speed can create other problems. I have heard many clients state that their main goal was to make performance testing automated and repeatable in a Continuous Integration (CI) pipeline. Building that automation simply so testing runs faster isn't valuable in and of itself. The value is in getting performance feedback as early as possible, in an automated fashion. What if this fast, automated CI pipeline doesn't find performance problems, yet your organization still wrestles with performance issues in production? It may be that what you automated to run so fast isn't the right thing to test.
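To make that concrete, here is a minimal sketch of what early, automated performance feedback might look like: a small Python script a CI stage could run against a staging endpoint, failing the build when the 95th-percentile response time exceeds an agreed budget. The endpoint URL, sample count, and budget below are hypothetical placeholders, not values from any specific pipeline.

```python
# perf_gate.py -- a minimal CI performance gate (illustrative sketch only).
# The endpoint, sample size, and threshold below are hypothetical placeholders.
import statistics
import sys
import time
import urllib.request

ENDPOINT = "https://staging.example.com/health"   # hypothetical staging URL
SAMPLES = 50                                      # small, fast smoke check
P95_BUDGET_MS = 500                               # agreed performance budget

def measure_once(url: str) -> float:
    """Time a single request and return the elapsed milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000

def main() -> int:
    timings = [measure_once(ENDPOINT) for _ in range(SAMPLES)]
    p95 = statistics.quantiles(timings, n=20)[-1]  # last cut point = 95th percentile
    print(f"p95={p95:.1f} ms, median={statistics.median(timings):.1f} ms")
    # A non-zero exit code fails the CI stage, surfacing the regression early.
    return 1 if p95 > P95_BUDGET_MS else 0

if __name__ == "__main__":
    sys.exit(main())
```

A gate like this only earns its keep if the scenario it exercises reflects what actually hurts in production, which is exactly the point above.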
What happens when making things faster actually signals a failure? What if the timing improved because an important verification was accidentally left out of the latest build? The numbers look better, but the missing verification means critical data could be lost. In that case, faster is not better.
Instead of focusing on speed, focus on efficiency. That means reducing toil by handing repetitive tasks to the computers that are better at them. It doesn't replace the need for manual review at key points in the software development life cycle; someone who understands the code still needs to verify the results. It also means using every resource to its full potential while containing infrastructure costs and keeping the cloud bills low. Look at every aspect of the system under test and determine how to use each component most efficiently. That will get you started toward creating and maintaining Olympian applications.
Observe and Monitor
Have you noticed all of the things athletes have to monitor? Speed, heart rate, body fat ratio, and so on. Why do they do this? To understand their health. What about all the statistics that come out of the Olympic Games? Every possible metric about a sport is tracked and available online within minutes. Why? To monitor and assess the performance of the athletes, and to see how those performances shape the sport overall.
Applications need to be monitored to determine health and performance. End user experience can be timed with synthetic processes that run every few minutes and report back page times. Real User Monitoring can time actual live sessions across a wide range of users to determine if there are performance issues with specific user types or geographic regions. Infrastructure monitoring can tell if any system resources are running low, being throttled, or throwing errors. Logs can be parsed to see the response times of individual requests.
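As one small illustration of that last point, the sketch below pulls response times out of an access log and summarizes them per endpoint. The log line format, field layout, and file name are assumptions; adapt the regular expression to whatever your web server or gateway actually writes.

```python
# log_latency.py -- summarize per-endpoint response times from an access log.
# Illustrative sketch: the log line format and file path are assumptions.
import re
import statistics
from collections import defaultdict

# Assumed line shape: ... "GET /api/orders HTTP/1.1" 200 123ms
LINE = re.compile(r'"(?P<method>\w+) (?P<path>\S+) HTTP/[\d.]+" \d{3} (?P<ms>\d+)ms')

def summarize(log_path: str) -> None:
    timings = defaultdict(list)
    with open(log_path) as log:
        for line in log:
            match = LINE.search(line)
            if match:
                timings[match.group("path")].append(int(match.group("ms")))
    for path, values in sorted(timings.items()):
        # Use the 95th percentile when there is enough data, otherwise the max.
        p95 = statistics.quantiles(values, n=20)[-1] if len(values) >= 20 else max(values)
        print(f"{path}: count={len(values)} avg={statistics.mean(values):.0f}ms p95={p95:.0f}ms")

if __name__ == "__main__":
    summarize("access.log")  # hypothetical log file name
```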
Code can be profiled to see how heavy each function is, and databases can be profiled to see how expensive each request for data is. The network can be monitored to determine whether there is enough bandwidth to service the number of requests. Modern applications built on microservices and containers need additional telemetry from hop to hop. The path from the server to the user might be different each time a request is made, and the container that served a request an hour ago may no longer exist because the application is elastic.
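For function-level profiling, Python's standard-library cProfile module is one quick way to see which functions are heaviest. The workload below is a hypothetical stand-in for real application code, not an excerpt from any particular system.

```python
# profile_demo.py -- see how much time each function consumes with cProfile.
# The workload functions are hypothetical stand-ins for real application code.
import cProfile
import pstats

def build_report(rows: int) -> list:
    """Pretend data-shaping step that deliberately does redundant work."""
    return [sorted(str(i) * 50) for i in range(rows)]

def handle_request() -> None:
    """Pretend request handler that calls the heavy helper."""
    build_report(5_000)

if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.enable()
    handle_request()
    profiler.disable()
    # Print the ten most expensive functions by cumulative time.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```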
In all of these cases, what gets measured matters for finding the root causes of performance issues and for understanding the health of the overall system. Simply producing load on a system with a performance testing tool, without monitoring, is not a valuable exercise. You might determine which user timings are high, but you won't know why.
The value of any performance testing tool is its ability to provide meaningful results to the business, and those results come from graphs built on key metrics. Those metrics come from the monitoring you have in place. You can over-monitor to the point of affecting the performance of the system itself, so it is important to find the right balance. Even then, there is a lot of data, and it may take special expertise to make sense of it all.
Some products actually use AI and machine learning to help with this, but human expertise is still a requirement. What are you looking at to determine the health and performance of your applications?
Rinse Your Cottage Cheese
In the book “Good to Great,” Jim Collins tells the story of Dave Scott, who won the Ironman Triathlon six times. When asked about the secrets to his success, he attributed it to his attention to detail. Dave was willing to go so far as to rinse his cottage cheese to remove any extra fat before eating it. Athletes competing at the highest level don't consider this strange at all. Whatever it takes. Think of the special clothing athletes wear to reduce friction in movement. Paying attention to the minutiae sometimes means the difference between winning and losing.
Modern applications are complex and require the same attention to detail. Performance testing 20% of the application at the very last moment before deployment no longer effectively reduces the risk of a production release.
Simple monitoring of a few key metrics may not help you find application bottlenecks before your customers do. Performance has to be engineered into products from the start, and it needs to be a continuous activity, not a one-time event. Observability tools, rather than traditional APM tools, might be required to pinpoint exact points of failure. It may take AIOps to work through all the detailed data quickly.
This is easier said than done; it is a journey, and it takes attention to detail to get a well-oiled machine running. How far are you willing to go to get performance feedback at the earliest possible stage? It may take going to great lengths to squeeze out the last 10% from the application. When the cost of switching is low for a customer (meaning they can go somewhere else with the click of a button), attention to detail may be the difference between retaining and losing that customer.
See the End From the Beginning
Many athletes see themselves winning before they ever do. Through repetitive, relentless training, they have relived the steps of the event in their mind over and over, until they can see themselves at the finish line with a medal before they ever reach the Olympic field. Some of the biggest gold medal winners pictured that end result every time they considered giving up during training. It is an important exercise for reinforcing their determination. They have to see the end from the beginning.
We have all been on the other side of a conversation where someone was using a computer to process a transaction for us and we heard the words, “I’m sorry, the computers are slow today.” The impression left on the end user is that the application or the system is of poor quality. They don’t care that every piece of code runs fast individually, or that the database returns data in milliseconds. They don’t care how much money was spent on servers. If they see a spinning wheel, or the screen takes too long to return the information they need, the end user experience will be poor. When all stakeholders involved in the software development process understand that every piece of the puzzle matters, and that every area can affect application performance for better or worse, it helps motivate them to optimize at every stage. Keeping the end user in mind even in the earliest stages of software development goes a long way toward making sure the experience is the best it can be. We all have to see the end from the beginning.
Go For the Gold, Not For the Bronze
In order to win, you have to play to win. Olympic athletes train for years for the moment when everything is on the line. They did not train and sacrifice all of those hours, months, and years to finish third.
Companies who want to win the performance race have people of passion who live, eat, and breathe the discipline of performance testing. The performance engineer never believes “good enough” is good enough, even if the business does. They are never done analyzing, testing, or monitoring. There is always another optimization to be done. They are always learning, and performance becomes a continuous process.
While keeping the lights on is imperative for the success of any business, it won’t thrive without innovation. This means listening to customers and reducing as much of their toil as possible. It means thinking outside the box to come up with new ways to accomplish old tasks.
Companies like Netflix aren't in the performance testing or engineering business, yet they have given the performance community a wealth of new areas to explore (chaos engineering, flame graphs, etc.) because their engineers were allowed to innovate. And by any standard, the company has done pretty well. Are you taking every opportunity to innovate and do things differently so your applications are worthy of gold?