When changing lanes on the highway, one of the most important things for drivers to remember is to always check their blind spot. Failing to do this could lead to an unforeseen, and ultimately avoidable, accident.
The same is true for development teams in an organization. Failing to provide developers with insight into their tools and processes could lead to unaddressed bugs and even system failures in the future.
This is why the importance of providing developers with ample observability cannot be overstated. Without it, the job of the developer becomes one big blind spot.
Why is it important?
“One of the important things that observability enables is the ability to see how your systems behave,” said Josep Prat, open-source engineering director at data infrastructure company Aiven. “So, developers build features which belong to a production system, and then observability gives them the means to see what is going on within that production system.”
He went on to say that developer observability tools don’t just inform the developer when something is wrong; rather, they dig deeper to help determine the root cause of the problem.
David Caruana, UK-based software architect at content services company Hyland, stressed that these deep insights are especially important in the context of DevOps.
“That feedback is essential for continuous improvement,” Caruana said. “As you go around that loop, feedback from observability feeds into the next development iteration… So, observability really gives teams the tools to increase the quality of service for customers.”
The in-depth insights it provides are what set observability apart from monitoring or visibility, which tend to address what is going wrong at a more surface level.
According to Prat, visibility tools alone are not enough for development teams to address flaws with the speed and efficiency that are required today.
The deeper insights that observability brings to the table need to work in conjunction with visibility and monitoring tools.
With this, developers gain the most comprehensive view into their tools and processes.
“It’s more about connecting data as well,” Prat explained. “So, if you look at monitoring or visibility, it’s a collection of data. We can see these things and we can understand what happened, which is good, but observability gives us the connection between all of these pieces that are collected. Then we can try to make a story and try to find out what was going on in the system when something happened.”
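The connection Prat describes can be illustrated with a minimal sketch. It assumes that every log line, metric sample, and span carries a shared trace ID, which is a common convention but not something the interviewees specified. The record fields and sample values below are hypothetical.

```python
from collections import defaultdict

# Hypothetical telemetry records; the field names are illustrative assumptions.
RECORDS = [
    {"kind": "log", "trace_id": "t1", "ts": 3, "detail": "payment failed: timeout"},
    {"kind": "span", "trace_id": "t1", "ts": 1, "detail": "checkout -> payment (2900 ms)"},
    {"kind": "metric", "trace_id": "t1", "ts": 2, "detail": "db_pool_in_use=50/50"},
    {"kind": "log", "trace_id": "t2", "ts": 1, "detail": "login ok"},
]


def build_stories(records):
    """Group records by trace ID and order each group by timestamp."""
    stories = defaultdict(list)
    for record in records:
        stories[record["trace_id"]].append(record)
    return {
        trace_id: sorted(items, key=lambda r: r["ts"])
        for trace_id, items in stories.items()
    }


if __name__ == "__main__":
    # Print the connected "story" for one request.
    for rec in build_stories(RECORDS)["t1"]:
        print(rec["ts"], rec["kind"], rec["detail"])
```

Taken separately, the three records for `t1` are just collected data. Ordered together, they read as a sequence of events for a single request, which is the kind of story Prat refers to.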
John Bristowe, community director at deployment automation company Octopus Deploy, expanded on this, explaining that observability empowers development teams to make the best decisions possible going forward.
These decisions bear on goals such as increasing reliability and fixing bugs, which can lead to major performance improvements.
“And developers know this… There are a lot of moving parts and pieces and it is kind of akin to ‘The Wizard of Oz’ … ‘ignore the man behind the curtain,’” Bristowe said. “When you pull back that curtain, you’re seeing the Wizard of Oz and that is really what observability gives you.”
According to Vishnu Vasudevan, head of product at the continuous orchestration company Opsera, developer interest in observability is still somewhat new.
He explained that in the last five years, as DevOps has become the standard for organizations, developer interest in observability has grown exponentially.
“Developers used to think that they can push products into the market without actually learning about anything around security or quality because they were focusing only on development,” Vasudevan said. “But without observability… the code might go well at first but sometime down the line it can break and it is going to be very difficult for development teams to fix the issue.”
The move to cloud native
In recent years, the transition to cloud native has shaken up the software development industry. Caruana said that he believes the move into the cloud has been a major driver for observability.
He explained that with the complexity that cloud native introduces, gaining deep insights into the developer processes and tooling is more essential than ever before.
“If you have development teams that are looking to move towards cloud-native architectures, I think that observability needs to be a core part of that conversation,” Caruana said. “It’s all about getting that data, and if you want to make decisions… having the data to drive those decisions is really valuable.”
According to Prat, this shift to cloud native has also led to observability tools becoming more dynamic.
“When we had our own data centers, we knew we had machines A,B,C, and D; we knew that we needed to connect to certain boxes; and we knew exactly how many machines were running at each point in time,” he said. “But, when we go to the cloud, suddenly systems are completely dynamic and the number of servers that we are running depends on the load that the system is having.”
Prat explained that because of this, it is no longer enough to know which boxes to connect to; teams now need a full understanding of which machines are entering and leaving the system so that connections can be made and the development team can determine what is going on.
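As a rough illustration of tracking machines that enter and leave a system, the sketch below compares successive snapshots of the running instances. The `FleetTracker` class and the instance IDs are hypothetical; in practice the snapshots would come from a cloud provider or service registry, which is not modeled here.

```python
def diff_fleet(previous, current):
    """Return (joined, left) as sorted lists of instance IDs."""
    previous, current = set(previous), set(current)
    return sorted(current - previous), sorted(previous - current)


class FleetTracker:
    """Remembers the last snapshot so each update reports only the changes."""

    def __init__(self):
        self.known = set()

    def update(self, snapshot):
        joined, left = diff_fleet(self.known, snapshot)
        self.known = set(snapshot)
        return joined, left


if __name__ == "__main__":
    tracker = FleetTracker()
    print(tracker.update({"vm-a", "vm-b"}))  # both instances are new
    print(tracker.update({"vm-b", "vm-c"}))  # vm-c joined, vm-a left
```

An observability tool could use the `joined` list to start collecting from new machines and the `left` list to stop waiting on machines that no longer exist.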
Bristowe also explained that while the shift to cloud native can be a positive thing for the observability space, it has also made it more complicated.
“Cloud native is just a more complex scenario to support,” he said. “You have disparate systems and different technologies and different ways in which you’ll do things like logging, tracing, metrics, and things of that sort.”
Because of this, Bristowe emphasized the importance of integrating proper tooling and processes in order to work around any added complexities.
Prat believes that the transition to cloud native brings not only new complexities but also a new level of dynamism to the observability space.
“Before it was all static and now it is all dynamic because the cloud is dynamic. Machines come, machines go, services are up, services are down and it is just a completely different story,” he said.
Opsera’s Vasudevan also stressed that moving into the cloud has put more of an emphasis on the security benefits that observability can offer.
He explained that while moving into the cloud has helped the velocity of deployments, it has added a plethora of possible security vulnerabilities.
“And this is where that shift happened and developers really started to understand that they do need to have this observability in place to understand what the bottlenecks and the inefficiencies are that the development team will face,” he said.
The risks of insufficient observability
When companies fail to provide their development teams with high-level observability, Prat said it can feel like regressing to the dark ages.
He explained that without observability, the best developers can do is venture a guess as to why things are behaving the way that they are.
“We would need to play a lot of guessing games and do a lot more trial and error to try and reproduce mistakes… this leads to countless hours and trying to understand what the root cause was,” said Prat.
This, of course, reduces an organization’s ability to remain competitive, something that companies cannot afford to risk.
He emphasized that while investing in observability is not some kind of magic cure-all for bugs and system failures, it can certainly help in remediation as well as prevention.
Bristowe went on to explain that observability is really all about the DevOps aspect of investing in people, processes, and tools alike.
He said that while there are some really helpful tools available in the observability space, making sure developers are on board with learning these tools and integrating them properly into their processes is the key element of successful observability.
Observability and productivity
Prat also emphasized that investing in observability correlates strongly with higher productivity in an organization. This is because it enables developers to feel more secure in the products they are building.
He said that this sense of security also helps when applying user feedback and implementing new features per customer requests, leading to heightened productivity as well as strengthening the organization’s relationship with its customer base.
With proper observability tools, a company will be able to deliver better features more quickly as well as constantly work to improve the resiliency of its systems. Ultimately, this gives end users a better overall experience as well as faster performance.
“The productivity will improve because we can develop features faster, because we can know better when things break, and we can fix the things that break much faster because we know exactly why things are being broken,” Prat said.
Vasudevan explained that when code is pushed to production without developers truly understanding it, technical debt and bottlenecks are pretty much a guarantee, resulting in a poorer customer experience.
“If you don’t have the observability, you will not be able to identify the bottlenecks, you will not be able to identify the inefficiencies, and the code quality is going to be very poor when it goes into production,” he said.
Bristowe also explained that there are times when applications are deployed into production and yield unplanned results. Without observability, the development team may not even notice this until damage has already been caused.
“The time to fix bugs, time to resolution, and things like that are critical success factors and you want to fix those problems before they are discovered in production,” Bristowe said. “Let’s face it, there is no software that’s perfect, but having observability will help you quickly discover bottlenecks, inefficiencies, bugs, or whatever it may be, and being able to gain insight into that quickly is going to help with productivity for sure.”
Aiven’s Prat noted that observability also enables developers to see where and when they are spending most of their time so that they can tweak certain processes to make them more efficient.
When working on a project, developers strive for immediate results. Observability helps them when it comes to understanding why certain processes are not operating as quickly as desired.
“So, if we are spending more time on a certain request, we can try and find why,” Prat explained. “It turns out there was a query on the database or that it was a system that was going rogue or a machine that needed to be decommissioned and wasn’t, and that is what observability can help us with.”
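To make this concrete, here is a minimal sketch of finding where a slow request spends its time. It assumes the request has been recorded as a tree of timed spans whose children run one after another, so a span's own time is its duration minus its children's. The span names and durations are invented for illustration.

```python
# Hypothetical spans for one request; durations are in milliseconds.
SPANS = [
    {"id": "1", "parent": None, "name": "GET /orders", "ms": 1200},
    {"id": "2", "parent": "1", "name": "auth check", "ms": 40},
    {"id": "3", "parent": "1", "name": "db query: orders", "ms": 1100},
    {"id": "4", "parent": "3", "name": "network", "ms": 50},
]


def self_times(spans):
    """Map each span name to its duration minus its direct children's durations."""
    child_total = {}
    for span in spans:
        if span["parent"] is not None:
            child_total[span["parent"]] = child_total.get(span["parent"], 0) + span["ms"]
    return {span["name"]: span["ms"] - child_total.get(span["id"], 0) for span in spans}


def slowest(spans):
    """Return the name of the span with the largest self time."""
    times = self_times(spans)
    return max(times, key=times.get)


if __name__ == "__main__":
    print(slowest(SPANS))
```

In this invented trace, the top-level request looks slow, but nearly all of its time is spent in the database query, which matches the first scenario Prat mentions.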
Automation and observability
Bristowe emphasized the impact that AI and automation can have on the observability space.
He explained that tools such as ChatGPT have really brought strong AI models into the mainstream and showcased the power that this technology holds.
He believes this same power can be brought to observability tools.
“Even if you are gathering as much information as possible, and you are reporting on it, and doing all these things, sometimes even those observations still aren’t evident or apparent,” he said. “But an AI model that is trained on your dataset, can look and see that there is something going on that you may not realize.”
Caruana added that AI can help developers better understand what the natural health of a system is, as well as quickly alert teams when there is an anomaly.
He predicts that in the future we will start to see automation play a much bigger role in observability tools, such as filtering through alerts to select the key root-cause alerts that the developer should focus on.
“I think going forward, AI will actually be able to assist in the resolution of those issues as well,” Caruana said. “Even today, it is possible to fix things and to resolve issues automatically, but with AI, I think resolution will become much smarter and much more efficient.”
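The idea of learning a system's normal health and flagging departures from it can be sketched with a simple statistical baseline. This is a stand-in for illustration only, not a trained AI model of the kind Bristowe and Caruana describe. The window size, threshold, and latency values are assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indexes whose value sits more than `threshold` standard
    deviations away from the mean of the preceding `window` values."""
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        # Skip flat baselines, where the standard deviation is zero.
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical request latencies in milliseconds.
    latencies = [100, 102, 98, 101, 99, 100, 250, 101]
    print(find_anomalies(latencies))
```

A trailing window means the baseline adapts as the system's normal behavior shifts, although a single spike also inflates the baseline for the next few samples.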
Both Bristowe and Caruana agreed that AI observability tools will yield wholly positive results for both development teams and the organization in general.
Bristowe explained that this is because the more tooling brought in and the more insights offered to developers, the better off organizations will be.
However, Opsera’s Vasudevan had a slightly different take.
He said that bringing automation into the observability space may end up costing organizations more than they would gain.
Because of this risk, he stressed that organizations would need to be sure to implement the right automation tools so that teams can gain the actionable intelligence and the predictive insights that they actually need.
“I would say that having a secure software supply chain is the first thing and then having observability as that second layer and then the AI and automation can come in,” Vasudevan said. “If you try to build AI into your systems and you do not have those first two things, it may not add any value to the customer.”
How to approach observability
When it comes to making sure developers are provided with the highest level of observability possible, Prat has one piece of advice: utilize open-source tooling.
He explained that with tools like these, developers are able to connect several different solutions rather than feeling boxed into one single tool. This ensures that they are able to have the most well-rounded and comprehensive approach to observability.
“You can use several tools and they can probably play well together, and if they are not then you can always try and build a connection between them to try and help to close the gap between two tools so that they can talk to each other and share data and you can get more eyes looking at your problem,” Prat said.
Caruana also explained the importance of implementing observability with room for evolution.
He said that starting small and building observability out based on feedback from developers is the best way to be sure teams are being provided with the deepest insights possible.
“As you do with all agile processes, iteration is really key, so start small, implement something, get that feedback, and make adjustments as you go along,” Caruana said. “I think a big bang approach is a high risk approach, so I choose to evolve, and iterate, and see where it leads.”