Datadog today is revealing its vision for bringing security and performance monitoring together, in the form of updates and new product features for its cloud infrastructure monitoring platform.
At its virtual DASH conference this week, the company announced Error Tracking, Incident Management, Compliance Monitoring and Continuous Profiler, rounding out its platform to make it easier for developers to find deep performance issues with their applications. For operations teams, the new Incident Management product enables debugging and issue resolution, and for security and compliance teams, full visibility into cloud environments gives them a means to ensure misconfigurations don’t create problems.
These products join the company’s existing infrastructure monitoring, APM and log management capabilities in the Datadog platform.
“In our opinion, security and observability are both coming together in modern applications. What used to be siloed security teams, and development teams and operations teams, in modern web-based applications they’re all starting to come together,” Amit Agarwal, chief product officer at Datadog, said in a briefing on the announcements. “Applications have become Agile; you make changes to it every day. So they need to be in lockstep and sync to solve many of the problems. What we are offering to our customers is a single platform to do both monitoring and security, because it’s all based on the same data… the same logs are used in one context by developers and operations people, to see why performance is poor, and the same ones are used by security people to see, well, maybe the performance is bad because someone is doing a denial of service attack.”
The Error Tracking tool, which becomes available today, focuses on how errors are affecting the customer experience. It aggregates the errors occurring across all of an application’s users into a small list of issues that represent the specific bugs users are encountering. “This provides us a better overview of the health of the application, rather than a firehose of data,” said Ilan Rabinovitch, vice president of product and community at Datadog. “Developers take advantage of our RUM product, APM and logging. Logs and APM let them get a good sense of what the experience looks like server-side, and our real user monitoring product emits telemetry from the user side, either web or mobile traffic, to see how it’s performing on the actual users’ computers. By combining the three, we get a pretty good picture of the customers’ experience.”
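To make the idea of collapsing a "firehose" of errors into a short list of issues concrete, here is a minimal sketch in Python. It is not Datadog's implementation, which the company did not describe in detail. The event fields (`user`, `type`, `message`) and the helper names `fingerprint` and `group_errors` are hypothetical, and the grouping rule (same error type plus a message with digits masked) is an assumption chosen only for illustration.

```python
import re
from collections import defaultdict


def fingerprint(error_type, message):
    """Build a grouping key; digits are masked so IDs don't split one bug into many."""
    normalized = re.sub(r"\d+", "<n>", message)
    return (error_type, normalized)


def group_errors(events):
    """Collapse raw error events into issues, most frequent first."""
    issues = defaultdict(lambda: {"count": 0, "users": set()})
    for event in events:
        key = fingerprint(event["type"], event["message"])
        issues[key]["count"] += 1
        issues[key]["users"].add(event["user"])
    return sorted(issues.items(), key=lambda item: item[1]["count"], reverse=True)


# Hypothetical sample events, invented for this sketch.
events = [
    {"user": "u1", "type": "TypeError", "message": "cart 1042 has no items"},
    {"user": "u2", "type": "TypeError", "message": "cart 77 has no items"},
    {"user": "u2", "type": "TimeoutError", "message": "checkout timed out after 30 s"},
]

issues = group_errors(events)
for (error_type, message), stats in issues:
    print(error_type, message, stats["count"], len(stats["users"]))
```

In this sketch, the two cart errors differ only in the cart ID, so they land in a single issue affecting two users, while the timeout stays separate. A production system would likely also consider stack traces and other signals when deciding which errors belong together.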
The Continuous Profiler, like traditional profilers, measures the performance of an application and gives visibility down to the line of code where the problem exists. “When deploying code, every application developer has these three questions in mind,” explained Renaud Boutet, vice president of product at Datadog. “Am I delivering a fast user experience? Am I over-consuming resources? And, probably more stressful, am I going to create an incident in production? Historically, people have been using profiling solutions to mediate and solve these problems… however, legacy profiling tools have such a performance overhead that they are used almost exclusively at the development stage. Meanwhile the production environment, which represents the real world and all the unexpected behaviors, is actually not covered.”
According to the company’s announcement, “Datadog Continuous Profiler closes this visibility gap with minimal resource-overhead that allows for always-on profiling. Having constant visibility into code performance allows developers to more effectively identify hidden performance bottlenecks.”
On the incident management side, Datadog’s new product reflects the view that, as much as the practice involves a technical response, it’s also very much a human one. “It’s not just a question of finding that line of code … but there’s also a lot of time spent assembling your team, deciding who needs to be on that team, what resources they need at their fingertips, and what data you want to give them to convince them of an incident,” Rabinovitch said. “So time to detection and resolution of an incident is just as much about getting your team coordinated as it is about those technical responses.”
The Incident Management product brings together a set of tools that let you launch an investigation and pull in all the people you need, create a timeline of all the actions your team has taken, and collect the relevant signals and share them with your teams on various collaboration platforms, Rabinovitch said.
To support the Incident Management workflow, the company announced that an Android and iOS application for interacting with Datadog monitors and dashboards on the go is now generally available. Also, a chatbot that integrates with Slack enables access to Datadog data, and improvements to Datadog Notebooks allow for real-time collaboration and feed directly into postmortems.
On the security side, Datadog is releasing its new Compliance Monitoring product into beta today. “Security has always been a priority, more so now than ever, as businesses move online, and devs and ops teams are moving faster,” Boutet said. The compliance tool, according to the company announcement, “tracks the state of all cloud-native resources, such as security groups, storage buckets, load balancers, and Kubernetes.”
Key features include security observability, which enables users to discover assets and their configurations and combine them with Datadog’s full telemetry; a compliance status snapshot; file integrity monitoring; continuous configuration assessment; and a simple WYSIWYG interface for creating custom security and governance policies.
A big part of the problem organizations are looking to overcome is that developers aren’t trained well in security, and security teams don’t have a solid understanding of the software development lifecycle.
“What used to be siloed security teams, and development teams and operations teams, in modern web-based applications, they’re all starting to come together,” said Agarwal. “Applications have become agile; you make changes to it every day. So they need to be in lockstep and sync to solve many of the problems.”