Incident management might finally be coming into its own, with the announcement today by startup Kintaba that it is releasing a feature to help organizations automate their decision-making processes during security or performance incidents, or outages.
Traditional incident management required organizations to cobble together a set of tools for responding to issues. Often, that would require a performance monitoring tool, spreadsheets, then perhaps text messages, online chat, and even telephonic beepers and pagers. Today, Kintaba is releasing Automations, its new feature that automates the response process to ensure “the right people will be included at the right time to help resolve the incident,” according to John Egan, CEO and co-founder of Kintaba.
“How do we make sure everyone comes together in a centralized location? How do we make sure we get the right individuals responding, and collaborating?” Egan explained in an interview with SD Times. “How do we put that response team together, that subscriber team together, record that timeline, make sure we go write the post-mortem, and once it’s done, distribute that post-mortem to everyone who’s involved and make sure a followup review happens, these very understood steps that need to be followed, in the case of what we call these ‘black swan’ events, these major outages and critical incidents that need to be followed inside of organizations. And that’s the incident management layer.”
Users of Kintaba Automations, according to the company announcement, can define a rules-based decision tree for engaging the right responders and subscribers with active incidents. An example the company cites is an incident classified as affecting someone’s personally identifiable information: Automations can be set up to add the legal team’s on-call person at the time of the incident, add the customer success manager for the affected individual as a responder, and add the CIO as a subscriber. Kintaba noted in the announcement that the automations can be triggered both at incident creation and as the incident progresses, so the response team evolves organically as more is learned about the incident and its mitigation.
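To make the idea of rules-based engagement concrete, here is a minimal sketch in Python of how such a rule might be evaluated, both at incident creation and again as the incident changes. It is not Kintaba's API or implementation: the names `Incident`, `Rule`, `apply_rules`, the `"pii"` tag and the role labels are all hypothetical, and treating the legal on-call person as a responder is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Incident:
    """Hypothetical incident record; not Kintaba's data model."""
    title: str
    tags: set[str] = field(default_factory=set)
    responders: set[str] = field(default_factory=set)
    subscribers: set[str] = field(default_factory=set)


@dataclass
class Rule:
    """A condition plus the people to engage when it matches."""
    condition: Callable[[Incident], bool]
    add_responders: tuple[str, ...] = ()
    add_subscribers: tuple[str, ...] = ()


def apply_rules(incident: Incident, rules: list[Rule]) -> Incident:
    """Evaluate every rule against the incident's current state.

    Sets make this safe to call repeatedly: re-running the rules after an
    update never adds the same person twice.
    """
    for rule in rules:
        if rule.condition(incident):
            incident.responders.update(rule.add_responders)
            incident.subscribers.update(rule.add_subscribers)
    return incident


# Assumed rule mirroring the PII example in the announcement.
PII_RULE = Rule(
    condition=lambda inc: "pii" in inc.tags,
    add_responders=("legal-on-call", "customer-success-manager"),
    add_subscribers=("cio",),
)

# At creation the incident is not yet known to involve PII, so nothing fires.
incident = Incident(title="Checkout errors")
apply_rules(incident, [PII_RULE])

# Later the incident is reclassified, and the same rules are evaluated again.
incident.tags.add("pii")
apply_rules(incident, [PII_RULE])
```

Running the same rule set on every change, rather than only once at creation, is one simple way to get a response team that grows as the incident is better understood.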
Egan and the leadership team spent time at Facebook, building out information tools around incident management, task management and collaboration. What they saw at Facebook was a commitment to involve every stakeholder in the incident to ensure the right people were answering to the right people as the incident was identified and work was done to resolve it.
“Facebook, Google, these sort of mature organizations practicing incident management, all follow kind of a set of themes that are important. And one of those themes is sort of a consistency and process that has been implemented by a tool,” Egan said. “So each of these organizations has a tool internally, that everyone knows they can go to, to get updates and information on the individual incident.
“So when something’s happening, let’s say major egress drop or site outage, you really want your sales guys, your PR guys, your customer success folks, to be able to come in and see what’s happening without having to go and cause communication thrash back to the management.”
Kintaba offers that communication platform, which allows response team interaction and records the timeline of events. That recording can be referred back to during a post-mortem, Egan explained.
To hear more of the conversation on incident types, how response is measured and what successful organizations do to tackle the problem, listen in on today’s “What the Dev” podcast with editor-in-chief David Rubinstein and John Egan.