After watching application teams, security teams and operations teams get the -Ops treatment, data engineering teams are now getting their own process ending in -Ops.
While DataOps is still in its very early days, data engineers are beginning to embrace its practices.
Gartner defines DataOps as “a collaborative data management practice, really focused on improving communication, integration, and automation of data flow between managers and consumers of data within an organization,” explained Nick Heudecker, an analyst at Gartner and lead author of Gartner’s Innovation Insight piece on DataOps.
DataOps is first and foremost a people-driven practice, rather than a technology-oriented one. “You cannot buy your way into DataOps,” Heudecker said.
Michele Goetz, a principal analyst at the research firm Forrester, explained that DataOps is like the DevOps version of anything to do with data engineering. “Anything that requires somebody with data expertise from a technical perspective falls into this DataOps category,” she said. “Why we say it’s like a facet of DevOps is because it operates under the same model as continuous development using agile methods, just like DevOps does.”
DataOps aims to eliminate some of the problems caused by miscommunication between developers and stakeholders. Often, when someone in an organization requests a new data set or report, there is a lack of communication between the requester and whoever will fulfill the request. For example, someone may make a request, an engineer will deliver what they believe is needed, and when the requester receives it, they are disappointed that it’s not what they asked for, Heudecker explained. This can lead to frustration and missed deadlines, he added.
By getting stakeholders involved throughout the process, some of those headaches may be avoided. “[CIOs] really want to figure out how do they get less friction in their companies around data, which everybody’s asking for today,” said Heudecker.
Another potential benefit of DataOps is improved data utilization. According to Heudecker, these are some of the questions that organizations may start to ask themselves:
- “Can I use the data that’s coming into my organization faster?
- Are things less brittle?
- Can things be more reliable?
- Can I react to changes in data schemas faster?
- Is there a better understanding of what data represents and what data means?
- Can I get faster time to market for the data assets I have?
- Can I govern things more adequately within my company because there’s a better understanding of what that data actually represents?”
According to Goetz, for companies that have been journeying down the path of “tightening the bolts” of what is needed from a data perspective and how that supports digital and other advanced analytics strategies, it is clear that they need an operating model that allows development around data to fit into their existing solution development track. This enables them to have data experts on the same team as the rest of the DevOps Scrum teams, she explained.
Organizations that are less mature in their data operations tend to still think in terms of executing on data from a data architecture perspective. In addition, a lot of those less mature companies do not handle data in-house, but will outsource it to systems integrators and will take a project-oriented waterfall approach, Goetz explained.
The companies that are already getting DataOps right are typically going to be the ones that already have a DevOps practice in place for their solution development, whether it’s on the application or automation side, Goetz explained. Those more advanced companies also tend to have a model for portfolio management and business architecture that aligns to continuous development. “They’re recognizing there is an opportunity to better fit into the way that you operate around development with those teams so that data doesn’t get left behind and isn’t building up technical debt,” she said.
According to Goetz, this doesn’t just apply to data systems; it encompasses data governance, which traditionally has been the “final bastion of anything anyone wanted to do with the data. It was always playing cleanup,” she said.
“It’s really fascinating to see how organizations act when the lightbulb goes off and they make the equivalency between DataOps and DevOps,” said Goetz. “It’s like all those barriers start to fall away because they typically have something that’s been in place that they’re able to now fit into instead of fight against.”
Having a DevOps structure in place can ensure DataOps success
According to Goetz, companies that have not at least gone through or adopted some Agile methodologies will have a hard time adopting DataOps.
Goetz explained that over the years, she has seen companies try to switch from waterfall to Agile. They tend to struggle and make mistakes along the way, at least at first. Unless a company has built some of those Agile competencies, it will likely struggle with DataOps as well. “So I think there’s definitely some foundations that make it easier to get started in one end of the company,” said Goetz.
DataOps is probably here to stay, though it will be a while before it is widely adopted
DataOps is still in its very early stages, so it’s hard to predict where it will go, or even whether it will reach wide adoption or fizzle out, Heudecker explained. However, even if DataOps isn’t here to stay, it will still have some positive lasting effects, he said. “If it gets companies thinking differently about how they collaborate around data, that’s a good thing,” he said. “Even if it is a short-term hype and then it kind of fizzles out after a while, companies internalize some of the principles or ideas around the topic, and that’s good.”
Goetz doesn’t see DataOps going away anytime soon. In fact, she said that it is actually accelerating in terms of interest and adoption. The level of interest will vary from company to company, but the groundswell is definitely there, she explained.
In fact, a 2018 survey from data company Nexla and research firm Pulse Q&A revealed that 73 percent of organizations were investing in DataOps last year.
Goetz doesn’t see DataOps going away because one of its catalysts is the recognition that organizations can no longer just build technical capabilities and install applications. In today’s world, organizations are building their own digital foundations, products, and digital businesses. According to Goetz, those things require a different way of developing and going to market.
“[Those companies] looked at where DevOps came from,” Goetz said. “It came from the product companies, particularly the technology product companies. And they have been successful. And you also see integrators redesigning their development practices around DevOps. So there’s just so much momentum behind it. And there’s better results coming out of these practices in general that I don’t see it going away.”
It may be too early to make any predictions around DataOps
Although it’s too early to see any obvious trends, Heudecker has seen a lot of interest in the topic. Right now the conversation is very vendor-led, he said, but organizations have shown plenty of interest, too. In particular, companies want to learn exactly what DataOps is and whether it will benefit them.
Going forward, it will probably be the organizations themselves, not vendors, who will define the best practices, Heudecker explained.
Organizations trying DataOps out are going to be “leading on what those best practices are and how you create a center of excellence around that,” said Goetz.
One trend that Goetz has already seen is that companies are approaching DataOps from the AI side of things. Algorithms have advanced, and many existing AI models have gotten quite good at classifying, categorizing, and doing other data preparation work. People have also gotten good at finding analytics functions and machine learning models to run on their data, and they don’t necessarily have to be data scientists to do so, because they don’t have to manipulate the model to optimize it. Things are a bit more premade, and vendor tooling is enabling the citizen data scientist. “You don’t always need to have data science skills to take advantage of a data science model or machine learning model,” Goetz explained.
Another trend she has seen is that the role of architects will likely change in DataOps structures. Architects have historically been ignored because developers don’t want someone telling them what to develop; they just want to sit down and make it. Often, architects are seen as something that will slow teams down and push them into more of a waterfall structure.
But according to Goetz, in stronger Agile practices, architecture actually plays a significant role because it helps define the vision and patterns.
The role of data governance
Many of the regulations that are popping up around governance, such as Europe’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), make handling information and governing it mandatory requirements for what you are going to develop, Goetz explained.
As a result of these new regulations, privacy and security from a governance perspective will no longer be handled only at the CISO level or in data governance teams. These regulations are driving a stronger working relationship between those stewardship teams and data engineering teams, she said.
“It is required to infuse governance capabilities into every aspect of data development or data design,” said Goetz. “That can’t be lost… there’s a symbiotic relationship that is developing, in DataOps specifically, where what you do from a data management and architecture perspective, what you do from a delivery perspective, and what you do for a governance perspective, those are no longer three different silos. It is one single organization, and if there’s only one benefit to going down the route of adopting DataOps, it is that you have a better operating model for data in general, regardless. You will build a better data lake. You will build better pipelines. You will build more secure environments. You will tune your data to business needs better, just by that symbiotic relationship. And I think that that’s the accelerator to not failing in your digital capabilities when data is at the core.”
The DataOps Manifesto
Though it is still in its early days, DataOps already has its own manifesto, similar to the Agile Manifesto.
The DataOps Manifesto places value in:
- “Individuals and interactions over processes and tools
- Working analytics over comprehensive documentation
- Customer collaboration over contract negotiation
- Experimentation, iteration, and feedback over extensive upfront design
- Cross-functional ownership of operations over siloed responsibilities”
Other principles of DataOps that it lists include continually satisfying customers, valuing working analytics, embracing change, having daily interactions, self-organizing, and more.