You wouldn’t code a website from scratch, so why would you code your corporate data estate from scratch? As it turns out, data engineers and business intelligence professionals can learn a great deal from the history of website development to help us with managing our corporate data estate.
In the early days of the Internet, websites were developed from scratch using coding languages like HTML and CSS. This was a complex process that often required a team of web designers and developers, a complex stack of tools, and months of development time.
There were actually a number of advantages to building a website this way:
- Design Precision: It allowed for complete control over the design of the website.
- Customization: It meant that the website could be customized to a very specific set of requirements.
- Hosting & Storage: It gave the website owner complete control over the hosting, storage, and migration of the website.
But on the flip side, there were several drawbacks to developing a website from scratch:
- Expensive: It was very time-consuming and expensive.
- Highly technical: It required a lot of technical knowledge and skills (which are hard to staff for, especially with the talent shortages organizations continue to struggle with).
- Error-prone: It was prone to errors and coding mistakes that could be very difficult to track down and fix.
The Shift to Low-code Website Development
As more people began using the Internet, better tools and resources became available. Today, the market is full of low-code Content Management Systems (CMS) and drag-and-drop website builders (WordPress, HubSpot, Shopify, Squarespace, etc.) that make it easy to create a professional-looking website without any coding knowledge.
While there are still a handful of very specific use cases where you would need to code a website from scratch, organizations realized that using a low-code CMS or drag-and-drop builder was a much better option in the vast majority of cases.
This shift has led to a dramatic decrease in the amount of time and effort required to build a website. In fact, you can now create an entire website in just a few hours using these low-code tools.
With every great shift comes some level of resistance. At first, web developers were skeptical of (or outright opposed to) low-code tools for the following reasons:
- Fear of Replacement: Developers saw these tools as a threat to their jobs.
- Power & Flexibility: Developers were unconvinced that they would be powerful, flexible, or customizable enough to produce the same quality of work.
- Trust & Reliability: Since the code was generated automatically, developers were unsure that it could be trusted to the same degree as the code they wrote themselves.
But over time, most developers came to realize that these fears were unfounded and that these tools were not a replacement for their skills but rather a complement.
The benefits became abundantly clear:
- Automated: Developers can automate all the repetitive, tedious tasks so they can focus on the more complex and rewarding aspects of website development.
- Combined experience: The algorithms that power the automatic code generation contain lifetimes of combined developer experience and best practices.
- Fewer errors: The code that is automatically generated by these tools is not only reliable, it is almost always less prone to error than code that is written manually by a human.
- Agile: By trusting the code that others have created, you can go from idea, to mock-up, to publishing in a fraction of the time.
- Adaptable & Future-proof: The web evolves at an incredible speed, and the code automatically adapts with it, so you don’t have to worry about constantly updating and maintaining the underlying infrastructure.
At the very least, a low-code website builder can help you get the first 80% of the website development work done quickly, so you can then spend the final 20% on customizing and tweaking things manually to meet your organization’s unique needs and specifications.
As you’ve likely identified already, the parallels between the worlds of website development and data management are abundant.
The Dark Ages of Data Management: Complex Tool Stacks/Manual Coding
Like website development, the data preparation process has traditionally relied on a highly complex stack of tools, a growing list of data sources and systems, and months spent hand-coding each piece together to form fragile data “pipelines.”
There are several problems with this approach:
- Manual coding & pipeline creation: New pipelines must be manually built for each data source, data store, and use case (for example, analytics reports) in the organization, which often results in the creation of a massive network of fragile pipelines.
- Stacks on stacks of tools: To make work even more complicated, there is often a separate stack of tools for managing each stage of the pipeline, which multiplies the number of tools in use and creates additional silos of knowledge and specialization.
- Vulnerable, rigid infrastructure: Building and maintaining these complex data infrastructures and pipelines is not only costly and time-consuming, it also introduces ongoing security vulnerabilities and governance issues, and makes it extremely difficult to adopt new technologies in the future.
- Fragile pipelines: Even worse, these data pipelines are hard to build, but very easy to break. More complexity means a higher chance that unexpected bugs and errors will disrupt processes, corrupt data, and fracture the entire pipeline.
- Manual documentation and debugging: Each time an error occurs, data engineers must take the time to go through the data lineage and track down the error. This is extremely difficult if the metadata documentation is incomplete or missing (which it often is).
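To make the debugging problem concrete, here is a minimal sketch of what tracing data lineage looks like when the metadata does exist. The dataset names and the `LINEAGE` mapping are hypothetical, invented purely for illustration; real lineage metadata is usually far larger and, as noted above, often incomplete.

```python
from collections import deque

# Hypothetical lineage metadata: each dataset maps to the datasets it is built from.
LINEAGE = {
    "sales_report": ["sales_mart"],
    "sales_mart": ["orders_clean", "customers_clean"],
    "orders_clean": ["erp_orders"],
    "customers_clean": ["crm_customers"],
    "erp_orders": [],
    "crm_customers": [],
}


def upstream(dataset, lineage):
    """Return every dataset that `dataset` depends on, nearest first (breadth-first)."""
    seen = []
    queue = deque(lineage.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen


# Every place an error in the report could have originated.
print(upstream("sales_report", LINEAGE))
```

Even in this toy example, one broken report has five upstream datasets to inspect. Without a mapping like this, an engineer has to rebuild that list by reading pipeline code by hand.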
It’s no wonder that 85% of these projects fail, according to a Nov. 10, 2017 TechRepublic story by Matt Asay that cites Gartner analyst Nick Heudecker.
We know how slow, painful, and expensive this approach is from years of first-hand experience as IT consultants. We struggled through all these same issues when helping our clients build their data infrastructures. We refer to this as the “dark ages” of data management due to its reliance on manual processes, brittle infrastructure, and high rates of failure.
Thankfully, there is a new way to build your data infrastructure that is significantly more efficient, resilient, and scalable.
The Rise of Low-Code Data Management
Low-code data management tools now enable data engineers to build data infrastructure and pipelines very quickly using a drag-and-drop interface. Consider the following point…
“By 2025, 70% of new applications developed by organizations will use low-code or no-code technologies, up from less than 25% in 2020,” according to a Nov. 10, 2021 press release issued by Gartner.
When low-code data management tools first started to become popular, there was a lot of resistance and skepticism among data and analytics professionals.
Their concerns were very similar to the concerns web developers had:
- Fear of Replacement: They felt that using low-code tools would make them less valuable and marketable (since their jobs would be easier to automate).
- Power & Flexibility: Data engineers felt that coding data pipelines from scratch gave them more power, control, and flexibility.
- Trust & Reliability: They were worried that low-code tools would be less reliable and wouldn’t be able to handle the scale or complexity of data engineering workloads.
However, as organizations have started using these tools, it’s become clear that these fears are unfounded, just as they were in the world of website development.
And all the same benefits exist:
- Automated: Data teams can now automate repetitive, tedious tasks, and focus on the more complex and rewarding aspects of their work, such as data modeling and data analysis.
- Combined experience: The algorithms that power the automatic code generation contain lifetimes of combined developer experience and best practices.
- Fewer errors: The code that is automatically generated by these tools is not only reliable, it is almost always less prone to error than code that is written manually by a human.
- Agile: By trusting the code that others have created, you can go from development, to testing, to production in a fraction of the time.
- Adaptable & Future-proof: Data storage technology evolves at an incredible speed, and the code automatically adapts with it, so you don’t have to worry about constantly updating and maintaining the underlying infrastructure.
At the very least, these low-code data management tools can help you get the first 80% of the data engineering work done quickly, so you can then spend the final 20% on customizing and tweaking things manually to meet your organization’s unique needs and specifications.
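As a rough illustration of what “automatic code generation” can mean in practice, the sketch below renders SQL from a small declarative spec. It is not how any particular product works: the `SPEC` structure, the table and column names, and the `generate_sql` function are all assumptions made up for this example.

```python
# Hypothetical declarative spec, the kind of metadata a drag-and-drop builder might capture.
SPEC = {
    "target": "dim_customer",
    "source": "crm_customers",
    "columns": [
        {"name": "customer_id", "type": "INTEGER", "expr": "id"},
        {"name": "full_name", "type": "VARCHAR(200)",
         "expr": "TRIM(first_name || ' ' || last_name)"},
        {"name": "country", "type": "VARCHAR(2)", "expr": "UPPER(country_code)"},
    ],
}


def generate_sql(spec):
    """Render CREATE TABLE and INSERT ... SELECT statements from the spec."""
    columns = spec["columns"]
    col_defs = ",\n  ".join(f"{c['name']} {c['type']}" for c in columns)
    names = ", ".join(c["name"] for c in columns)
    exprs = ",\n  ".join(f"{c['expr']} AS {c['name']}" for c in columns)
    create = f"CREATE TABLE {spec['target']} (\n  {col_defs}\n);"
    load = (
        f"INSERT INTO {spec['target']} ({names})\n"
        f"SELECT\n  {exprs}\nFROM {spec['source']};"
    )
    return create, load


create_stmt, load_stmt = generate_sql(SPEC)
print(create_stmt)
print(load_stmt)
```

The point of the sketch is the division of labor: the spec covers the repetitive 80%, and the `expr` fields are where a data engineer would still write custom logic by hand.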
Unfortunately, not all low-code data management tools are built the same.
Platform vs. Builder
The data management market is now full of “platforms” that promise to reduce complexity by combining all your data storage, ingestion, preparation, and analysis tools into a single, unified, end-to-end solution.
While this might sound ideal, these claims start to fall apart upon closer inspection:
- Stacks in disguise: Most “platforms” are actually just a stack of separate tools for building and managing each component of the data estate that have been bundled together for one price.
- Stitched together like Frankenstein’s monster: Yes, all the tools are sold by the same vendor, but they’re often collected through acquisitions, and it ends up just being a big, ugly mess of incompatible code that has been haphazardly stitched together into a “platform.”
- Low-code: Many of these platforms brag about being “low-code,” but when you dig into the details, there are usually only one or two features that actually have this functionality.
- Welcome to data management prison: Worst of all, you end up being locked into a proprietary ecosystem that won’t allow you to truly own, store, or control your own data. All tools and processes are pre-defined by the platform developer, and then hidden in a “black box” that you can’t access or modify. Many of these platforms even force you to migrate all your data to the cloud, and do not offer support for on-premises or hybrid approaches.
- Trying to escape might cost you everything: These platforms significantly limit your data management options, and if you decide to migrate to a different data platform later, you must rebuild nearly everything from the ground up.
These solutions are not truly “platforms,” they have very limited “low-code” functionality, and they don’t really “unify” anything. They’re just tool stacks with better branding and a lot more restrictions.
Data professionals are in desperate need of a faster, smarter, more flexible way to build and manage their data estates. In essence, they need the same solution web developers have relied on for years: a drag-and-drop builder of their own.
Fortunately, these unified, drag-and-drop solutions are now available. They use low-code tooling to build a data estate without hand-coding or complex tool stacks. Moving to this type of solution offers bottom-line benefits for data teams, from substantially lower build costs to lower maintenance costs. It also frees data teams from manual, repetitive tasks so they can shift to higher-impact projects.
Today, an organization’s website is a critical asset. You don’t want it going down for reasons shrouded in mystery or slogging along at a turtle’s pace due to poorly coded customizations. The same could be said about your corporate data estate. The good news is we have history to look back on to help us make the right decision moving forward.