Over a decade ago, Internet and tech entrepreneur Marc Andreessen penned a prescient article for The Wall Street Journal, Why Software Is Eating the World. His thesis, that software allows for vast cost reductions in businesses and enables the complete disruption of once-staid markets, has been proven many times over. But while Mr. Andreessen's observation focused on what the ubiquity of software meant for market winners and losers, there is a correlated impact on the very nature of software development. Specifically, the universality of software across businesses, and across the entire technology landscape, has resulted in a level of complexity never before seen by humans.

Software is, literally, everywhere. From the IoT microcontrollers of intelligent lightbulbs to vast, massively parallel supercomputers, virtually every aspect of how the world operates depends on software. When software-driven systems work, they make our lives easier, less expensive and, arguably, more fulfilling. But to realize these outcomes, software has become immense in scale, in both breadth and depth. It has been abstracted, componentized, distributed, and integrated through a myriad of patterns that scatter chunks of software and data stores in a vast, interdependent fashion. Higher-level languages, domain orientation, ubiquitous code reuse, abstraction techniques, and even no-code development platforms may hide much of this complexity from the developer's immediate view. However, the complexity isn't removed; it's just hidden. No matter how high-level a programming language, that code eventually results in processor-level instruction execution.

In addition to the underlying complexity of software, the scale of the global software ecosystem has resulted in a high degree of specialization among developers. Frontend, middleware, backend, web, embedded, mobile, database, OS, and security are just a few of the specialization areas in modern software engineering. Coupled with the proliferation of languages and platforms, the result is software systems that are unfathomably broad and complex, arguably too complex for humans to fully understand. Yet it is still humans who create software and, more critically, humans who must evolve it. How we go about building and extending these systems safely, securely, and efficiently is now a question of paramount concern.

Enter Software Intelligence Tools

Software intelligence (SI), as defined by Wikipedia, is "insight into the structural condition of software assets, produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in Information Technology environments." In more specific terms, it is the ability to examine software in detail, decompose its structure and requisite components, and store that information in a coherent fashion that allows further analysis of the relationships and structure among other sets of software and components. Ideally, this would traverse different languages, frameworks, abstractions, architectures, data models, and underlying infrastructure. The most important aspect of any such technology is the ability to comprehensively store and reference the relationships and dependencies between these elements of software.

Put simply, to properly analyze and understand potential impacts when building or modifying software, one must fully understand all the dimensions of its dependencies. However, as mentioned earlier, this is not reasonably possible in any manual, human-directed fashion. While large-scale efforts to manually map systems are common objectives of software modernization projects, the result is a time-limited, static understanding of an evolving system, usually of highly inconsistent fidelity and accuracy. Even with the use of domain-specific SI tools (e.g., application performance monitoring (APM) software), the results fail to peer deeply enough into how software is actually running. Outside of such projects, attempts to comprehensively catalog detailed "as-built" documentation are generally limited, with intelligence and analysis tools often siloed into areas such as static source code analysis, security profiling tools, and APM systems. The result is disconnected sets of relationship and dependency data, again of highly inconsistent fidelity.

These tools provide degrees of SI, but a second generation of comprehensive, unifying platforms, call it comprehensive software intelligence (CSI), is required to bridge the gaps between these systems and end ineffective manual discovery and documentation practices. To understand why comprehensive profiling, aggregation, and analysis are required, it helps to understand the scale of the problem. A typical enterprise application comprises millions of relationships, or, in graph terminology, nodes and edges.
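To make that node-and-edge framing concrete, the sketch below shows one way software elements and their dependencies might be recorded as a graph. This is a minimal, hypothetical illustration, not any product's actual data model; the element names and relationship kinds are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Element:
    """A node: any discrete piece of the system a CSI platform might track."""
    name: str   # e.g. a class, method, table, queue, or endpoint
    kind: str   # "method", "table", "endpoint", ...

@dataclass
class DependencyGraph:
    """Edges are directed: source depends on target."""
    edges: dict[Element, set[Element]] = field(default_factory=dict)

    def add_dependency(self, source: Element, target: Element) -> None:
        self.edges.setdefault(source, set()).add(target)

# A few invented nodes and edges; a real enterprise system would have millions.
checkout = Element("OrderService.checkout", "method")
orders_table = Element("db.orders", "table")
payment_api = Element("PaymentGateway:/charge", "endpoint")

graph = DependencyGraph()
graph.add_dependency(checkout, orders_table)  # method writes to a table
graph.add_dependency(checkout, payment_api)   # method calls an external endpoint
```

Even this toy version hints at the scale problem: every call, query, and integration adds edges, and the value lies in capturing and refreshing those edges automatically rather than by hand.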

While CSI platforms capture and organize information with a speed and accuracy beyond what is possible with manual or ad hoc methods, their power comes not merely from holding highly detailed information; it comes from the ability to rely on the CSI system to analyze that data, surface focused, actionable information, and let users quickly profile for impact. This includes potentially wrapping controls around sensitive elements (such as a particular class or method). Most importantly, CSI should do this across application, endpoint, and data-layer boundaries, and in a way that represents the "as deployed" state, not just potential relationships, as might be captured through methods like static source analysis. Finally, a CSI system should make analysis information accessible to software architects, developers, and other ecosystem participants.
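As a rough illustration of what "profiling for impact" over such a graph could look like, the sketch below uses the networkx library to find everything that transitively depends on a changed element. The node names and the idea of tagging edges with a boundary are assumptions made for this example, not a description of any specific CSI product.

```python
import networkx as nx

# Directed graph: an edge A -> B means "A depends on B".
g = nx.DiGraph()
g.add_edge("web.CheckoutController", "svc.OrderService.checkout", boundary="application")
g.add_edge("svc.OrderService.checkout", "db.orders.total_column", boundary="data-layer")
g.add_edge("batch.NightlyReport", "db.orders.total_column", boundary="data-layer")

# Impact of changing a data-layer element: every node with a path to it.
changed = "db.orders.total_column"
impacted = nx.ancestors(g, changed)
print(f"Changing {changed} potentially affects: {sorted(impacted)}")
# -> ['batch.NightlyReport', 'svc.OrderService.checkout', 'web.CheckoutController']
```

The same traversal works whether the edge crosses a method call, an API boundary, or a database column, which is precisely why cross-boundary coverage matters.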

Dogs and Cats, Living Together…

This article has, so far, described the current industry situation and illustrated the comprehensive nature of CSI. However, it's important to describe the undesirable future that awaits the world should CSI not be embraced. The first, already apparent, issue is the break/fix cycle. Bluntly put, breaking software, and the often-ugly unintended consequences of changing anything in a complex software system, has become the single largest impediment to innovation and change today.

In the past, abstraction models were implemented to simplify interaction between pieces of software, or to ease complexity when building new functionality. Increasingly, abstractions are being implemented for the sole purpose of fault or change isolation: better to wrap new code around old code than risk breaking unknown things further down in the stack. One need only look at the perpetual fear of patch and release upgrades, in everything from frameworks to operating systems, to understand the principal concern that software changes will break software. The impact of the break/fix cycle on innovation and productivity cannot be overstated.

The second issue is inextricably linked to the first: complexity itself is becoming the core risk factor in software architecture and engineering. Most architects don't understand the true scale of complexity within the systems they're responsible for building and maintaining. The typical block architecture diagram, which fancifully paints an organized picture, has led to a critical disconnect between the assumed and the actual "as built" state of many, if not most, software systems. Two significant outcomes follow from this situation: overbudget or failed modernization efforts, and a "head in the sand" attitude about the proliferation of complexity.

If the industry doesn't get serious about CSI and accept that modern software requires a systematic, automated approach to capturing and understanding complexity, software will eventually be unable to move forward. Software engineers will live in perpetual fear of change while, paradoxically, piling on more complexity to avoid touching existing things. In the end, Moore's law will have bought us the ability to create the unfixable. Software will begin eating itself, to build on Mr. Andreessen's prediction.

So, you’re telling me there’s a chance…

The alternative to the untenable situation described above is a world where changes don't result in unforeseen breakage. In this world, backed by comprehensive coverage from a more advanced CSI, developers can easily verify, across boundaries (applications, interfaces, projects, etc.), the impacts of various modifications and improvements. Architects will be able to catalog and analyze large-scale change using an accurate representation of software systems and components. Development leads can receive accurate, proactive warnings about potential breaking impacts on systems into which they'd otherwise have no visibility. DevOps pipelines can participate, through automated integration with CSI, in analyzing and reacting to possible negative impacts. And all of this can potentially integrate with and inform other domain-specific systems, such as APM tools.
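One way a DevOps pipeline might participate is with a gate that asks a CSI service for the blast radius of a change before a merge proceeds. The sketch below is purely hypothetical: the endpoint URL, request payload, and response shape are invented to show the pattern, not taken from any real CSI API.

```python
import sys
import requests  # assumes the 'requests' package is installed

# Hypothetical CSI impact endpoint; a real integration would use the
# vendor's actual API and authentication.
CSI_IMPACT_URL = "https://csi.example.internal/api/impact"

def gate_on_impact(changed_elements: list[str]) -> None:
    """Fail the build if a change is reported to reach beyond this application."""
    resp = requests.post(CSI_IMPACT_URL, json={"changed": changed_elements}, timeout=30)
    resp.raise_for_status()
    impacted = resp.json().get("impacted", [])

    external = [item for item in impacted if item.get("boundary") != "application"]
    if external:
        print("Potential breaking impact outside this application:")
        for item in external:
            print(f"  {item['element']} ({item['boundary']})")
        sys.exit(1)
    print("No cross-boundary impact detected by CSI.")

if __name__ == "__main__":
    # e.g. pass changed class/method names extracted from the diff
    gate_on_impact(sys.argv[1:])
```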

Over the coming years, good SI, much like business intelligence (BI), will prove to be one of the defining characteristics of successful companies. The scenario above, describing the consequences of continuing the status quo, is clearly not a viable end state. The software industry, and business in general, will embrace comprehensive Software Intelligence (CSI), because there is no path forward that doesn't include SI. The key question for businesses that depend on software, and this applies not only to those that write software for their own operational purposes but also to software suppliers, consultants, and outsourcers, is how quickly they can adopt SI 2.0 and begin to outpace their competitors.