The Linux Foundation is addressing structural and security complexities in modern software supply chains with the release of ‘Vulnerabilities in the Core,’ a preliminary report and Census II of open-source software.

The report was put together by the Linux Foundation’s Core Infrastructure Initiative and the Laboratory for Innovation Science at Harvard (LISH). 

“The Census II report addresses some of the most important questions facing us as we try to understand the complexity and interdependence among open source software packages and components in the global supply chain,” said Jim Zemlin, executive director at the Linux Foundation. “The report begins to give us an inventory of the most important shared software and potential vulnerabilities and is the first step to understand more about these projects so that we can create tools and standards that result in trust and transparency in software.”

Based on their analysis, the foundation and the lab identified the following ten JavaScript packages as the most used free and open-source software (FOSS) packages.

  1. Async: A utility module that provides straightforward, powerful functions for working with asynchronous JavaScript. Although originally designed for use with Node.js and installable via npm install async, it can also be used directly in the browser.
  2. Inherits: Browser-friendly inheritance that is fully compatible with the standard Node.js inherits.
  3. Isarray: Array#isArray for older browsers and deprecated Node.js versions. 
  4. Kind-of: Get the native JavaScript type of a value.  
  5. Lodash: A modern JavaScript utility library delivering modularity, performance & extras. 
  6. Minimist: Parse argument options. This module is the guts of optimist’s argument parser without all the fanciful decoration. 
  7. Natives: Do stuff with Node.js’s native JavaScript modules. 
  8. Qs: A querystring parsing and stringifying library with some added security. 
  9. Readable-stream: Node.js core streams for userland. 
  10. String_decoder: Node-core string_decoder for userland. 

The report also details the most commonly used non-JavaScript packages, which include:

  1. com.fasterxml.jackson.core:jackson-core: A core part of Jackson that defines Streaming API as well as basic shared abstractions.
  2. com.fasterxml.jackson.core:jackson-databind: General data-binding package for Jackson (2.x): works on streaming API (core) implementation(s). 
  3. com.google.guava:guava: Google core libraries for Java.
  4. commons-codec: Apache Commons Codec software provides implementations of common encoders and decoders such as Base64, Hex, Phonetic and URLs.
  5. commons-io: Commons IO is a library of utilities to assist with developing IO functionality.
  6. httpcomponents-client: The Apache HttpComponents™ project is responsible for creating and maintaining a toolset of low level Java components focused on HTTP and associated protocols. 
  7. httpcomponents-core: The Apache HttpComponents™ project is responsible for creating and maintaining a toolset of low level Java components focused on HTTP and associated protocols.
  8. logback-core: The reliable, generic, fast and flexible logging framework for Java. 
  9. org.apache.commons:commons-lang3: A package of Java utility classes for the classes that are in java.lang’s hierarchy, or are considered to be so standard as to justify existence in java.lang.
  10. Slf4j:slf4j: Simple Logging Facade for Java.

Based on these packages, the researchers were able to identify some common problems. For instance, they found that the naming schemas for software components were unique, individualized and inconsistent across datasets. “The effort required to untangle and merge these datasets slowed progress on the current project significantly. Despite the considerable effort that went into creating the framework to produce these initial results for Census II, the challenge of applying it to other data sets with even more varied formats and naming standards still remains,” the report stated.
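To illustrate why inconsistent naming makes datasets hard to merge, here is a minimal Python sketch. It is not the method used in the Census II report. The function names `normalize_name` and `merge_counts`, the `npm:` and `maven:` prefixes, the `@version` suffix and the sample counts are all hypothetical, chosen only to show how the same package can appear under different spellings in different datasets.

```python
import re
from collections import defaultdict


def normalize_name(raw: str) -> str:
    """Reduce a raw component string to a lowercase, version-free key.

    The prefix and suffix formats handled here are assumptions made
    for illustration, not formats described in the report.
    """
    name = raw.strip().lower()
    # Drop a hypothetical ecosystem prefix such as "npm:" or "maven:".
    name = re.sub(r"^(npm|maven):", "", name)
    # Drop a trailing "@version" suffix, e.g. "lodash@4.17.15".
    name = re.sub(r"@[\w.\-]+$", "", name)
    # Treat "_" and "-" as the same separator.
    return name.replace("_", "-")


def merge_counts(*datasets: dict) -> dict:
    """Sum usage counts across datasets after normalizing each name."""
    totals = defaultdict(int)
    for dataset in datasets:
        for raw, count in dataset.items():
            totals[normalize_name(raw)] += count
    return dict(totals)


# Two made-up datasets that spell the same packages differently.
dataset_a = {"Lodash": 2, "string_decoder": 1}
dataset_b = {"npm:lodash@4.17.15": 3, "String-Decoder": 4}
merged = merge_counts(dataset_a, dataset_b)
```

Without the normalization step, the two datasets would yield four separate entries instead of two. Real-world data is messier than this, which is the difficulty the report describes.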

“Open source is an undeniable and critical part of today’s economy, providing the underpinnings for most of our global commerce. Hundreds of thousands of open source software packages are in production applications throughout the supply chain, so understanding what we need to be assessing for vulnerabilities is the first step for ensuring long-term security and sustainability of open source software,” said Zemlin.

Additionally, the security of individual developer accounts is becoming increasingly important. A majority of the top packages were found to be hosted under individual accounts, which can make them more vulnerable to attack.

Lastly, the researchers found that legacy software persists in the open-source space. According to them, this can lead to compatibility problems as well as financial and time-related costs.

“FOSS was long seen as the domain of hobbyists and tinkerers. However, it has now become an integral component of the modern economy and is a fundamental building block of everyday technologies like smart phones, cars, the Internet of Things, and numerous pieces of critical infrastructure,” said Frank Nagle, a professor at Harvard Business School and co-director of the Census II project. “Understanding which components are most widely used and most vulnerable will allow us to help ensure the continued health of the ecosystem and the digital economy.”

To determine the top packages and projects, the foundation worked with software composition analysis and application security companies such as Snyk and Synopsys.

“Considering the ubiquity of open source software and the essential role it plays in the technology powering our world, it is more important than ever that we take a collaborative approach to maintain the long term health of the most foundational open source components,” said Tim Mackey, principal security strategist for the Synopsys Cybersecurity Research Center. “Identifying the most pervasive FOSS components in commercial software ecosystems, combined with a clear understanding of both their security posture and the communities who maintain them, is a critical first step. Beyond that, commercial organizations can do their part by conducting internal reviews of their open source usage and actively engaging with the appropriate open source communities to ensure the security and longevity of the components they depend on.”