As the use of open-source software (OSS) continues its year-over-year growth, the biggest area for innovation and open-source adoption is now AI.
But OSS is growing in every area, and companies rely on it for a wide range of business-critical applications, including data and database management, containers and container orchestration, and DevOps and SDLC tooling.
According to the 2023 State of Open Source Report from OpenLogic by Perforce Software, 80% of organizations increased their use of open-source software over the last 12 months.
“The big piece here is, the number one reason to use open source is for access to innovation,” said Javier Perez, the chief OSS evangelist and senior director of product management at Perforce and one of the lead authors of the report.
And AI is the new ‘king of the hill’ of OSS, according to the report. “The AI overtaking container technology was probably what stuck out the most when looking at the data,” Perez added.
Explosion of data an AI driver
The need to juggle and draw insights from rapidly increasing quantities of data has been a driving factor for AI.
“Demand for services powered by AI/ML/DL technologies is exploding,” said Stefano Maffulli, executive director of the Open Source Initiative. “The vast amounts of data these applications ingest give rise to serious implications when it comes to licensing and privacy in this ‘growth at all costs’ era. The [OSI] is researching the AI/ML/DL space to help enterprises and individuals get clear definitions of their rights and obligations when it comes to data and AI systems.”
Start a conversation or open your laptop and it doesn’t take long for OpenAI’s GPT-3 model to come up. ChatGPT, DALL-E 2, and other models are making this a big year for AI adoption. Users have to pay for regular use of these models, and the core GPT-3 model itself is not open source: OpenAI has not released its weights and offers it through an API. There’s even talk of a GPT-4 on the horizon.
Despite all of the new players in the open-source AI field, Google’s TensorFlow, which offers a flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML, is still the most used project.
While this project has been around since 2015, it received some major updates last year, including enhancements to DTensor, the completion of the Keras Optimizer migration, the introduction of an experimental StructuredTensor, a new warm-start embedding utility for Keras, and much more.
PostgreSQL tops in OS databases
At the same time that AI is generating and analyzing vast amounts of data, the data is now more commonly going to open-source database technologies.
“We’re talking about very large volumes of data that has to go somewhere. So it will go to Apache Kafka or Apache Spark or Cassandra, some of those technologies that are becoming more and more popular,” Perforce’s Perez said.
The three major players in the open-source data technologies field – PostgreSQL, MySQL, and MongoDB – have secured the top three spots over the last several years.
According to OpenLogic’s OSS report, MySQL and PostgreSQL swapped places by a few percentage points, and PostgreSQL is now the most used data technology. PostgreSQL has seen the most growth, passing MongoDB last year and edging out MySQL this year to secure the top spot.
Marc Linster, CTO at EnterpriseDB – whose product is based on PostgreSQL – said that the database isn’t even used to its full potential today.
“It’s easy to use Postgres in 99% availability of SLAs. And we help customers get to five-nines of SLA. And a lot of people don’t understand or don’t realize that you can do that reliably with Postgres today,” Linster said.
“This is the thing that happened with Linux a good while ago. Linux started at the print server. Then from the print server to the file server, then to the department server, and today it runs everything,” Linster said. “Well, the same thing is happening with Postgres.”
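To put the gap between a 99% SLA and five nines in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not anything from EnterpriseDB or the report: the function name `allowed_downtime_minutes` is hypothetical, and the calculation assumes a 365-day year with availability measured as a simple percentage of total time.

```python
# Illustrative arithmetic only: converts an availability percentage
# into the downtime it permits over a 365-day year (an assumption;
# real SLAs often measure per month and exclude planned maintenance).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted at a given availability."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for label, pct in [("99%", 99.0), ("99.999% (five nines)", 99.999)]:
        minutes = allowed_downtime_minutes(pct)
        print(f"{label}: about {minutes:,.1f} minutes ({minutes / 60:,.1f} hours) per year")
```

Under those assumptions, 99% availability allows roughly 87.6 hours of downtime a year, while five nines allows a little over five minutes.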
Kubernetes, containers see growth
The other areas of open source that have seen considerable growth are Kubernetes and container technologies. The CNCF found that within its community, Kubernetes continues to mature and has the largest contributor base of any project.
Kubernetes 1.26 was released at the end of 2022 with many storage improvements, including CSI migration for Azure File and vSphere graduating to stable. Users also gained an extension to the metrics framework and Component Health Service Level Indicators, both in alpha.
The maturing technology also earned a podium finish by the end of the year. Kubernetes usage increased by 5% in the past year, and with about 23% of the votes, it has become the third most-used cloud-native technology, according to OpenLogic’s OSS report. Aside from OpenStack, whose usage decreased by 10% compared with last year, every other cloud-native technology saw an increase in the last 12 months.
The report also found that the usage of containers and container technology has grown significantly — from 18% to 33%. This trend is uniform across organizations, regardless of their size.
“As Kubernetes matures, many organizations turn to service mesh technology and those projects in CNCF like Envoy, Cilium, and Istio continue to cultivate large contributor communities to meet the demand,” said the CNCF’s Chris Aniszczyk.
Backstage moves front and center with the help of CNCF
One important project that has quickly moved up the ranks in the CNCF is Backstage, which enables developers to bring together their organization’s tooling, services, apps, data, and documentation into a single UI.
“Backstage a year ago barely made this list and continues to grow, solving an important pain point around cloud-native developer experience,” Chris Aniszczyk wrote in a blog post that identified the most important open-source projects in the CNCF and Linux ecosystems last year.
The Software Catalog, which is the core feature of the project, makes it simple to create service blueprints that can be shared between teams. It also enables teams to keep track of the ownership and metadata of all services within the engineering organization.
The project was originally created at Spotify in 2016 and was used as the company’s mission-critical tool for containing software chaos, empowering engineers to work faster and more efficiently. It entered the CNCF Sandbox in September 2020 and was eventually voted in as a CNCF incubating project last March.
“Software stacks are growing larger and more complex by the day – Backstage was built to address issues like SaaS sprawl and cloud-everything which can make the developer experience complex,” said Erin Boyd, CNCF TOC member and project sponsor, in a blog post.
Backstage has seen great progress since joining the CNCF, with growth in core components, features, plugins, adopters, contributors, and community engagement. This has resulted in updates, refinements, documentation, deprecations, and stabilizations to the Software Catalog, Software Templates, TechDocs, and API Reference.
Now, the project is used by hundreds of publicly listed adopters, including American Airlines, Expedia Group, HelloFresh, Netflix, Peloton, Roku, Splunk, Wayfair, and Zalando.
Among the other macro-trends observed by the CNCF’s Aniszczyk, the contributor base of OpenTelemetry is expanding, making it the second-fastest-growing project in the CNCF environment. He also stated that GitOps remains vital to the cloud-native environment, with projects such as Argo and Flux continuing to attract numerous followers and both recently reaching graduated status in the CNCF.
OSS challenges persist
While OSS use is expanding at most organizations, some challenges persist.
“Clearly, more technical support is needed for open-source technologies, as personnel experience and proficiency is highly ranked again this year as a support concern across organizations regardless of size,” Perez said. “In-house support of OSS requires expert-level knowledge of not just one technology, but multiple technologies that form software stacks.”
Rod Cope, CTO at Perforce Software, added that open-source communities are not bound by any SLAs, which means an organization with skill shortages could wait days or even weeks for technical support.
Security ranks as the number one challenge with open source, but that is always going to be the case, Perez predicted. “It’s just human nature and no matter what you do they’re going to say that’s the most important challenge,” he said.
Another challenge is that, as with most technologies, not every open-source system is created equal, and not every system is as open as it claims to be. When using a “captive open source” project, an organization runs the risk of being locked into a system.
Captive open-source projects are those created by a company that still keeps a tight grip on the project’s fate, Linster explained. When the company open-sourced the project, it made the source code accessible to users, but the licenses can still be very restrictive.
“It sounds that the code is readable, but the limitations on how the code can be used are significant. And those are also not recognized OSI licenses, so they’re not really open-source licenses [but] source available,” Linster said.
The company behind such a project can change the license and decide which features go in. It alone decides how much those features cost and what the new license for them is, Linster added.
Luckily, most areas of open source have plenty of alternatives to choose from by now.
“There are a number of companies commercializing open-source databases so if you use one, then you pay for what is called open core, so there are proprietary additional features and you might get locked in,” Perez said. “But, at the same time, you can see in the OSS report that there are another 20 open-source data technologies out there. It’s no longer, ‘I need a database and Oracle is the enterprise database.’ Now there are so many options.”
Top Open-Source Projects at CNCF and Linux Foundation
According to the Cloud Native Computing Foundation (CNCF), here are the top 10 projects at the CNCF and Linux Foundation last year based on the number of commits, authors, and comments/pull requests.
CNCF
- Kubernetes
- OpenTelemetry
- Argo
- Backstage
- gRPC
- Prometheus
- Envoy
- Cilium
- Istio
- Dapr
Linux Foundation
- Linux
- Kubernetes
- OpenTelemetry
- Argo
- Hyperledger
- Zephyr
- Node.js
- Backstage
- Jenkins
- gRPC