It’s not uncommon for a software application today to consist of 80% or more open-source components, which explains enterprises’ growing use of repository managers, solutions that help them govern what open-source components are being used by their developers.

“Modern IT systems are built with a wide range of components and code elements from all over the place,” said Mark Driver, VP at Gartner Research. “It’s very hard to manage the pedigree of these components and manage their risks without tracking and controlling them.”

These days, “the general composition of an application is 80 to 90% third-party, open-source components, and there is a small portion of the application that you write yourself,” said Jason van Zyl, CTO and founder of Sonatype, a provider of the Nexus repository manager. “Open-source developers take their source code, they build, they turn [it] into components and put them in a repository, and then [the components] get consumed inside the organization.”

In the open-source ecosystem, van Zyl said developers place these third-party components into repositories (such as Sonatype’s central repository) that sit outside their organization’s firewall. Nexus, and repository managers like it, sit inside an organization’s firewall and act as a gateway between an organization and these open-source repositories. The issue “really falls into compliance and governance, which is what most organizations want to do with repository managers,” he said. “They really want to govern their use of open-source components inside their organization.”

When a developer inside an organization wants a particular open-source component, van Zyl explained, the request typically goes through a repository manager such as Nexus. Nexus then fetches the component from (in this case) Sonatype’s central repository and brings it back, caching it inside the organization. “So if the next developer makes a request for that component, they have a copy of it sitting inside their organization,” he said.
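
The fetch-and-cache flow van Zyl describes can be sketched in a few lines of Python. This is only an illustration of the general caching-proxy idea, not how Nexus is implemented: the `CachingProxy` class, the `fake_central` function and the `org.example:widget:1.0` coordinate are all hypothetical names invented for the example.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy stand-in for a repository manager inside the firewall (hypothetical)."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        # fetch_upstream represents a download from an external repository.
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[str, bytes] = {}
        self.upstream_requests = 0

    def get(self, coordinate: str) -> bytes:
        # Go outside the firewall only when no local copy exists yet.
        if coordinate not in self._cache:
            self.upstream_requests += 1
            self._cache[coordinate] = self._fetch_upstream(coordinate)
        return self._cache[coordinate]


def fake_central(coordinate: str) -> bytes:
    # Pretend download; a real repository would return the component's bytes.
    return f"bytes of {coordinate}".encode()


proxy = CachingProxy(fake_central)
first = proxy.get("org.example:widget:1.0")   # first developer: fetched upstream
second = proxy.get("org.example:widget:1.0")  # next developer: served from cache
```

In this sketch the second request never leaves the organization, which is the behavior the quote above describes.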

Organizations may not necessarily want their developers to grab everything that’s available in repositories, van Zyl said. Because of this, repository managers often also provide a level of security around the access to open-source components found outside of an organization. “Basically, they provide a private staging area that acts as a control point between the Wild West nature of the Internet and the controlled nature of development inside the firewall,” Driver said.
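
The “control point” Driver mentions can be pictured as a check that runs before any component is fetched. The sketch below assumes a simple approved list; real repository managers apply their own, richer policies, and every name here (`make_gate`, `ComponentBlocked`, the sample coordinates) is hypothetical.

```python
from typing import Callable, Set


class ComponentBlocked(Exception):
    """Raised when a requested component is not on the approved list."""


def make_gate(approved: Set[str],
              fetch: Callable[[str], bytes]) -> Callable[[str], bytes]:
    # Wrap a fetch function so that only approved components get through.
    def gated_fetch(coordinate: str) -> bytes:
        if coordinate not in approved:
            raise ComponentBlocked(coordinate)
        return fetch(coordinate)

    return gated_fetch


def fake_fetch(coordinate: str) -> bytes:
    # Stand-in for a download from an outside repository.
    return coordinate.encode()


gate = make_gate({"org.example:widget:1.0"}, fake_fetch)
allowed = gate("org.example:widget:1.0")

try:
    gate("org.example:unvetted:0.1")
    blocked = False
except ComponentBlocked:
    blocked = True
```

The point of the design is that developers call the gate rather than the outside repository directly, so the organization decides in one place what may come in.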

Nexus is agnostic to the component type, van Zyl said, but Java developers often use Sonatype’s repository manager because Java is where the company has primarily focused. “We deal with the Java ecosystem where the binary components are; people typically call them jars,” he said. “But we’re rapidly expanding out into the .NET and the Ruby communities. Nexus can deal with Ruby Gems, for example, or .NET components.”

Van Zyl said Sonatype’s customers have started asking it to deal with other component types because its solution focuses on the integrity of a supply chain, tracking the whole path through which an application is developed and put into production. It gives developers the ability to replace a problematic component with a higher-quality one, and to get their application back into production quickly. “We keep track of the popularity of components and security vulnerabilities associated with components and licensing,” he said. “We track it through the development of the application and help fix problems while the developer is working on the application. We also keep track of the information when the product is done and is being deployed into production.”

Much like a car that is recalled after release due to a defect, if an application is deployed and is later found to have a problem, repository managers such as Nexus can help organizations trace where problems originated. “You need to know where that component came from, you need to know how you can replace it, and you need to provide the ability for a developer to go back and actually fix the problem and remove it from the application,” van Zyl said.

Now that applications are made of such a large percentage of open-source components, van Zyl said there’s a massive need to manage those components through the use of repository managers. It’s “a space that we have been in for some time,” he said. “The market, overall, has woken up to the need to do the repository-management side, and also expand that across your entire life cycle to reduce risk and improve development efficiency.”

Repository managers help with continuous delivery
Repository-management solutions are becoming a core element of DevOps infrastructure, according to Driver. These solutions are “helping with standardization, efficiency and reliability,” he said. “I think it’s mostly an issue of feeding continuous integration tools. These are two complementary trends, but they are defined by one another.”

Some repository-management solutions can help enterprises keep track of what their Git developers (or those who are using Git) are doing. “If they are used to feed Git or used as a Git mirror, then [the solutions can help],” Driver said. “But Git developers can use them independently, of course.”

“There is this whole notion of continuous delivery where people develop their source code, build and test, produce their product, and ship it to customers,” said Christopher Seiwald, CEO and founder of Perforce, a provider of the Git Fusion repository manager.

“Repository management…centers around Git quite a bit…because Git is a tool that works very well on the individual and small group scale,” he said. “It provides the basics of version control but not the whole bigger idea of configuration management, of keeping track of how perhaps a million source code files are going to fit together or, for that matter, what products they build, and what goes out the door.”

Seiwald said that Git Fusion keeps track of everything that a developer would put in a Git repository as well as the repositories themselves. With Git, repository management is an issue because organizations often do not know how many of their developers are using it, he said.

“When we first started to shape Git Fusion, we talked to a number of release managers, systems administrators and IT directors,” Seiwald said. “We asked them how many Git developers they had and how many of their developers were using Git. An uncomfortable portion of them said they didn’t know.”

Seiwald said it was clear that they were unhappy about not knowing what was going on, because it’s their job to know about all the projects, all the different components, and who is using what. “Git Fusion, in part, solves that issue so that they can have a total view of the system,” he said. “They can know what everyone’s working on and manage their releases more efficiently.”

Seiwald said Git Fusion was created because the company saw the trend of more and more developers using Git to get their components. “It’s really part of our responsibility to make sure that [project managers are] able to have the same version-management capabilities across those repositories as they do across everything else,” he said.

“As we see the market moving toward things like continuous development and continuous integration—and now continuous delivery—we need to stay ahead of that as well. So not only are you managing all of your code and including code that’s in Git repositories, but you’re also managing other assets. So people like project managers are able to version their project plans in the same version platform.”

Seiwald said Git Fusion has integrations with Adobe Photoshop, Eclipse and Microsoft Visual Studio. “We want to make it so that, where people are doing their work, they can then access Perforce’s repository.”

Binary repository management
Binary repository management is needed today because most of the source code that developers write ends up as binary code, according to Fred Simon, cofounder and chief architect of JFrog, which creates the Artifactory binary repository manager. These “executables or libraries or modules are prebuilt and redistributed either to the end user or to other developers,” he said. “So there is always this type of transforming the source code into binary code.”

Today, there are millions of those packages of binary code, far too many to manage by hand. This is where the open-source version of Artifactory can help developers manage that binary code. “We provide users with a page where we can manage and control all the open-source bits they have downloaded from the Internet and all the ones that they make themselves,” Simon said.

JFrog recently announced that Artifactory Pro, its enterprise version, now integrates with Black Duck’s open-source licensing repository, giving enterprises continuous control over open-source licensing. Enterprises “want to control the open-source licensing, so this is [the reason for] the Black Duck integration,” Simon said.

“They want to make sure that all the open-source binaries that are coming inside their organization are validated by Legal to be usable. The integration with Black Duck is really a win-win situation for both companies because we now provide continuous integration flow of the [open-source licensing] verification process.”

JFrog has also recently introduced Bintray, which the company described as a social SaaS platform for developers to distribute binaries. It is a separate product from Artifactory but can integrate with it. Developers can use Bintray to store, manage and control the flow of their binary artifacts, said Simon. “They can use Bintray to distribute the binary bits that they created,” he said.

Simon described GitHub as a social platform because developers are interacting and sharing code with other developers. “What you are interested in is, ‘What is the latest piece of code that you wrote or are modifying and communicating with the internal source code?’ And we know that developers need the same kind of level of interaction and integration for their binaries. GitHub is very good at managing sources and the social graph of the sources, and we do the same for the binaries,” he said.

According to Simon, JFrog is the only company that is providing this type of “GitHub for binary” solution. Developers need this, he said, because GitHub doesn’t manage binaries anymore.

“[GitHub] actually used to have a small feature that people were using [in which] you could put binaries and share binaries with your friends,” he said. “They removed this feature because they don’t like to actually manage binaries in GitHub. So, for us, it’s really good. We have great technical integration with GitHub. And so now every developer in GitHub who wants to actually distribute their binary can use Bintray.”