With software configuration management (SCM), gone are the days when developers would email their code changes back and forth, waiting patiently for review. SCM has certainly changed the process for code review, feedback and collaboration, but it is the rise of distributed version control systems like Git that allows software teams to work faster than ever.
There is no shying away from it: Git appears to be the de facto standard for working with code changes, forking a repository, creating pull requests, collaborating, and deploying software. The only question teams should be asking is, what flavor of Git will work for them? Will it be GitHub, the developer-preferred version control system, or GitLab, an open-source Git repo with options for enterprise, or maybe Atlassian’s Bitbucket, a Git solution for massive repos? Or, teams might even consider Perforce, which offers proprietary version control software for storing large binaries.
All of these tools use Git as a back end, but for many companies today, Git is the “end-all” tool for SCM and version control. And as open-source software has continued to rise in recent years, developers need that open-source ecosystem so they can work in parallel and contribute to their projects in meaningful ways.
Git is at the center of this software evolution, and while SCM changes and the culture of open source continues to grow, software experts expect Git will remain in the spotlight.
Evolution of SCM
In the beginning, SCM was just a simple system. As developers needed to collaborate with others on different systems, Centralized Version Control Systems (CVCSs) were formed. These systems, like CVS, Subversion and Perforce, have a single server containing all of the versioned files, with a number of developers who check out files from that central location. For a while, this was the standard for version control.
According to Jeff King, infrastructure engineering manager at GitHub, prior to about 2005, SCM systems were very centralized. This meant that someone set up the server and they were the person in charge of bringing up the source code, maintaining the tool, and more. In order to use the tool, developers had to get permission from the owner of the centralized system. This created a big barrier, since the workflow of open-source systems is highly decentralized, he said.
Around 2005, there was an “explosion” of distributed version control systems, said King. Distributed version control systems are systems where there is no central authority for a project in terms of the technology. Everyone gets a copy of the history of an open-source project and everyone has access to the same tooling. Developers can submit a change one time and then leave, and they will still have the same access to the same tools as developers that frequently submit changes, according to King.
Plus, distributed version control systems let developers have several remote repositories to work with, which means they can collaborate globally with different groups of people within the same project.
Of these distributed version control projects, Git was one of the most commonly used, and it certainly has taken off in terms of developer adoption. According to Edward Thomson, senior program manager at Microsoft, Git provides every developer with an entire history and entire branching structure on their development machine, which enables powerful workflows based around simple branching and pull requests. And, it gives developers more insight into their software projects more quickly, he said.
While Git is widely adopted, the biggest fundamental limitation with it is its scaling issue. With Git, developers all get a copy of the history of a project, and everyone has access to run the system on their own. However, large repositories can be cumbersome since each developer would have to make a complete copy of that history and do all of the requests locally on their workstations, said King. For most projects, this isn’t a problem, but for those large projects with an entire code base in one single project, lots of developers working in that huge repository can create some scaling issues.
“What we see now is people trying to take a hybrid approach to centralized and decentralized systems,” said King. “It’s easier to modify a decentralized system to optimize some of these cases to use centralized resources.”
What King means is that companies are using a decentralized tool like Git, but under the hood there is a centralized server that addresses some of those scaling issues. To the developers, it looks like they have a complete copy of the code history, when really they have only touched a part of the history, with the tool hitting that central server and filling in the gaps “on the fly” when it needs to, he said.
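The fill-in-the-gaps behavior King describes can be illustrated with a toy model. The sketch below is not how Git or any vendor implements it. The class name LazyObjectStore and the dictionary standing in for the central server are hypothetical, chosen only to show a local store that fetches missing objects on demand.

```python
class LazyObjectStore:
    """Toy model: a local store that fetches missing objects from a central one."""

    def __init__(self, central):
        self.central = central   # stands in for the central server (hypothetical)
        self.local = {}          # objects already present on the workstation
        self.fetches = 0         # number of round trips to the central store

    def get(self, object_id):
        # Serve from the local copy when possible.
        if object_id not in self.local:
            # Otherwise fill the gap on demand from the central store.
            self.local[object_id] = self.central[object_id]
            self.fetches += 1
        return self.local[object_id]


central = {"a1": b"first version", "b2": b"second version", "c3": b"third version"}
store = LazyObjectStore(central)
store.get("b2")
store.get("b2")  # second read is served locally, no new fetch
```

In this model the developer's machine holds only the objects it has actually touched, while every other object remains reachable through the central store.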
Tom Tyler, a senior consultant at Perforce Software, agreed that Git’s difficulty handling larger repositories is a big limitation when it comes to software configuration management. He said Git was designed to perform well at a small scale, so the challenge for customers or users with large systems and a lot of interdependencies is that it can be difficult to manage all their dependencies among all the modules in the system.
Customers that have monolithic systems and highly modular systems also opt for dependency management using a centralized system, said Tyler. These systems are aware of all the components and dependencies, which Git can’t do as easily, he said.
“There are a few problems with Git for storing the large binaries, it just wasn’t designed for it,” said Tyler. “It’s a great developer tool and people love it and it does have a lot of great features.”
Today, Git isn’t just a tool for developers. In the past, the operations side of software development was very manual, and had a hands-off style of working. But in the era of DevOps, SCM best practices are evolving so SCM is even used by the operations teams and the QA/testing teams, according to Tyler. These teams understand what version control is and how it is used, where in the past they didn’t interact with the system directly, said Tyler. This is changing for many companies and customers today, he said.
Git-ing in control of SCM
The open-source version control system and tool, Git, is probably the most popular SCM and version control system in the industry right now, according to Rahul Chhabria, principal product manager for Bitbucket Cloud at Atlassian. It has changed the concept of having developers work with very large working copies, to having smaller pieces of a repository and being able to only work on the things that you want to work on, said Chhabria.
One of its benefits as a distributed version control system is that it lets developers work anywhere in the world, even if they don’t have access to the internet. Developers can make their changes, and when they get back online, they can push up those changes and Git will automatically track what’s changed elsewhere. Chhabria said this is especially important because software development today is all about shipping software fast.
As GitHub’s King noted, Git’s limitations include its difficulty handling large files. To combat this, some companies like Atlassian are contributing to a standard called Git LFS, which lets them extend Git. For instance, Atlassian has made it possible for large files to be stored in a remote location, so they will not weigh down a core repository, said Chhabria. When developers want to make a change, the calls are separated and are all on Git, he said.
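A rough sketch of the idea behind Git LFS: the repository keeps a small text pointer in place of the large file, and the file's contents live elsewhere. The pointer layout below follows the published Git LFS pointer format (version, oid, size) as understood here. The function name make_lfs_pointer is hypothetical, and this is an illustration, not Atlassian's implementation or a substitute for the git-lfs client.

```python
import hashlib


def make_lfs_pointer(content: bytes) -> str:
    """Build a Git LFS-style pointer for the given file content."""
    # The SHA-256 of the content identifies the object in remote storage.
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


large_file = b"\x00" * (5 * 1024 * 1024)  # 5 MiB stand-in for a binary asset
pointer = make_lfs_pointer(large_file)
print(pointer)
```

The pointer is three short lines regardless of how large the original file is, which is why the core repository stays small.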
Because of this, many companies today are creating their own highly scalable private Git repository, so that it’s their own repository on the Git standard, according to Christopher Condo, senior analyst with Forrester Research. Some companies don’t want to put their code on GitHub because then it means everyone has access to the code, or they can get access to the code. But by supporting the Git standard, said Condo, developers can take their code off of one repository and go to another Git standard repository of their choosing.
“[Enterprises] are opening up to competition but they realize they have to use this open standard,” said Condo. “The idea of being open-compatible and the ability for developers to say, ‘I don’t like your [repo], I’m going to develop on top of it and go somewhere else,’ it seems important. Developers don’t want to be locked into a particular vendor.”
With the rise of distributed software development teams, Git is well suited to the open-source way of working. Condo notices several companies adopting this open-source philosophy, and he sees it spreading from the open-source community into enterprises. Atlassian, Microsoft, Amazon, and Red Hat are just a few of the organizations that understand the open standard and are adopting the Git standard so customers and developers can avoid being locked into a particular vendor, according to Condo.