It has been 15 years since the distributed version control system Git was released. Linus Torvalds, known for developing the Linux kernel, first released Git on April 7, 2005.
Today, it is “arguably the world’s most powerful distributed control system,” according to GitLab.
“In the 21st century, software excellence is the new operational excellence, making it crucial for companies to find ways to rapidly innovate. Git allows developers to move more rapidly and deliver value to their customers faster,” said Sid Sijbrandij, CEO of GitLab.
History of Git
According to developers Scott Chacon and Ben Straub, co-authors of the book “Pro Git,” the Linux kernel project used the distributed VCS BitKeeper to handle maintenance and keep track of changes. In 2005, BitKeeper’s free-of-charge status was revoked, so the Linux community decided to develop its own tool based on lessons learned from BitKeeper. The goals of the new tool were to be fast, to be simple, to provide strong support for non-linear development, to be fully distributed and to handle large projects. Thus, Git was born.
“Since its birth in 2005, Git has evolved and matured to be easy to use and yet retain these initial qualities. It’s amazingly fast, it’s very efficient with large projects, and it has an incredible branching system for non-linear development,” Chacon and Straub wrote in “Pro Git.”
Version control systems are used to record changes over time so users can go back and look at specific versions.
“A Version Control System (VCS) is a very wise thing to use. It allows you to revert selected files back to a previous state, revert the entire project back to a previous state, compare changes over time, see who last modified something that might be causing a problem, who introduced an issue and when, and more. Using a VCS also generally means that if you screw things up or lose files, you can easily recover,” Chacon and Straub wrote.
Types of version control systems
There are different types of version control systems available today. The simplest approach is to copy files into another directory. According to Chacon and Straub, this is error prone because it is easy to forget which directory you are in and overwrite the wrong files. A local version control system addresses this with a simple database that keeps track of changes to files.
A centralized version control system provides a single server that holds the versioned files and lets multiple people collaborate. While this is better than local version control systems, the single server is also a single point of failure. For instance, if the server goes down, no one can collaborate or save versioned changes to what they are working on.
A distributed version control system, like Git, gives every user a full copy of the repository, including its history, so if a server dies, any of those copies can be used to restore it.
Additionally, traditional SCM tools and VCSs either did not allow branching and merging or limited them. With everyone working in the main line, builds broke and parallel development was not possible.
“Unlike many other VCSs, Git encourages workflows that branch and merge often, even multiple times in a day,” Chacon and Straub wrote in their book.
However, because Git is distributed, everyone ends up with a copy of the repository on their own laptop and can do whatever they want with it. Additionally, there are no authentication or verification measures in native Git, which can cause security issues, according to a December blog post by Perforce. To add an extra layer of security, users can turn to hosting tools that have safeguards such as user authentication and encryption features.
“Git has had a huge impact on open-source software development because its decentralized nature puts the same tools in everyone’s hands. This gave developers flexibility and transparency they previously did not have, making it easy to document and collaborate on software development of all kinds,” said Jeff King, distinguished software engineer at GitHub.
According to Chacon and Straub, what sets Git apart from other version control systems is the way it handles data. While other systems like Subversion, CVS, Perforce and Bazaar store information as a list of file-based changes, an approach known as delta-based version control, Git handles data as a series of snapshots.
“With Git, every time you commit, or save the state of your project, Git basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot. To be efficient, if files have not changed, Git doesn’t store the file again, just a link to the previous identical file it has already stored,” the authors wrote.
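The snapshot idea the authors describe can be sketched in a few lines of Python. This is a toy illustration and not Git's actual storage format: the `SnapshotStore` class and its `commit` and `checkout` methods are hypothetical names invented for this example, and the hashing is simplified (real Git also hashes a small header along with the file contents). It only shows that a file whose content has not changed is stored once and referenced again by later snapshots.

```python
import hashlib


class SnapshotStore:
    """Toy content-addressed store (hypothetical; not Git's real format)."""

    def __init__(self):
        self.blobs = {}      # content hash -> file bytes, stored once
        self.snapshots = []  # each snapshot maps path -> content hash

    def commit(self, files):
        """Record a snapshot of `files` (path -> bytes); return its index."""
        snapshot = {}
        for path, data in files.items():
            digest = hashlib.sha1(data).hexdigest()
            # Keep the bytes only if this exact content was never seen;
            # otherwise the snapshot just links to the existing copy.
            self.blobs.setdefault(digest, data)
            snapshot[path] = digest
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def checkout(self, index):
        """Rebuild the full set of files recorded in snapshot `index`."""
        return {p: self.blobs[d] for p, d in self.snapshots[index].items()}


store = SnapshotStore()
first = store.commit({"README": b"hello\n", "main.py": b"print(1)\n"})
second = store.commit({"README": b"hello\n", "main.py": b"print(2)\n"})

# Four file entries across two snapshots, but only three stored contents:
# the unchanged README is shared by both snapshots.
print(len(store.blobs))
```

Each snapshot is complete on its own, so restoring an old version is a lookup, with no chain of deltas to replay.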
Git today
The latest version, Git 2.26, was released last month with protocol version 2 as the default. Other changes included new config options, updates to git sparse-checkout, and performance improvements. Additionally, the partial clone feature provides a performance optimization that enables Git to function without having a complete copy of the repository.
There are also a number of tools and companies built around the distributed version control system. GitLab is a DevOps tool provider with a Git-repository manager. GitHub provides hosting for collaborative version control using Git. More recently, Microsoft announced Scalar, a .NET Core application designed to maximize Git command performance with recommended config values and background maintenance.
“I think the best features are yet to come,” Junio Hamano, Git maintainer, said in an interview with GitHub’s King. “But what I’m more proud of, as the maintainer, is how our development community came to be, full of great developers (I won’t name names) from different backgrounds, working for different employers, having different agendas, but still work together to make progress. I’m also proud about how I, and other longer-time contributors, trained ourselves and others to describe the changes we’re making.”