Version Control Systems (VCS) – also known as Source Code Management (SCM) or Revision Control Systems (RCS) – are an increasingly critical facet of businesses, and not only those that write software for a living. According to a Future Market Insights report, the VCS market was projected to reach almost $630m in total demand in 2021 and to grow at almost 11% yearly until 2031. The rising demand is being driven by a combination of improvements in key technological components, such as automation, and a widening scope of integrations into other applications, such as Cloud infrastructure.
One aspect of VCS that has a long history of providing value is the cost savings and efficiencies that effective management brings, especially to large software teams – features such as change tracking over time, analysis of merge operations, and integration rollback options. In addition to raising confidence in the software tracked by the VCS, these critical capabilities have been key enablers in the industry-wide adoption of disparate and independent teams jointly contributing to large projects – a key aspect of DevOps culture.
Version Control Systems
Version Control Systems generally break down into two categories, although some solutions split the difference to support blended models of both:
- Centralized – a server is the repository and tracks all changes and versions, produced by clients, over time
- Distributed – a repository still tracks all changes and versions, but each developer keeps a local copy of the record, often with an additional remote copy held as a source of truth
Distributed Version Control Systems (DVCS) have risen in popularity, largely in tandem with the growth of DevOps culture and its emphasis on developer independence. The ability to work offline, or apart from the team for periods of time, is something a DVCS supports directly and can be a make-or-break capability in a DevOps organization.
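To make the distributed model more concrete, here is a minimal sketch in Python. It is a toy model, not how any real VCS is implemented: the `Repository` class, its methods, and the commit messages are all hypothetical names chosen for illustration. It shows a developer cloning the full history, committing offline, and synchronizing with the shared copy afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """A toy repository: just an ordered list of commit messages."""

    history: list[str] = field(default_factory=list)

    def commit(self, message: str) -> None:
        self.history.append(message)

    def clone(self) -> "Repository":
        # A distributed clone copies the complete history, not only the
        # latest version of the files.
        return Repository(history=list(self.history))

    def push_to(self, remote: "Repository") -> None:
        # Simplification (assumption): only fast-forward pushes are allowed,
        # so the remote history must be a prefix of the local history.
        if remote.history != self.history[: len(remote.history)]:
            raise RuntimeError("remote has diverged; pull and merge first")
        remote.history = list(self.history)


# The shared "source of truth" copy.
origin = Repository()
origin.commit("initial import")

# Distributed model: the developer works on a full local copy ...
laptop = origin.clone()
laptop.commit("add parser")      # no network needed
laptop.commit("fix parser bug")  # still offline

# ... and synchronizes with the shared copy later.
laptop.push_to(origin)
print(origin.history)
```

In a centralized model, by contrast, each `commit` would go straight to the server, so the two offline commits above would not be possible without a connection.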
Let’s review some of the key VCS options available:
Concurrent Versions System (CVS) Version Control
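CVS was one of the earliest version control systems, first published in 1986 – predated by tools such as SCCS (1972) and RCS (1982) – and uses a client / server model for developer interaction. Using its unreserved checkout strategy, CVS lets multiple developers change the same file concurrently without locking it; changes are merged when they are committed, and overlapping edits are flagged as conflicts to be resolved by hand.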
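The idea behind unreserved checkouts can be sketched as a three-way merge. The Python below is a deliberately simplified illustration, not the algorithm CVS uses: the `merge_lines` function is hypothetical, and it assumes all three versions of the file have the same number of lines (edits in place only, no insertions or deletions).

```python
def merge_lines(base, mine, theirs):
    """Line-by-line three-way merge for files with the same number of lines.

    Returns (merged_lines, conflict_line_numbers).
    """
    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:      # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:    # only the other developer changed this line
            merged.append(t)
        elif t == b:    # only I changed this line
            merged.append(m)
        else:           # both changed the same line in different ways
            merged.append(m)
            conflicts.append(number)
    return merged, conflicts


base = ["host = localhost", "port = 8080", "debug = false"]

# Two developers check out the same revision without locking the file.
alice = ["host = example.org", "port = 8080", "debug = false"]
bob = ["host = localhost", "port = 9090", "debug = false"]

# Edits to different lines merge cleanly.
merged, clean_conflicts = merge_lines(base, alice, bob)
print(merged, clean_conflicts)

# Edits to the same line are reported as a conflict to resolve by hand.
carol = ["host = internal.test", "port = 8080", "debug = false"]
_, clash_conflicts = merge_lines(base, alice, carol)
print(clash_conflicts)
```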
Although not in commonplace use anymore, CVS established the concepts and strategies that have provided the foundation for modern VCS ever since.
Licensed under GPL
As a Distributed VCS, Git provides developers with a full clone of the repository. In addition to transparently providing a backup and restore structure, this approach supports a wide range of development workflows, from very small projects to very large ones such as the Linux kernel. Being distributed also allows most operations in Git to take place on the local system, which makes each operation much faster than waiting for a remote server. During active development, Git isolates changes into branches that can later be merged together.
Git has become synonymous with DevOps culture – to the point of inspiring a new term, “GitOps” – and is a major player in the VCS space. Although options exist to self-host, Git is commonly consumed as SaaS from hosted providers such as GitHub, GitLab, and BitBucket. Many organizations find that this pattern helps to lower barriers to entry for first-time VCS implementation as well as for migrations from other products.
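Branching and merging can be pictured as names pointing at commits that record their parents. The Python below is a toy model under that assumption: `Commit` and `ToyRepo` are hypothetical names, and unlike real Git this sketch always records a merge commit and never inspects file contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple["Commit", ...] = ()


class ToyRepo:
    def __init__(self):
        self.branches = {"main": Commit("initial commit")}
        self.current = "main"

    def branch(self, name):
        # A new branch is only a new name for the current commit.
        self.branches[name] = self.branches[self.current]
        self.current = name

    def checkout(self, name):
        self.current = name

    def commit(self, message):
        parent = self.branches[self.current]
        self.branches[self.current] = Commit(message, (parent,))

    def merge(self, other):
        # A merge commit has two parents: our branch tip and the other tip.
        ours = self.branches[self.current]
        theirs = self.branches[other]
        message = f"merge {other} into {self.current}"
        self.branches[self.current] = Commit(message, (ours, theirs))


repo = ToyRepo()
repo.branch("feature")            # isolate new work on its own branch
repo.commit("add login form")
repo.checkout("main")
repo.commit("update README")      # main moves on independently
repo.merge("feature")             # bring the two lines of work together

head = repo.branches["main"]
print(head.message, [p.message for p in head.parents])
```

Everything here happens in local memory, which mirrors the point above that branch and merge operations do not need a round trip to a server.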
Licensed under GPL v2 (and others)
Apache Subversion (SVN)
Subversion adheres to a dedicated client / server model to provide a centralized structure for change management. Directories and files alike are versioned as first-class objects, supporting operations including creating, modifying, copying, deleting, and renaming. Commits are fully atomic, and the cost of a change is proportional to the size of the change rather than the raw volume of data, which makes branching a much cheaper operation.
Although a client / server model can increase the total cost of ownership (TCO) – including infrastructure to run the server and staff to support it – the approach that SVN takes toward change management makes it especially effective for a particular use case: large amounts of code, especially binaries. Subversion's binary differencing algorithm treats text and binary files alike, so changes to binary files are stored and transmitted as efficiently as changes to text; this also reduces the incremental growth of branches over time. SVN natively preserves the executable flag, providing a minor convenience for large binary codebases.
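Atomic commits and cheap copies can be illustrated with a small Python sketch. This is not Subversion's storage format: `ToySvnRepo` and its methods are hypothetical, and each revision here is a flat mapping of paths to content objects. The flat mapping is itself a simplification, since copying it costs time proportional to the number of paths, whereas the point being illustrated is only that content is shared rather than duplicated.

```python
class ToySvnRepo:
    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty tree

    @property
    def head(self):
        return len(self.revisions) - 1

    def commit(self, changes):
        """Apply every change or none of them; return the new revision number.

        `changes` maps a path to new content, or to None to delete the path.
        """
        tree = dict(self.revisions[-1])  # shallow copy: contents are shared
        for path, content in changes.items():
            if content is None:
                if path not in tree:
                    raise KeyError(f"cannot delete missing path: {path}")
                del tree[path]
            else:
                tree[path] = content
        self.revisions.append(tree)  # only reached if every change applied
        return self.head

    def copy(self, src, dst):
        """Branch by copying a directory: new paths, same content objects."""
        prefix = src.rstrip("/") + "/"
        target = dst.rstrip("/") + "/"
        copied = {
            target + path[len(prefix):]: content
            for path, content in self.revisions[-1].items()
            if path.startswith(prefix)
        }
        if not copied:
            raise KeyError(f"nothing to copy under: {src}")
        return self.commit(copied)


repo = ToySvnRepo()
big_binary = b"\x00" * 1_000_000
r1 = repo.commit({"trunk/app.bin": big_binary, "trunk/README": "hello"})

# A commit that fails part-way leaves the repository untouched.
try:
    repo.commit({"trunk/README": "changed", "trunk/missing": None})
except KeyError:
    pass
print(repo.head, repo.revisions[-1]["trunk/README"])

# Branching copies paths, not the megabyte of binary data behind them.
r2 = repo.copy("trunk", "branches/feature")
print(repo.revisions[r2]["branches/feature/app.bin"] is big_binary)
```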
Licensed under Apache 2.0
Mercurial is a distributed VCS supporting local copies of a repository and merging of changes into a central version. The command syntax for working with Mercurial is designed to be simple and easy for users to understand quickly. Changes are tracked linearly over time in a data structure called a revision log (RevlogNG).
The revision log structure allows Mercurial users to interact with repository contents at arbitrary points in time ad hoc. This approach to change tracking is especially effective for projects that are very large – in terms of the number of files and number of contributors. This is reflected by the projects that choose Mercurial as their VCS such as Mozilla and NGINX.
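A revision log of this kind can be sketched as an append-only list that stores an occasional full snapshot and deltas in between, so any revision can be rebuilt on demand. The Python below is an illustration only and not Mercurial's on-disk format: `ToyRevlog` is a hypothetical class, it uses the standard library's `difflib` to compute deltas, and it takes a full snapshot at a fixed interval, which is an assumption made to keep the example short.

```python
import difflib


class ToyRevlog:
    """Append-only log: a full snapshot now and then, deltas in between."""

    def __init__(self, snapshot_every=4):
        self.entries = []  # each entry is ("full", lines) or ("delta", ops)
        self.snapshot_every = snapshot_every

    def add(self, text):
        """Append a new revision and return its revision number."""
        lines = text.splitlines()
        rev = len(self.entries)
        if rev % self.snapshot_every == 0:
            self.entries.append(("full", lines))
        else:
            previous = self.read_lines(rev - 1)
            matcher = difflib.SequenceMatcher(a=previous, b=lines, autojunk=False)
            # Keep only the regions that changed relative to the previous revision.
            ops = [
                (i1, i2, lines[j1:j2])
                for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                if tag != "equal"
            ]
            self.entries.append(("delta", ops))
        return rev

    def read_lines(self, rev):
        # Walk back to the nearest full snapshot, then replay deltas forward.
        start = rev
        while self.entries[start][0] != "full":
            start -= 1
        lines = list(self.entries[start][1])
        for _kind, ops in self.entries[start + 1 : rev + 1]:
            # Apply from the end so earlier indices stay valid.
            for i1, i2, replacement in reversed(ops):
                lines[i1:i2] = replacement
        return lines

    def read(self, rev):
        return "\n".join(self.read_lines(rev))


log = ToyRevlog(snapshot_every=3)
versions = ["a\nb\nc", "a\nB\nc", "a\nB\nc\nd", "B\nc\nd", "x"]
for version in versions:
    log.add(version)

# Any point in history can be reconstructed on demand.
restored = [log.read(rev) for rev in range(len(versions))]
print(restored == versions)
```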
Licensed under GPL v2+
Bazaar has focused on developer experience as a primary objective, borrowing features from other VCS such as Git and SVN to produce a hybrid of distributed and centralized strategies. Changes are tracked in a blended model where developers can use a remote shared repository for storage efficiency and electively pull contents to a local system for work, all encapsulated by a granular permissions model.
Bazaar’s features and philosophy make it an attractive option for many projects, and organizations that are already working in the Canonical ecosystem – such as those using the Ubuntu operating system – may find this offering from the same folks particularly compelling. Although supported by Canonical, Bazaar is fully cross-platform and aims to bring an “it just works” mentality to source code management.
Licensed under GPL v2+
Version Control Solutions
Now that we’re familiar with some of the top VCS technologies, let’s have a look at products that make use of them:
Source Control options: Git
GitLab focuses on the lifecycle of DevOps work and provides tools to increase visibility and capabilities to developers across the phases of DevOps. Under the hood is hosting for Git repositories and related actions such as branch and release management.
Involved in a long-running arms race with its largest competitor GitHub, GitLab has innovated a wide variety of features and functionality that are now considered table stakes for a Git provider. From code scanning for security to automated actions to documentation, the GitLab team has enjoyed early mover status for almost a decade. Even for teams who select a different vendor, keeping an eye on what GitLab is working on can often give an insight into the future of the industry.
GitLab Self-Managed offers teams the option to fully control their implementation while retaining features and capabilities from the Cloud.
GitLab offers both free and paid tiers by CI/CD minutes as well as small business and enterprise subscriptions: https://about.gitlab.com/pricing/
Source Control options: Git
Purchased by Microsoft in 2018, GitHub is a major player among open source projects with over 40 million users in 2020. In support of its core mission of providing Git repositories, GitHub includes numerous tools for managing the software development lifecycle, many of which offer automation as a feature.
GitHub is a household name among developers and its octocat mascot is practically synonymous with Git itself – no list of VCS tools would be complete without GitHub. The GitHub ecosystem includes tools from managing vulnerabilities and updates such as Dependabot to commit signatures for supply chain confidence to change controls such as required reviews on pull requests to enforce adherence to standards.
Although most commonly associated with its hosted offerings, GitHub Enterprise can be self-hosted by organizations with a need to remain on-premise.
GitHub offers three tiers of pricing with a broad set of capabilities assigned to each: https://github.com/pricing
Source Control options: Git (Mercurial support was removed in 2020)
BitBucket brings the familiar structure of Git into the Atlassian ecosystem with integrations to other tools such as Jira for project management and visibility. Automated serverless CI/CD activities are supported by BitBucket Pipelines and BitBucket Deployments with support for various update channels including Jira and Slack.
Although the core Git features of BitBucket largely overlap with GitLab and GitHub, its tight integration with the rich Atlassian ecosystem makes it worth a long look for teams pursuing Agile, Kanban, or another of the host of similar strategies. Tracking code development and deployment through the full lifecycle, from conception in Confluence design documents to Jira stories for implementation to Statuspage for monitoring and alerting, BitBucket is the crossroads of software development.
BitBucket Server (previously called Stash) can be licensed to support on-premise hosting of both Git repository functionality and the web interface with which users will be familiar.
BitBucket pricing is organized into three tiers based on the number of users, storage capacity, and build minutes: https://www.atlassian.com/software/bitbucket/pricing
Source Control options: Git, Subversion, Mercurial
Not only does Sourceforge deliver on the concept of hosting VCS repositories in a Software as a Service (SaaS) model, the site expands on the idea by supporting multiple VCS backends using Apache Allura and wrapping the repositories in a community of largely open source developers. Sourceforge includes wikis, databases, support forums, reviews, and other community-focused features in addition to traditional repositories to drive greater collaboration in software projects.
The Sourceforge model is an important example of iterating on the base paradigm of repository hosting. Communities certainly exist around projects that use other hosting – such as GitHub and GitLab – but a community-first view of source code management can open a unique and valuable perspective, especially for organizations managing very large software projects.
Sourceforge pricing is offered in tiers based largely on community-focused features such as promotions, reviews, whitepapers, and insights: https://sourceforge.net/software/vendors/pricing/
Cloud VCS Solutions
Source Control options: Git
Each of the big 3 public Cloud providers supports hosting Git repositories in its respective Cloud offering. Mechanically these solutions are very similar to GitHub and GitLab offerings, including permissions management, branching strategies, revision history, and more.
Are these the same as GitHub and GitLab? Not exactly. Although some Cloud services can be tied into repository actions on external providers, the deep integration of services with repositories inside each Cloud platform makes a compelling case for considering them in Cloud-first organizations. In addition to a service-focused approach, an added layer of Identity and Access Management (IAM) integration provides a valuable enhancement to the traditional hosted VCS model.
Pricing tends to be tied to storage and / or the number of users but may be discounted based on broader Cloud pricing negotiations per vendor.
Version Control Systems – Summary
While many options for Version Control Systems exist in the marketplace today, distributed models are clearly the frontrunner in popularity. The distributed VCS paradigm gives DevOps teams flexibility and better supports the growing remote work culture by letting developers work offline and collaborate across locations when direct communication may be difficult.
Just as each organization must analyze the benefits of centralized versus distributed solutions and arrive at the best fit, so too must it evaluate hosting models to ensure the right architecture, weighing the cost of supporting an on-premise option against the convenience, and the loss of autonomy, that come with a hosted product.