Mercurial vs Git: Why Are We Comparing?
Version control systems (VCS) play a critical role for DevOps teams, whether they are getting started, working to maintain efficiency and velocity, or trying to improve team performance. An effective VCS must support the mechanical actions DevOps requires, such as tracking changes, documenting updates, and merging data. Because developers use it all the time, it must also include features that fit the needs of the team, such as understandable command syntax and visibility into changes.
One pattern that has been widely adopted over the last 10 years is distributed VCS, where copies of a single source-of-truth repository are cloned out to disparate systems such as developer laptops. After changes are made locally, they are merged back into the source of truth following a predefined pattern, such as trunk-based development. Two prominent distributed VCS solutions are Mercurial and Git. Both were released in 2005 under the GNU General Public License (GPL), and both have seen widespread adoption, helped by the similarities between them and by healthy competition over their differences. Let’s examine Mercurial vs Git.
Mercurial is a distributed VCS supporting local copies of a repository and merging of changes into a central version.
The command syntax for working with Mercurial is designed to be simple and easy for users to understand quickly. Many users find the syntax simpler and more intuitive than Git's, which helps improve development velocity. The basic commands include:
- hg init creates a new repository
- hg commit saves changes in the current repository
- hg log shows all changes in the repository
- hg pull brings all of the changes from another repository into the current repository
- hg push sends all changes from the current repository into another repository
- hg serve creates a web server where collaborators can see the history and pull from it
- hg merge joins different lines of history
Changes are tracked over time in an append-only data structure called a revision log, or revlog (the current format is known as RevlogNG). The revlog lets Mercurial users retrieve repository contents as they were at any point in history. This approach to change tracking is especially effective for very large projects, in terms of both the number of files and the number of contributors, which is reflected in the projects that have used Mercurial as their VCS, such as Mozilla and NGINX.
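As a rough illustration of the revision log idea, the sketch below keeps an append-only history for a single file, numbers revisions sequentially, stores most revisions as deltas against the previous one, and rebuilds any revision on demand. It is a toy model only: `ToyRevlog`, `make_delta`, `apply_delta`, and the `snapshot_every` parameter are hypothetical names, and Mercurial's real revlog is a binary on-disk format with its own delta encoding.

```python
from difflib import SequenceMatcher


def make_delta(old: str, new: str) -> list:
    """Describe `new` as copy/insert operations relative to `old`."""
    ops = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))  # literal new text
        # "delete" needs no entry: the old text is simply not copied
    return ops


def apply_delta(old: str, ops: list) -> str:
    """Rebuild the newer text from the older text plus a delta."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


class ToyRevlog:
    """Append-only history for one file; revisions are numbered 0, 1, 2, ..."""

    def __init__(self, snapshot_every: int = 4):
        self.entries = []  # each entry is ("full", text) or ("delta", ops)
        self.snapshot_every = snapshot_every

    def add(self, text: str) -> int:
        rev = len(self.entries)
        if rev % self.snapshot_every == 0:
            # Periodic full snapshots keep delta chains short.
            self.entries.append(("full", text))
        else:
            previous = self.read(rev - 1)
            self.entries.append(("delta", make_delta(previous, text)))
        return rev

    def read(self, rev: int) -> str:
        # Walk back to the nearest full snapshot, then replay deltas forward.
        base = rev
        while self.entries[base][0] != "full":
            base -= 1
        text = self.entries[base][1]
        for r in range(base + 1, rev + 1):
            text = apply_delta(text, self.entries[r][1])
        return text


log = ToyRevlog(snapshot_every=3)
for content in ["a\n", "a\nb\n", "a\nB\n", "a\nB\nc\n"]:
    log.add(content)
print(repr(log.read(1)))  # → 'a\nb\n'
```

Existing entries are never modified, which is what makes reading an old revision cheap and safe: only the chain from the nearest snapshot has to be replayed.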
By default, once a change is recorded in the Mercurial history it is treated as permanent: undoing it with hg backout records a new change rather than erasing the old one. History-editing commands such as rebase and histedit do exist, but they ship as extensions that must be enabled explicitly. This permanent record can be very valuable in determining when, and by whom, changes were introduced into the repository.
Git is a distributed VCS supporting local copies of a repository and merging of changes into a central version (sound familiar?). Linus Torvalds created Git in 2005 to support the ongoing development of the Linux kernel by widely distributed developers.
The basic Git commands for working in a repository include:
- git init / git clone are used to create a new repository or to create a copy of an existing repository
- git add stages files and directories for the next commit and starts tracking them; new files or directories are not tracked by default
- git commit records the staged changes in the repository as a new commit
- git status prints information about the current status, including what has deviated from the last commit
- git branch lists, creates, renames, and deletes branches within the repository
- git checkout switches from the current branch to another specified branch and, with the -b option, can create a new branch and switch to it in one step
- git merge joins the history of another branch into the current branch
- git push uploads committed history to a remote copy of the repository
- git pull downloads changes for the current branch from a remote repository and merges them into the local version
In addition to these basic commands, Git offers several commands for more specialized operations, such as:
- git config
- git diff
- git reset
- git rm
- git log
- git show
- git tag
- git remote
- git stash
Commits are the basic unit of Git operations, and each is identified by a SHA-1 hash. Under the hood, each commit records its parent commit(s), which places it relative to other commits, and points to a tree object that indexes the contents of the repository at that commit.
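To make the hashing concrete, here is a minimal sketch of how Git derives the identifier for a blob, its simplest object type: the SHA-1 is computed over a short header (`blob`, the content length, and a NUL byte) followed by the content. The function name `git_blob_id` is a hypothetical helper for illustration. Commit objects are hashed the same way, over a header plus text that names the tree, the parent commits, the author, and the message.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)  # object type, size, NUL separator
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b"hello world\n"))
# → 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

This should match what `git hash-object` reports for a file containing the same bytes. Because the identifier depends only on the content, identical content produces the same ID in every clone.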
Git focuses on making a commit a lightweight and cheap operation, both computationally and philosophically. Because this approach can, by design, result in numerous commits, especially when merging large branches together, Git provides a capability called rebase that replays a series of commits onto a new base and, in its interactive form, can squash the whole series into a single commit. Squashing is sort of like saying “imagine that all of the work spread across all of these commits had been done at once.” Doing so is a form of rewriting history: it may be obvious that the changes were not all made in a single enormous effort, but the context of each change relative to the others is collapsed and flattened.
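Rewriting a series of commits can take two forms, which the toy model below tries to separate: replaying each commit onto a new base (one new commit per original), and folding the series into a single commit (as when squashing during an interactive rebase). The `Commit` class, the `rebase` and `squash` functions, and the way IDs are computed are all hypothetical simplifications, not Git's actual object format or algorithm.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    parent: Optional[str]  # ID of the parent commit, None for the root
    message: str
    change: tuple          # stand-in for a patch: ((path, content), ...)

    @property
    def id(self) -> str:
        # The ID depends on the parent, so moving a commit changes its ID.
        payload = repr((self.parent, self.message, self.change)).encode()
        return hashlib.sha1(payload).hexdigest()[:8]


def rebase(commits: list, new_base: str) -> list:
    """Replay each commit in order on top of `new_base`."""
    replayed, parent = [], new_base
    for c in commits:
        copy = Commit(parent, c.message, c.change)
        replayed.append(copy)
        parent = copy.id
    return replayed


def squash(commits: list, new_base: str, message: str) -> Commit:
    """Fold a series of commits into one commit on top of `new_base`."""
    merged = {}
    for c in commits:
        merged.update(dict(c.change))  # later changes to a path win
    return Commit(new_base, message, tuple(sorted(merged.items())))


root = Commit(None, "initial", (("README", "v1"),))
main_tip = Commit(root.id, "update docs", (("README", "v2"),))
f1 = Commit(root.id, "add parser", (("parser.py", "draft"),))
f2 = Commit(f1.id, "fix parser", (("parser.py", "final"),))

rebased = rebase([f1, f2], main_tip.id)
single = squash([f1, f2], main_tip.id, "add parser")

print(len(rebased))   # → 2
print(single.change)  # → (('parser.py', 'final'),)
```

In both cases the originals `f1` and `f2` are left untouched and new commits with new IDs are created, which is why rewritten history can surprise collaborators who already have the old commits.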
Mercurial vs Git
Mercurial and Git are almost interchangeable in function and overall mechanics, but a few important differences stand out:
- Implementation – with its more numerous (and more specific) commands, adopting Git can be a daunting task, especially for teams getting started on a DevOps transition or without developer leaders already familiar with Git. Mercurial has specifically focused on a command structure that is simpler and ideally more intuitive for developers, resulting in potentially faster adoption and, anecdotally, fewer mistakes during development due to errant commands.
- Branching – in Mercurial, committing on top of a changeset that already has a child creates a new head (a changeset without children), so a repository can hold multiple concurrent heads that are either merged or maintained in parallel to track changes side by side. The tip is Mercurial-speak for the most recently added changeset in the repository. In Git, branches are intentional (i.e. not created just because something has changed) and very lightweight. Working in branches is a part of many teams’ adoption of Git, so much so that popular names for branching practices, such as Gitflow, reference Git. A Git branch is simply a movable pointer to a commit, and that commit's SHA-1 checksum identifies a unique point in the branch history.
- History – Git offers a few different options for “modifying” the history of a repository, including the reset, cherry-pick, and rebase commands. Mutable history is an intentional choice designed to work in tandem with the multi-branching nature of Git repositories; for example, an interactive rebase can collapse changes made over time into a single logical point in the repository history. Mercurial is more conservative by design: its comparable history-editing commands, such as rebase and histedit, are extensions that must be enabled explicitly.
- Revision Tracking – Mercurial numbers revisions sequentially (0, 1, 2, etc), making it more intuitive for developers to understand the order of changes over time. These numbers are local to each clone, so Mercurial also assigns every changeset a hash identifier for use across repositories. Git computes a SHA-1 hash at the time of a commit from the commit metadata and the hash of the root tree object; the details are out of the scope of this article but well worth a read for any team adopting Git. The resulting hash means that two (or more) developers viewing the same commit ID can be confident they are viewing the same state, which can be especially important for highly distributed teams.
- Rollback – as a part of managing the lifecycle of a repository, Git includes a revert command, which creates a new commit that reverses the changes introduced by the specified historical commit. This action does not modify the history: the original commit is kept in its place and only its effects are undone. Mercurial offers similar functionality via its backout command, in some cases used in conjunction with a merge, while its revert command restores files in the working directory to an earlier state.
- Community – Git boasts a very active user community, with 5418 users and 1940 contributors according to OpenHub. Numerous tools implement its capabilities and teams worldwide have adopted it, resulting in 1730 open jobs for “Git” according to itjobswatch. On the same sites, Mercurial has 979 users and 728 contributors on OpenHub and 44 open jobs on itjobswatch.
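The difference between sequential revision numbers and hash identifiers, raised under Revision Tracking above, shows up once more than one clone is involved. The sketch below is a toy model under stated assumptions: `changeset_id` and `ToyClone` are hypothetical names, and the hashing scheme is a stand-in rather than the real format of either tool. It shows two clones receiving the same changesets in a different order, so their local numbers disagree while the content-derived IDs agree.

```python
import hashlib


def changeset_id(parent_id: str, author: str, description: str) -> str:
    """Hypothetical content-derived identifier (not a real on-disk format)."""
    payload = "\n".join([parent_id, author, description]).encode()
    return hashlib.sha1(payload).hexdigest()[:12]


class ToyClone:
    """Numbers changesets in the order this particular clone received them."""

    def __init__(self):
        self.order = []  # position in this list is the local revision number

    def receive(self, cs_id: str) -> int:
        if cs_id not in self.order:
            self.order.append(cs_id)
        return self.order.index(cs_id)


root = changeset_id("", "alice", "initial import")
fix = changeset_id(root, "alice", "fix typo")
feature = changeset_id(root, "bob", "add feature")

alice, bob = ToyClone(), ToyClone()
for cs in (root, fix, feature):   # Alice committed the fix, then pulled the feature
    alice.receive(cs)
for cs in (root, feature, fix):   # Bob committed the feature, then pulled the fix
    bob.receive(cs)

print(alice.order.index(fix), bob.order.index(fix))  # → 1 2
print(set(alice.order) == set(bob.order))            # → True
```

Sequential numbers are convenient shorthand inside one clone, while the hash is what two developers should exchange when they need to be sure they mean the same change.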
Mercurial vs Git – Comparison Table
|Category||Git||Mercurial|
|Implementation||Git can take longer for teams to ramp up given its higher level of complexity in commands and repository structure||Simpler and more intuitive commands can help teams to quickly ramp up adoption|
|Branching||A branch is a pointer to a commit (SHA)||A branch is embedded in a commit; branches cannot be renamed or deleted|
|History||Mutable through reset, cherry-pick, rebase||Immutable by default; history-editing extensions must be enabled explicitly|
|Revision Tracking||Each revision unique by calculated SHA-1||Incremental, numerical index of revision (0, 1, 2, etc)|
|Rollback||Supported via revert command; arbitrary via rebase and cherry-pick||Supported via backout, revert commands|
|Community||5418 users and 1940 contributors on OpenHub; 1730 jobs mention Git on itjobswatch||979 users and 728 contributors on OpenHub; 44 jobs mention Mercurial on itjobswatch|
Conclusion – Mercurial vs Git
There are three major considerations when choosing between Mercurial and Git:
- Ease of adoption – from its command syntax and lower complexity to its incrementing numerical revision index, many developers find Mercurial easier to learn than Git when first starting out. This advantage matters less today than it did ten years ago, when these VCS solutions were first gaining popularity, because many educational curricula for developers now include exposure to versioning as a part of the software development lifecycle (SDLC).
- Change tracking – the lightweight branching structure of Git encourages many branches and, by extension, many small bodies of work. Combined with git rebase, this tends to encourage more frequent commits, since they can be tidied up later rather than polluting the repository history with unnecessary data points. Frequent commits in turn lower the probability of losing work or of getting stuck at a point in the repository history. Many teams find this approach to repository history friendlier.
- Community – Git boasts a large and active user community, which promotes more rapid iteration on capabilities and introduction of new features, and also provides a greater pool from which new users may draw experience and seek help.
Teams should weigh all the capabilities of each solution and evaluate them against both the requirements and the desired investment of the organization before making a choice. While the right VCS can empower developers and bring out their best, the wrong VCS can become a burden and a cost center for the organization.