Top DevOps metrics to measure DevOps success

Intro

The DevOps mindset has taken hold of many aspects of the software delivery process. So much so that it is worth gaining additional insight on the day-to-day operations that can show success markers and areas for improvement. Rather than focus on daily reports that do little more than show “normalcy,” today’s DevOps metrics focus on measurable data that is (or should be) collected throughout your company’s DevOps implementation.

The information, and the tools that collect it, range from simple progress indicators such as the number of deployments to combined views of data that can justify additional automation. Looking at DevOps through this expanded lens is the goal for anyone interested in measuring DevOps performance. We will take a deeper look at the metrics and the tools that help provide that data.

What does a successful DevOps implementation look like?

It is important to determine what benchmarks and other factors are involved in demonstrating the success of a DevOps implementation. Beyond the measurable metrics we will be discussing, successful DevOps depends on factors specific to the products and services a company is providing.

Overall, the main goal of a DevOps implementation is to automate software delivery so that it requires less human intervention. Along the way, quality is improved through various means. Pre-commit git hooks check changes against a specific set of policies to help prevent non-compliant code from even reaching the codebase. Automated QA methods allow for extensive API and UI testing before code is merged in.

So, does that mean if you’re using steps like these in your pipelines that you have a successful DevOps implementation? Hardly. These are just small parts of the larger picture that DevOps is meant to help with, especially with the additional automation needs for today’s software delivery. Measuring success via a set of DevOps metrics helps look past the small victories to show the overall benefit of the mindset. How else can we determine whether the investment in additional DevOps automation is worthwhile and how it can be continuously improved?

Constant assessment and improvement are key

Once you have the initial DevOps processes implemented, their performance needs to be brought to the attention of a wider audience. Work done in silos has never proven to benefit a software team and it’s critical to bring DevOps processes to a point where they can be viewed and continually improved.

What is the best way to go about this in a service-type setting such as DevOps? Some factors may be more measurable than others. If your group handles tickets like a service desk, you may have things like SLAs to keep track of. Along those lines, the number of support incidents generated after a change in automation may be important.

Analyzing these alongside other DevOps metrics gives a more accurate picture, one that shows more than just vanity numbers. While it may be interesting to know how many successful builds were completed, that number means more when compared to other metrics that dig further into the downstream effects of those builds, including releases tied to sprints or similar software delivery schedules.

Benefits of data-driven DevOps metrics

It all comes down to seeing how your DevOps implementation truly performs, both on its own terms and against industry standards. This goes beyond the vanity metrics and deep into the core of what we hope DevOps can do: provide a method to deliver software and services through automation in a way that is repeatable, consistent, and without error.

From initial inception to delivery, and eventually to the support incidents related to a release, all of this is important information for showing the benefit of today’s DevOps. While there isn’t necessarily one product or service that consolidates all of these DevOps metrics, the industry is moving towards methods that can help guide you to what may work best.

It may also be possible to collect too much or the wrong type of data. Going back to our example of vanity metrics like “number of successful builds,” it may be tempting for teams to start looking at small pieces of data as their own performance indicators. A particular group may see that adding more lines of code is an indication of success, while the actual value may not shed any light on success or failure at the time of delivery.

By keeping data in mind at the start of an automation effort, some of the more common data elements that may provide deep insights can be identified early on. Determine what is available for collection and take note of how that data can be processed and used as a Key Performance Indicator (KPI). KPIs are measurable values that can show progress towards a key business objective. Common metrics like “lead time” can be derived from the tasking and CI/CD information for a particular objective.

These DevOps metrics should adhere to industry recognized properties and standards that represent useful data. You may have been exposed to the acronym SMART: Specific, Measurable, Achievable, Relevant, and Time-bound. The following are good examples of properties that adhere to the SMART methodology in the context of the software development life-cycle:

Metrics should be measurable in that they have an actual value that can be averaged (numbers) or counted (e.g., pass/fail). Relations between specific measures can be expressed as percentages (e.g., the number of defects opened out of the total number of tests implemented or the total number of test lines).

Relevancy is important and often depends on the business or service. There’s no point in measuring and reacting to things that are not related to actual business results; understanding what is related to actual business results, however, is not always obvious.

Metrics should be resilient against team members falsifying or inflating results, which means being resilient in how they are recorded. For example, “number of defects opened” is a measure that can easily be inflated by opening false defects and then closing them (e.g., by a testing engineer who wants to present high activity levels). We will discuss ways to deal with problems like that, but for this example we can decide to ignore defects that were closed with status “not an issue” or “not reproducible”.

Metrics should be actionable: they should help identify actions that improve the overall process, such as workflows for automation or the implementation of additional policies. This relates to relevancy: you wouldn’t want to put effort into measures you cannot improve because you cannot identify the actions that affect them.

Being traceable to the source is a critical property of DevOps metrics. Rather than just indicating a failure, a metric should offer a path to where the failure takes place or which data element is responsible. This is a “must have”; otherwise, again, improvement is not possible.
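The defect-count example above can be sketched in a few lines of Python. This is only a minimal illustration, assuming defects can be exported as simple records; the `Defect` class, its field names, and the exact resolution strings are hypothetical and would need to match whatever your tracker actually stores.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Resolution labels taken from the example in the text. A real tracker may
# use different names, so treat these strings as assumptions.
IGNORED_RESOLUTIONS = {"not an issue", "not reproducible"}


@dataclass
class Defect:
    defect_id: str
    resolution: Optional[str]  # None while the defect is still open


def count_valid_defects(defects: Iterable[Defect]) -> int:
    """Count defects, skipping those closed with an ignored resolution."""
    return sum(
        1
        for d in defects
        if d.resolution is None or d.resolution.lower() not in IGNORED_RESOLUTIONS
    )


defects = [
    Defect("D-1", None),
    Defect("D-2", "fixed"),
    Defect("D-3", "Not an issue"),
    Defect("D-4", "not reproducible"),
]
print(count_valid_defects(defects))  # 2
```

Filtering at the point where the number is computed keeps the raw records intact, so the excluded defects can still be traced if someone questions the count.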

DevOps Research and Assessment (DORA) identified four key metrics that are targeted and easy to implement in most situations. They are often used as KPIs to help capture overall delivery performance:

Deployment Frequency

One of the first tenets of DevOps automation is to deploy often. Frequent deployment is a clear way to show the value of your automation and its various integrations, from the code repository to the deployment target. On its own, however, this metric cannot tell you other aspects of these deployments, for example, how many of them resulted in failures that affect other key metrics.

The information for this metric is often derived from the software used to deploy. In most cases, the major build and deploy products on the market have built-in reporting available for such data. Taken alone, this data point may suggest your DevOps has progressed to a higher level of performance. However, as with most metrics, it needs to be looked at alongside other information to ensure that deployment frequency isn’t being increased merely out of a desire to show progress when there isn’t any.

Azure DevOps, for example, ties more information to the release metric. This helps determine other valuable data points: work items can be tied to the software development process so that a history of the change is included along with the simple deployment frequency information. Other source control tools also have similar features.
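As a rough illustration of how deployment frequency might be derived, the sketch below groups deployment timestamps by ISO week. It assumes the timestamps have already been exported from your deployment tool; the function name and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime


def deployments_per_week(deploy_times):
    """Count deployments per ISO (year, week) bucket."""
    counts = Counter()
    for ts in deploy_times:
        iso = ts.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


# Hypothetical timestamps, standing in for a deployment tool's export.
deploy_times = [
    datetime(2024, 1, 1, 9, 30),
    datetime(2024, 1, 3, 16, 0),
    datetime(2024, 1, 9, 11, 15),
]
weekly = deployments_per_week(deploy_times)
print(weekly)  # {(2024, 1): 2, (2024, 2): 1}
```

Note that weeks with no deployments do not appear in the result at all, so an average computed from it would overstate the frequency unless empty weeks are filled in first.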

Lead Time

Imagine a timer running from the moment a developer commits their changes to the moment those changes are running in a production environment. This span of time is often called “lead time.” It can be a quick indicator of how well your DevOps automation is performing. Long lead times could mean there are areas where efficiency could be improved and more automation introduced.

Measuring this DevOps metric involves tying information from multiple systems to create a complete view. If a team’s work is completed in Atlassian JIRA but they release with Jenkins, there may be data that must be compiled from both products in order to show the true lead time. This hybrid use of DevOps tools is nothing new to most of us. This metric is a good indicator of how swiftly the team can respond to feedback and feature requests. While it may be hard to set a desired target for this measure, being dependent on the project, tools used and complexity of each development, the goal that we should strive for is seeing improvement over time – for instance, deployments taking minutes rather than hours is a win.
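Once the commit and deployment timestamps have been joined from the different systems, the calculation itself is small. The sketch below assumes that join has already happened and that each change is a `(committed, deployed)` pair; the function name and data are hypothetical. It reports the median rather than the mean so that one unusually slow change does not dominate the result.

```python
from datetime import datetime, timedelta
from statistics import median


def median_lead_time(changes):
    """Median commit-to-production time; changes not yet deployed are skipped."""
    seconds = [
        (deployed - committed).total_seconds()
        for committed, deployed in changes
        if deployed is not None
    ]
    if not seconds:
        return None
    return timedelta(seconds=median(seconds))


# Hypothetical (committed, deployed) pairs.
changes = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 11, 0)),   # 2 hours
    (datetime(2024, 3, 2, 10, 0), datetime(2024, 3, 3, 10, 0)),  # 24 hours
    (datetime(2024, 3, 4, 8, 0), datetime(2024, 3, 4, 8, 30)),   # 30 minutes
    (datetime(2024, 3, 5, 12, 0), None),                         # not deployed yet
]
print(median_lead_time(changes))  # 2:00:00
```

Tracking this value per sprint or per month is one way to see the improvement over time that the metric is meant to reveal.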

Mean Time to Recover

Failures happen. The key is to lessen the effect of those failures. Mean Time to Recover is the average amount of time needed to recover from a failure. Collecting this data can show how well a team is able to respond to a defect or an outage requiring a code release, reconfiguration, or other troubleshooting actions. With multiple environments to consider, this metric may deserve to be measured at multiple points. For example, does the value consider time taken to recover in Development vs. Production?

By using data from service tickets for both internal and external customers, a calculation of this metric can be constructed. The ticket and associated SLA metrics are measured against other metrics such as “Lead Time” to surface the “Time to Recover.” Using statistical calculations such as the mean allows for a metric with a targetable aspect: one that aims to reduce time to recovery in order to provide better support. The question is how quickly, from the moment a major problem arises, the team is able to recognize the root cause, understand it, solve it, and deploy the solution.
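A minimal sketch of this calculation, split by environment, could look like the following. It assumes each incident record carries an environment name, a detection time, and a resolution time taken from your ticketing system; the tuple layout and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def mttr_by_environment(incidents):
    """Mean time to recover per environment; open incidents are skipped."""
    recovery_seconds = defaultdict(list)
    for environment, detected, resolved in incidents:
        if resolved is not None:
            recovery_seconds[environment].append((resolved - detected).total_seconds())
    return {
        environment: timedelta(seconds=sum(values) / len(values))
        for environment, values in recovery_seconds.items()
    }


# Hypothetical incident records: (environment, detected, resolved).
incidents = [
    ("production", datetime(2024, 5, 1, 10, 0), datetime(2024, 5, 1, 10, 30)),  # 30 min
    ("production", datetime(2024, 5, 7, 14, 0), datetime(2024, 5, 7, 16, 0)),   # 120 min
    ("development", datetime(2024, 5, 2, 9, 0), datetime(2024, 5, 2, 9, 20)),   # 20 min
    ("production", datetime(2024, 5, 9, 8, 0), None),                           # still open
]
mttr = mttr_by_environment(incidents)
print(mttr["production"])   # 1:15:00
print(mttr["development"])  # 0:20:00
```

Keeping the environments separate avoids a fast Development recovery hiding a slow Production one in a single blended number.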

Change Failure Rate

This measure captures the rate of commit failures (using here the term “commit” in a very generic sense: you may implement that for code commits to the main branch, to deployments from development to integration/system testing and so on).

This metric is centered both around the quality of code being deployed towards production and the quality of your DevOps gates. At first you may believe a failure rate of zero would be the most desirable, but this is not realistic in a real-world scenario and in most cases would not point to high development quality but rather the low quality of your DevOps gates. A high change failure rate can show that your automation is working as designed, but you may have a problem with your dev, or you could be missing prior gates.

It should be emphasized that the gates should also be propagated to R&D so that developers can check themselves before committing. Bottom line: if there is a high change failure rate, things are not good, even if DevOps is doing their best. That’s why this is a good measure! Using release management software, this metric can be found quite easily. And if we can trace the changes back to their commits, we can then analyze specific cases, allowing us to pinpoint the root cause of issues and see whether the problems come from specific areas in dev, whether we are missing a certain prior gate, etc.
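The rate itself, and the tracing of failures back to areas of the codebase, can be sketched as follows. This assumes each change record already carries a commit identifier, an area label, and a pass/fail flag; those field names and the sample data are hypothetical, not the output of any particular release management tool.

```python
from collections import Counter


def change_failure_rate(changes):
    """Return (failure rate, failures per area); the rate is None with no changes."""
    changes = list(changes)
    if not changes:
        return None, Counter()
    failures = [c for c in changes if c["failed"]]
    by_area = Counter(c["area"] for c in failures)
    return len(failures) / len(changes), by_area


# Hypothetical change records traced back to their commits.
changes = [
    {"commit": "a1f3", "area": "payments", "failed": True},
    {"commit": "b2c4", "area": "search", "failed": False},
    {"commit": "c3d5", "area": "payments", "failed": True},
    {"commit": "d4e6", "area": "ui", "failed": False},
]
rate, by_area = change_failure_rate(changes)
print(rate)                    # 0.5
print(by_area.most_common(1))  # [('payments', 2)]
```

The per-area breakdown is what turns the single rate into something actionable, since it points at where an earlier gate may be missing.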

What are some other possible DevOps metrics?

While the “North Star” metrics we discussed above are good for many situations, other situations need to rely on different markers to indicate success or failure, or to show progression. Keep in mind that although some of these are more “vanity” in nature, one of them may be exactly the metric a given application needs.

Deployment Time – This is very important for some teams and less critical for others. There are times when fast deployments are necessary due to the order of operations. Other times, this metric could indicate that more resources are needed for build agents. This measure is a subset of “Lead Time,” focusing only on the deployment part, without the CI pipeline part of building and testing.

Failed Deployments – Not the same as failed releases, this metric is about how many deployments simply errored out when they were attempted. This is usually an indication of an unstable release environment, but could also be configuration errors in the way builds and releases are being executed, maybe missing a gate before deployment or having a loose part in the process. This may be especially true in areas where developers are able to control the build process using YAML or other similar definitions.

Code Commits – A simple metric that can be easily retrieved from your source control tool. What is considered good, average, or low, is more project-specific and should be interpreted on a case-by-case basis. A high number of commits could show good practices for keeping the feature branches up to date. A low number could mean that less work is being done than expected, or it may be possible developers are working on local branches that prevent them from committing as often. Neither is beneficial!

Injected Activities – Injected or unplanned activities can have a lasting effect on a software release process. Besides lowering the amount of time dedicated to specific work items, people often like to display this metric to justify additional resources. An example of a measurable activity might be the number of support tickets destined for DevOps engineers vs. the project work they have while working directly with teams.

Application Usage Insights – How people use the software is absolutely a factor for those who release frontend applications. Today’s logging and usage tracking provide data that shows how a feature or bug fix may affect how consumers respond. While not for every industry, it is a good indicator of how a change directly affects the main product. One such example is to look at the number of errors caused by people using the application after a release. If this value exceeds a certain threshold, that can be an indicator that the software release broke something.
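The error threshold check described under Application Usage Insights can be sketched as below. This version compares an error rate rather than a raw error count, which is a design choice made here so that busy and quiet periods are comparable; the function names and the 1% default threshold are assumptions for illustration only.

```python
def error_rate(error_count, request_count):
    """Errors as a fraction of requests; 0.0 when there was no traffic."""
    if request_count == 0:
        return 0.0
    return error_count / request_count


def release_exceeds_threshold(error_count, request_count, threshold=0.01):
    """True when the post-release error rate is above the chosen threshold."""
    return error_rate(error_count, request_count) > threshold


print(release_exceeds_threshold(30, 1000))  # True  (3% > 1%)
print(release_exceeds_threshold(5, 1000))   # False (0.5% <= 1%)
```

In practice the threshold would be chosen from the application's normal error rate before the release, not from a fixed number.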

As you can see, many of these DevOps metrics intertwine with one another. How they affect each other becomes apparent once many of the main measurements are in place. For example, if your Code Commit metrics show low numbers but you have many Injected Activities, you could surmise that the low amount of code being committed is due to other activities getting in the way.

Let’s test another scenario. One of our “North Star” metrics, Lead Time, and Application Usage Insights can absolutely show trends related to one another. How? Let’s say you have longer than desired lead times for feature requests that consumers have made. The usage insights may show fewer and fewer visits to the application, or even deeper metrics related to subscriptions. These and other DevOps metrics may show losses due to long release cycles.
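One simple way to check whether two metrics move together is a correlation coefficient. The sketch below computes a Pearson correlation between lead time and visits by hand; the figures are made up purely for illustration, and a strong correlation would only suggest a relationship worth investigating, not prove that long lead times caused the drop.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up monthly figures, for illustration only.
lead_time_days = [2, 4, 6, 8]
monthly_visits = [900, 800, 700, 600]
r = pearson(lead_time_days, monthly_visits)
print(round(r, 3))  # -1.0
```

A value near -1 means visits fell as lead time rose in this sample; real data will be noisier and usually needs more than a handful of points.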

How are DevOps metrics used today and how should you use DevOps metrics?

It is clear that measuring is important. Without measuring, it is hard to improve in a methodical way. When it comes to improvement processes, intuition is not enough in the long run, especially when your organization and code base are growing beyond the single-team level. However, you should be cautious about implementing and acting on measures alone without understanding their actual meaning; doing so may encourage wrong and biased behaviors, such as unnecessarily frequent deployments made to show that things are progressing well even when they are not. This is true for almost any measurement and is the reason for looking at combined measures, which help you see the real picture and not just an obstructed view of it.

Interestingly, some aspects of using metrics to measure DevOps success have drifted to the sidelines. Information on the subject is available, but there isn’t much discourse about it in DevOps circles. Whether that is due to a need to prioritize other initiatives, or a lack of success in getting everyone on the same page, is hard to say. What most of us do find important is to continuously improve our automation processes. This is not to say that DevOps metrics are not in use. They are, but maybe not to the extent you would expect, especially when compared to more specific development measures such as “Code Coverage,” which is widely used even though it is specific to testing. It looks at a narrow artifact, important as that is, while the big picture is much more interesting and sometimes overlooked.

It is worth noting that different organizations and products will look at different measures and strive for different target values for each measure, which is natural. However, any organization that strives for improvement should set goals based on its own measurements.

Looking farther into the future, additional metrics will likely be available from AI systems. There will also likely be more focus on automation, Infrastructure as Code, and the sheer amount of data that’s becoming available. Creative and useful ways to look at the metrics from that data will become more and more important to show results.

Amir Kirsh

Amir Kirsh, Incredibuild's Dev Advocate, is a C++ lecturer at the Academic College of Tel-Aviv-Yaffo and at Tel-Aviv University, previously the Chief Programmer at Comverse, after being CTO and VP R&D at a startup acquired by Comverse. He is also a co-organizer of the annual Core C++ conference and a member of the ISO C++ Israeli National Body.
