Whether they are rare, major events (like a SEV1 production outage) or everyday interruptions that break flow (like customer support tickets), engineering escalations are a huge part of the development lifecycle, and they come at a cost to product velocity.
From on-call papercuts to customer support debugging, engineering escalations are one of the biggest pain points for engineering teams, if not the biggest.
They’re a huge blocker for developers: each interruption gets in the way of building product. They slow down your business, and severe incidents can damage your brand.
An extreme case
Even at the most sophisticated tech organizations, outages will happen.
In October of 2021, Facebook experienced their biggest outage to date. While trying to perform a routine networking maintenance procedure, a command was run in production which unintentionally took Facebook's service "off the Internet": all requests to Facebook URLs failed.
Facebook and its affiliated services, Instagram and WhatsApp, were all down for ~6 hours, preventing 3.5 billion users from accessing their products. According to Fortune.com, it’s estimated that Facebook lost $100M in advertising revenue alone.
Now, while most escalations or outages won’t match the size or magnitude of Facebook’s mishap, let’s investigate how big an impact small bugs and customer support tickets really have on an engineering team.
Business impact for everyone
From researching many companies of different sizes, we’ve discovered that engineers spend 30-40% of the average workday writing code. Engineering escalations, such as resolving customer support tickets, take up roughly 25% of an engineer’s work week on average. (The percentage was larger at smaller companies and smaller at bigger ones.)
In a 40-hour work week, that's 10 hours taken away from working on product. Say your team has 10 engineers and works 50 weeks out of the year: that's 5,000 hours a year taken away from product development. For 100 engineers, that's 50,000 hours. This is a very significant amount of time being taken away from your engineers.
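To run the same back-of-the-envelope math for your own team, here is a minimal Python sketch. The function name and its parameters are made up for illustration; the defaults simply mirror the figures above (a 40-hour week, a 25% escalation share, 50 working weeks), so swap in your own numbers.

```python
def escalation_hours_per_year(
    engineers: int,
    hours_per_week: float = 40,      # assumed standard work week
    escalation_share: float = 0.25,  # share of the week spent on escalations
    weeks_per_year: int = 50,        # assumed working weeks per year
) -> float:
    """Estimate the yearly hours a team spends on escalations instead of product."""
    hours_per_engineer_per_week = hours_per_week * escalation_share
    return engineers * hours_per_engineer_per_week * weeks_per_year


if __name__ == "__main__":
    for team_size in (10, 100):
        hours = escalation_hours_per_year(team_size)
        print(f"{team_size} engineers: {hours:,.0f} hours per year")
```

With the default assumptions, this reproduces the 5,000 and 50,000 hour figures for teams of 10 and 100 engineers.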
Stay in flow
The term “deep work” has been popularized by Cal Newport and others in recent years. It's defined as a state of distraction-free concentration that pushes your cognitive capabilities to their limit. Newport found that in an ideal productive day, people can only do four hours of deep work before seeing diminishing returns.
Four hours is of course the ideal limit. We all know first-hand how just one Slack message or one quick meeting can easily interrupt your train of thought, making it difficult to get back on task. A study by the University of California Irvine found that the mental cost associated with switching tasks can translate to up to 40% more time to complete the tasks, and that a worker loses 23 to 25 minutes of work time with each interruption.
With this in mind, engineers are in reality probably spending a lot less than 30% of their workday focused on building your product, due to things out of their control.
Even though your organization’s top priority is the time to market for a specific product, you of course can’t just ignore customer bugs and escalations, and you certainly can’t ignore downtime.
But you can make all this better.
Best practices with Cased
Cased’s modern engineering enablement platform helps engineering teams resolve escalations quickly and easily, so they can go back to building product.
To help shorten the amount of time it takes to resolve escalations, Cased auto-discovers the location of your remote hosts, so you no longer need to manage inventory lists.
Additionally, Cased has approval workflows for SSH access that securely give your developers access via ephemeral SSH certificates. No more waiting on IT or DevOps to provide you an SSH key. With one-click access to any prompt, your developers can get to the shell they need 10x faster, saving valuable time during incidents.
By using Cased, you can create a standardized knowledge base with runbooks, snippets, and session recordings to save even more time during incident response.
With snippets, engineers can create searchable, reusable lines of code right inside a production shell. With Runbooks, you can create standardized, codified processes to automate common production workflows. Find the solution you need in moments.
Because every production session is logged and audited, developers can share past and live sessions cross-functionally. For common incidents, they can see exactly what was done before and use it to solve a similar incident in progress.
In addition to solving escalations more quickly, Cased can also help prevent these escalations from slowing you down. Cased Runbooks allow you to automate the complex tasks that engineering, operations, and support do every day. By reducing the chance of operator error, your organization can run these commands and queries safely and securely without direct production access. This means that non-engineering teams such as support or product can run these processes safely, without the risk of causing downtime.
By using Cased, you can save time and money by giving your developers the easy access and tooling they need to solve escalations faster.