Are You Thinking About RTO All Wrong?

In simple terms, a recovery time objective (RTO) is the target time by which you intend to recover your business after a disruption. Following a disaster event that merits a declaration, there’s often a lot of scrambling. The RTO goalpost is meant to focus and prioritize efforts, especially in a modern business landscape where disaster recovery (DR) plans increasingly fail over to a cloud such as AWS. However, the intent behind RTO often gets warped, which leads to miscommunication between the IT department and the rest of the business.

For too many, RTO means the time it takes to stand up the disrupted technology, which doesn’t always include returning service to end users. Before restoring end-user access to a disrupted application, the application owner needs to confirm it is ready for users. Quality assurance checks must be performed, and this process takes time and sometimes additional remediation. Because this loose use of the term leaves out part of the recovery, it creates a disconnect between “technology availability” and “service availability for use.” That disconnect often leads to angry conversations between customers, IT, and executive leadership following a disaster, at a time when the disaster itself is already anxiety-inducing enough.

So, what’s the answer? To help businesses improve their DR and backup strategy, InterVision has long defined RTO for our clients in terms of what we call True RTO. Here’s what we mean:

Two Interpretations of RTO

  • Technical RTO: In this case, RTO means the time it takes for infrastructure to be running and available for recovery.
  • True RTO: In this case, RTO means the time it takes for an application or system to become available and begin serving end users.

We prefer True RTO because it tells stakeholders what your business’s DR capabilities will actually be and when they can expect a return to normalcy. As a result, True RTO helps reduce miscommunication when an event does occur.
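
As a rough illustration of the gap between the two definitions, the sketch below adds validation and remediation time on top of infrastructure recovery time. The class name, field names, and minute values are hypothetical and chosen only for illustration; they are not InterVision figures or tooling.

```python
from dataclasses import dataclass


@dataclass
class RecoveryEstimate:
    """Hypothetical per-application recovery estimate, in minutes."""

    infrastructure_minutes: float      # time until infrastructure is running
    validation_minutes: float          # QA checks by the application owner
    remediation_minutes: float = 0.0   # fixes found during validation, if any

    @property
    def technical_rto(self) -> float:
        # Technical RTO: infrastructure is up and available for recovery.
        return self.infrastructure_minutes

    @property
    def true_rto(self) -> float:
        # True RTO: the application is serving end users again.
        return (
            self.infrastructure_minutes
            + self.validation_minutes
            + self.remediation_minutes
        )


# Illustrative numbers only (assumptions, not measurements).
estimate = RecoveryEstimate(
    infrastructure_minutes=15,
    validation_minutes=90,
    remediation_minutes=30,
)
gap = estimate.true_rto - estimate.technical_rto
print(f"Technical RTO: {estimate.technical_rto} min")
print(f"True RTO: {estimate.true_rto} min (gap: {gap} min)")
```

With these assumed inputs, the figure a stakeholder cares about is far larger than the infrastructure number alone, which is the disconnect described above.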

Sadly, too many vendors add to the confusion by claiming to ensure RTOs of minutes or even seconds without clarifying that they mean Technical RTO. It’s convenient for many technology vendors to skip over the quality assurance steps that must occur before applications can return to normal usage. Orchestration and automation help to reduce the gap between Technical and True RTO, but the application has to be able to take advantage of these technologies. Many traditional applications require additional touches to be ready for users.

An additional term InterVision uses to set expectations properly for clients is Recovery Waves. Many companies’ application owners, dev teams, or QA teams are not staffed to test and qualify every application at once, even though, technically, all of the applications can be booted at the same time. Recovery Waves help IT teams delineate which tiers of applications and datasets, and which applications and datasets within each tier, must be given attention and when, in order to reach the ultimate True RTO goal. Think of Recovery Waves as tiers within each tier, so that IT teams aren’t overwhelmed during the failover process. InterVision uses Technical RTO, True RTO, and Recovery Waves to cut through market confusion and help organizations set clear and achievable recovery goals.
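
One way to picture Recovery Waves is as batches of applications within each tier, sized to what the validation team can handle at once. The sketch below is a minimal illustration under that assumption; the function name, the (name, tier, priority) tuple layout, the application names, and the capacity value are all hypothetical, not InterVision's actual method.

```python
from itertools import groupby


def plan_recovery_waves(apps, qa_capacity):
    """Group applications into waves within each tier.

    apps: list of (name, tier, priority) tuples; lower numbers recover first.
    qa_capacity: how many applications the team can validate at the same time.
    Returns a list of (tier, wave_number, [application names]).
    """
    if qa_capacity < 1:
        raise ValueError("qa_capacity must be at least 1")

    # Order by tier first, then by priority inside the tier.
    ordered = sorted(apps, key=lambda app: (app[1], app[2]))

    waves = []
    for tier, group in groupby(ordered, key=lambda app: app[1]):
        names = [name for name, _, _ in group]
        # Split each tier into batches no larger than the QA capacity.
        for start in range(0, len(names), qa_capacity):
            wave_number = start // qa_capacity + 1
            waves.append((tier, wave_number, names[start:start + qa_capacity]))
    return waves


# Hypothetical application inventory for illustration.
apps = [
    ("billing", 1, 2),
    ("order-entry", 1, 1),
    ("customer-portal", 1, 3),
    ("reporting", 2, 1),
    ("intranet-wiki", 2, 2),
]
waves = plan_recovery_waves(apps, qa_capacity=2)
for tier, wave_number, names in waves:
    print(f"Tier {tier}, wave {wave_number}: {', '.join(names)}")
```

Every application could still be booted at the same time; the waves only sequence the validation work that separates Technical RTO from True RTO.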

If you’re interested in learning more about DRaaS solutions that target AWS, check out our webpage on the topic. InterVision is the only DRaaS vendor verified by both Gartner and Forrester that can target DRaaS and BaaS to either hosted VMware or AWS.

Heading to AWS re:Invent Dec 2-6? We will be at Booth 1764!