If you’ve decided to protect your data and recover your applications, one of the first steps is making sure you can properly replicate your data to your recovery site.
There are many options to choose from, including continuous data replication and batched data replication. If you’re working with a third-party vendor, your Disaster Recovery as a Service (DRaaS) or Disaster Recovery (DR) provider may have suggestions or provide a replication tool as part of their packaged solution. If you’re running your own recovery solution, you may need to make that decision on your own. Regardless of which replication technology you choose, here are some basic best practices to keep in mind when selecting replication technology for disaster recovery.
1. Classify Your Applications
To assess your disaster recovery solution, you will need to classify your applications by revenue impact and data sensitivity.
“Classifying your apps for replication will lead you to the right form of replication for that particular app,” explains Product Solution Director Ben Miller. “One size does not fit all when you’re looking at price, technology or complexity.”
Miller goes on to state, “We have a client who uses four tiers of data recovery they have defined internally. They range from local snapshots only to 30-minute RPO with an RTO of running in the cloud in under four hours.”
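Here, RPO (recovery point objective) is the maximum amount of recent data you can afford to lose, and RTO (recovery time objective) is how quickly the application must be running again. Sorting your applications into those buckets makes it easier to find the best price, protection and reliability for each tier. This ensures you don’t overpay for, or under-protect, an application.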
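As a rough illustration of the classification step, the following Python sketch scores each application on revenue impact and data sensitivity and maps the score to one of four tiers. The 1-to-3 scoring scale, the thresholds, the application names and the `classify` function are all hypothetical. Only the two endpoint tiers echo the client example above; the middle tiers are placeholders you would define yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    revenue_impact: int    # assumed scale: 1 (low) to 3 (high)
    data_sensitivity: int  # assumed scale: 1 (low) to 3 (high)


# Tier 1 and tier 4 mirror the endpoints of the client example;
# tiers 2 and 3 are placeholders to be defined internally.
TIER_TARGETS = {
    1: "30-minute RPO, RTO under four hours in the cloud",
    2: "to be defined internally",
    3: "to be defined internally",
    4: "local snapshots only",
}


def classify(app: App) -> int:
    """Return a tier from 1 (most protected) to 4 (least protected)."""
    score = app.revenue_impact + app.data_sensitivity  # ranges from 2 to 6
    if score == 6:
        return 1
    if score == 5:
        return 2
    if score >= 3:
        return 3
    return 4


if __name__ == "__main__":
    # Hypothetical applications, for illustration only.
    apps = [App("billing", 3, 3), App("intranet-wiki", 1, 1)]
    for app in apps:
        tier = classify(app)
        print(f"{app.name}: tier {tier} ({TIER_TARGETS[tier]})")
```

In practice the scoring inputs would come from your own business impact analysis rather than a fixed three-point scale.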
2. Know Your Replication Objectives
To be successful, you will need to determine your replication objectives.
“You may need to recover the entire application, parts of an application, just the data or your entire datacenter,” explains Miller. “The solution will differ in technology, complexity and cost, based on your objective.”
Miller continues, “Recovering parts of an application to a backup datacenter while maintaining the whole application will be more complex than recovering the entire application as a whole.”
This means you may need more than one replication and DR solution. Many companies have found that, depending on how their applications are tiered, different tiers require different types of replication. A mix of application-level replication and hypervisor-based replication may be critical to meet your needs, recovery plan objectives and budget.
3. Choose the Ideal Type of Replication
What’s the ideal level of replication for your application? There are four levels to choose from: application-level, guest OS-level, VM/hypervisor-level or SAN/LUN-level. Ideally, you will choose one level of replication for all applications in the same tier, because mixing levels unnecessarily overcomplicates your solution.
- Application-level replication – Application-level replication offers the benefits of low RTOs and RPOs, but it requires you to maintain the OS and patching to ensure it works properly at the time of failover. SQL databases are an ideal fit for this replication.
- Guest OS-level replication – Guest OS-level replication replicates data on a “block-level basis” to a target machine. This solution offers one-click failover, but it incurs license costs and requires an agent on the source machine.
- SAN/LUN level replication – This solution replicates an entire SAN or LUN and all the VMs on each. This solution will replicate both physical and virtual machines, but is not public cloud friendly and is not hardware-, SAN- or hypervisor-agnostic.
- Hypervisor-level replication – Hypervisor replication solutions can be an ideal fit for replicating to the public cloud because, at the target site, you only pay for what you’re consuming. This can save money during the times you are not recovering from a disaster. It is also SAN-agnostic. However, this replication does not work well for physical machines.
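The trade-offs in the list above can be sketched as a simple filter. This is only an illustration under stated assumptions: the function name `candidate_levels`, its three yes/no inputs and the level labels are hypothetical, and the rules encode just the constraints named in the list (agents, public cloud friendliness, physical machines). It ignores cost, RTO/RPO targets and the hardware or hypervisor dependence of SAN/LUN replication.

```python
def candidate_levels(has_physical_machines: bool,
                     targets_public_cloud: bool,
                     agents_allowed: bool) -> list[str]:
    """Return the replication levels not ruled out by the constraints above."""
    # Application-level: the list names no hard exclusion, so it always stays.
    candidates = ["application"]
    # Guest OS-level needs an agent on the source machine.
    if agents_allowed:
        candidates.append("guest_os")
    # SAN/LUN-level is described as not public cloud friendly.
    if not targets_public_cloud:
        candidates.append("san_lun")
    # Hypervisor-level does not work well for physical machines.
    if not has_physical_machines:
        candidates.append("hypervisor")
    return candidates


if __name__ == "__main__":
    # A virtual-only tier that fails over to a public cloud, agents permitted.
    print(candidate_levels(has_physical_machines=False,
                           targets_public_cloud=True,
                           agents_allowed=True))
```

A filter like this only narrows the field; picking among the remaining candidates still depends on the tier's budget and recovery objectives.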
4. Understand Your Rate of Change and Bandwidth Requirements
The rate of change of your applications will impact the bandwidth requirements of your replication solution. It will also impact your RPO requirements.
A high rate of change refers to data that is constantly changing. If you have a one-hour RPO and a high rate of change, you’ll likely lose a lot of transactions in that one-hour RPO. If you have a low rate of change, that RPO can be longer and save you money in the long run.
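To make the relationship between rate of change and RPO concrete, here is a small back-of-the-envelope sketch. It assumes transactions arrive at a steady rate and that a disaster strikes just before the next replication point, so it gives a worst-case figure. The function name and the sample rates are hypothetical.

```python
def worst_case_lost_transactions(tx_per_hour: float, rpo_minutes: float) -> float:
    """Transactions written since the last recovery point, in the worst case."""
    if tx_per_hour < 0 or rpo_minutes < 0:
        raise ValueError("inputs must be non-negative")
    return tx_per_hour * rpo_minutes / 60


if __name__ == "__main__":
    # Hypothetical workloads: a busy order system and a quiet archive.
    for name, rate in [("orders", 12_000), ("archive", 200)]:
        lost = worst_case_lost_transactions(rate, rpo_minutes=60)
        print(f"{name}: up to {lost:.0f} transactions at risk with a one-hour RPO")
```

With the same one-hour RPO, the busy workload has far more at risk than the quiet one, which is why a low rate of change can tolerate a longer RPO.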
Replication, rate of change and bandwidth are all related because, regardless of how often you replicate, be it in batches or continuously, the amount of data that needs to cross the link to your target site varies with your rate of change.
Trying to send too much data over too little bandwidth will result in a poor experience and potentially unsuccessful replication. Sizing your bandwidth correctly requires understanding the rate of change for each application.
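A minimal sizing sketch, assuming you have measured how much data changes in a given window: divide the changed data by the time available to ship it. The function names, the decimal gigabyte convention (1 GB = 10^9 bytes) and the 70% usable-link figure are assumptions for illustration; real tools may also compress or deduplicate, which this ignores.

```python
def required_mbps(changed_gb: float, window_seconds: float) -> float:
    """Average megabits per second needed to ship changed_gb within the window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    bits = changed_gb * 1e9 * 8  # decimal gigabytes to bits
    return bits / window_seconds / 1e6


def link_is_sufficient(changed_gb: float, window_seconds: float,
                       link_mbps: float, usable_fraction: float = 0.7) -> bool:
    """Compare the need against the share of the link assumed to be usable."""
    return required_mbps(changed_gb, window_seconds) <= link_mbps * usable_fraction


if __name__ == "__main__":
    day = 24 * 60 * 60
    # Hypothetical application that changes 100 GB per day.
    need = required_mbps(100, day)
    print(f"needed: {need:.2f} Mbps")  # prints "needed: 9.26 Mbps"
    print("10 Mbps link ok:", link_is_sufficient(100, day, 10))  # False
    print("20 Mbps link ok:", link_is_sufficient(100, day, 20))  # True
```

Because this is an average, an application whose changes arrive in bursts may need more headroom than the figure suggests. Running the calculation per application, as recommended above, shows which ones drive the bandwidth requirement.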
5. Keep it Simple if You Want it to Work
Don’t outsmart yourself or over-design your solution. Keeping a simple, clear and concise disaster recovery plan will increase your probability of successful recovery.
“Your SQL server may be able to replicate itself, but the manual or scripts you have to run to tell it to recover may not be worth the effort during a disaster,” explains Miller. “Weigh the costs of simplicity against cost and against the success of recovery before you call your plan final.”
Learn about InterVision’s Disaster Recovery as a Service (DRaaS).
