Disaster recovery (DR) is an organization’s ability to respond to and recover from an event that negatively affects business operations. The goal of DR methods is to enable the organization to regain use of critical systems and IT infrastructure as soon as possible after a disaster occurs. Deciding on the right disaster recovery solution means weighing multiple attributes against your needs. Depending on your application requirements, you select the right combination of RTO, RPO, TCO (total cost of ownership), and other DR attributes for each protection group.
With the latest release of VMware Cloud Disaster Recovery, we are excited to provide customers greater choice by delivering RPOs as low as 30 minutes, which enhances what customers already have with instant VM power-on and low TCO (60% lower than on-premises DR).
When looking at DR solutions, the two primary SLAs most often considered are RPO and RTO.
- RPO (Recovery Point Objective) determines how recent the last good data set is, and therefore the potential for data loss or the need for other data reconstruction methods.
- RTO (Recovery Time Objective) determines how quickly you can get the business back online at an operational level.
Greater Flexibility and Choice with 30-minute RPOs
RPOs as low as 30 minutes give customers up to 48 snapshots per day. Combined with the Scale-out Cloud File System’s ability to store a deep history of snapshots, customers now have greater choice in the frequency of snapshots, how many to keep, and for how long. This flexibility is important for balancing DR readiness and total cost of ownership when preparing for and recovering from ransomware attacks or other disaster events.
Let’s look at recent ransomware statistics to illustrate why this flexibility and choice are important. According to securityweek.com (citing FireEye’s Mandiant incident response data), the median dwell time in 2020 was 24 days for all malicious hacker attacks and 5 days if you only look at ransomware attacks. A 5-day median may not sound very long, but approximately one-third of those ransomware attacks took 14 or more days to detect, and some even went undetected for 400+ days! (Note: some ransomware reveals itself after just a few days, which shortens the dwell time. Other variants try to go undetected for as long as possible so they can spread as widely as possible before revealing themselves.)
So, in a good scenario, you detect the ransomware in 5 days or less. But you also need to prepare for when it has been in your environment for much longer. Hence, when we do customer DR planning sessions, we start with a default ransomware protection retention policy of:
- Keep at least 6 snapshots per day for 2 days
- Keep daily snapshots for 7 days
- Keep weekly snapshots for 4 weeks
- Keep monthly snapshots for 6 months
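To see how this default policy lines up with the dwell times quoted above, here is a minimal Python sketch. It is only an illustration, not product behavior: the `RetentionTier` class and `finest_tier_covering` function are hypothetical names, "6 snapshots per day" is assumed to mean one every 4 hours, months are approximated as 30 days, and a tier is treated as guaranteed to hold a pre-infection snapshot only when the dwell time is no longer than its retention window minus one snapshot interval.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetentionTier:
    name: str
    interval_hours: int  # time between retained snapshots in this tier
    keep_hours: int      # how far back this tier retains snapshots


# The default policy from the list above, ordered from finest to coarsest.
# Assumptions: 6 per day = every 4 hours; 1 month = 30 days.
DEFAULT_POLICY = [
    RetentionTier("4-hourly", 4, 2 * 24),
    RetentionTier("daily", 24, 7 * 24),
    RetentionTier("weekly", 7 * 24, 4 * 7 * 24),
    RetentionTier("monthly", 30 * 24, 6 * 30 * 24),
]


def max_snapshots(policy=DEFAULT_POLICY) -> int:
    """Upper bound on retained snapshots, ignoring any overlap between tiers."""
    return sum(t.keep_hours // t.interval_hours for t in policy)


def finest_tier_covering(dwell_days: int, policy=DEFAULT_POLICY) -> Optional[RetentionTier]:
    """Most granular tier guaranteed to hold a snapshot older than the dwell time."""
    dwell_hours = dwell_days * 24
    for tier in policy:
        # The oldest snapshot in a tier is at least (keep - interval) old.
        if dwell_hours <= tier.keep_hours - tier.interval_hours:
            return tier
    return None


for days in (5, 14, 24, 400):
    tier = finest_tier_covering(days)
    print(days, "days ->", tier.name if tier else "no clean snapshot retained")
```

Under these assumptions, a 5-day dwell time is covered by the daily tier, 14 days by the weekly tier, and 24 days only by the monthly tier, while a 400-day dwell time exceeds what this default policy retains.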
Retention Levels – Another Disaster Recovery SLA
With the new enhancement, you can now keep up to 48 snapshots per day for 2 days, while still retaining a deep history for longer-term protection.
VMware Cloud Disaster Recovery was designed to provide not only near-term recovery points with as low as 30-minute intervals but also flexible and longer-term data retention level capabilities.
The Protection Group (PG) policy definitions allow for several different protection schedules (up to 10) – each schedule with its own frequency and retention level. Each PG policy also supports a significant number of recovery point instances (i.e., 1,000 for 30-minute RPO). This allows for more robust and flexible protection policies for VMs in the protected site inventory.
For example, the comprehensive protection policy below would generate only a little over 700 recovery points – well within the solution limits:
- Every 30 minutes for 10 days (480 total) for near-term recovery
- Every day for 4 months (120 total) for less stringent recoveries
- Every week for 2 years (about 104 total) for worst-case rollback scenarios
NOTE: concurrent recovery points that occur at the same time (e.g., the snapshot that happens once a day that might coincide with all 3 scheduling rules) are supported with a single instance – optimizing backup storage and reducing total recovery point counts.
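The arithmetic in this example, including the effect of collapsing coinciding recovery points, can be checked with a short Python sketch. This is an illustration under stated assumptions, not how the product schedules snapshots: 4 months is taken as 120 days, 2 years as 104 weeks, all three schedules are assumed to be aligned to the same start time, and `recovery_point_ages` is a hypothetical helper.

```python
from datetime import timedelta

# (interval, retention) pairs mirroring the three schedules in the example.
# Assumptions: 4 months = 120 days and 2 years = 104 weeks.
SCHEDULES = [
    (timedelta(minutes=30), timedelta(days=10)),
    (timedelta(days=1), timedelta(days=120)),
    (timedelta(weeks=1), timedelta(weeks=104)),
]


def recovery_point_ages(interval, retention):
    """Ages (time before 'now') of the points one schedule retains."""
    count = retention // interval  # timedelta // timedelta yields an int
    return {interval * k for k in range(count)}


per_schedule = [recovery_point_ages(i, r) for i, r in SCHEDULES]

# Counting every schedule separately: 480 + 120 + 104.
raw_total = sum(len(ages) for ages in per_schedule)

# Points that fall at the same time are stored once, so take the set union.
unique_total = len(set().union(*per_schedule))

print(raw_total, unique_total)  # prints: 704 676
```

With aligned schedules, 28 of the 704 scheduled points coincide with a point from another schedule, leaving 676 distinct recovery points. If the schedules were offset from each other, fewer points would coincide and the total would sit closer to 704.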
The flexible Protection Group policy definitions supported in VMware Cloud Disaster Recovery can easily provide the low RPO and RTO levels needed, along with the deeper retention levels required to support today’s most critical disaster recovery situations.