vSAN Erasure Coding – RAID 5 and RAID 6

Confused when you first hear the term "erasure coding"? Let's clarify it. Erasure coding is a general term that refers to *any* scheme of encoding and partitioning data into fragments in a way that allows you to recover the original data even if some fragments are missing. Any such scheme is referred to as an "erasure code", as the VMware blog explains.

RAID-5 and RAID-6 were introduced in vSAN to reduce the capacity overhead of configuring virtual machines to tolerate failures. This feature is also termed "erasure coding". RAID 5 or RAID 6 erasure coding is a policy attribute that you can apply to virtual machine components. It is available only on all-flash vSAN clusters and cannot be used in hybrid configurations.

RAID-5/RAID-6 on vSAN

Configuring RAID-5 or RAID-6 on vSAN has specific requirements on the number of hosts in the vSAN cluster: a minimum of 4 hosts for RAID-5 and a minimum of 6 for RAID-6. Data blocks are placed across the storage on each host along with parity. There is no dedicated disk allocated for storing the parity; vSAN uses distributed parity. RAID-5 and RAID-6 are fully supported with the deduplication and compression mechanisms in vSAN.

RAID-5 – 3+1 configuration: 3 data fragments and 1 parity fragment per stripe.

RAID-6 – 4+2 configuration: 4 data fragments, 1 parity fragment, and 1 additional syndrome per stripe.
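For intuition on how a stripe with distributed parity can survive a missing fragment, here is a minimal Python sketch using simple XOR parity. It only illustrates the 3+1 idea; it is not vSAN's actual on-disk encoding (RAID-6 additionally needs a second, independent syndrome).

```python
from functools import reduce

def xor_parity(fragments):
    """Compute an XOR parity fragment over equal-length data fragments."""
    return bytes(reduce(lambda a, b: a ^ b, chunk) for chunk in zip(*fragments))

def rebuild(surviving_fragments, parity):
    """Recover the single missing data fragment from the survivors and parity."""
    return xor_parity(surviving_fragments + [parity])

# 3+1 example: three data fragments and one parity fragment per stripe.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Simulate losing the second fragment (e.g. a failed capacity device).
recovered = rebuild([data[0], data[2]], parity)
assert recovered == data[1]
print("recovered fragment:", recovered)
```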

To learn more about RAID levels, check STANDARD RAID LEVELS.

You can use RAID 5 or RAID 6 erasure coding to protect against data loss and increase storage efficiency. Erasure coding can provide the same level of data protection as mirroring (RAID 1), while using less storage capacity.

RAID 5 or RAID 6 erasure coding enables vSAN to tolerate the failure of up to two capacity devices in the datastore. You can configure RAID 5 on all-flash clusters with four or more fault domains. You can configure RAID 5 or RAID 6 on all-flash clusters with six or more fault domains.

RAID 5 or RAID 6 erasure coding requires less additional capacity to protect your data than RAID 1 mirroring. For example, a VM protected by a Primary level of failures to tolerate value of 1 with RAID 1 requires twice the virtual disk size, but with RAID 5 it requires 1.33 times the virtual disk size. The following table shows a general comparison between RAID 1 and RAID 5 or RAID 6.

| RAID Configuration | Primary Level of Failures to Tolerate | Data Size | Capacity Required |
|---|---|---|---|
| RAID 1 (mirroring) | 1 | 100 GB | 200 GB |
| RAID 5 or RAID 6 (erasure coding) with four fault domains | 1 | 100 GB | 133 GB |
| RAID 1 (mirroring) | 2 | 100 GB | 300 GB |
| RAID 5 or RAID 6 (erasure coding) with six fault domains | 2 | 100 GB | 150 GB |
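The multipliers in the table are easy to reproduce. The following Python sketch (an illustration of the arithmetic, not a sizing tool) prints the capacity required for a 100 GB object under each configuration.

```python
def required_capacity_gb(data_gb, ftt, method):
    """Return the raw capacity needed to store data_gb of VM data.

    method: "RAID1" (mirroring) or "RAID5/6" (erasure coding).
    Multipliers follow the table above: RAID-1 stores ftt+1 full copies,
    RAID-5 (3+1) adds 1/3 overhead, RAID-6 (4+2) adds 1/2 overhead.
    """
    if method == "RAID1":
        return data_gb * (ftt + 1)
    if method == "RAID5/6":
        if ftt == 1:          # RAID-5: 3 data + 1 parity
            return data_gb * 4 / 3
        if ftt == 2:          # RAID-6: 4 data + 2 parity
            return data_gb * 3 / 2
    raise ValueError("unsupported combination")

for ftt in (1, 2):
    for method in ("RAID1", "RAID5/6"):
        print(ftt, method, round(required_capacity_gb(100, ftt, method)), "GB")
```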

RAID-5/6 (Erasure Coding) is configured as a storage policy rule and can be applied to individual virtual disks or an entire virtual machine. Note that the failure tolerance method in the rule set must be set to RAID5/6 (Erasure Coding).

Additionally, in a vSAN stretched cluster, the failure tolerance method RAID-5/6 (Erasure Coding) – Capacity applies only to the Secondary level of failures to tolerate.
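Conceptually, the policy is just a named set of rules. The sketch below models such a rule set and the cluster checks described above in plain Python; the rule names ("VSAN.hostFailuresToTolerate", "VSAN.replicaPreference") are assumptions modeled on the capability names shown in the vSphere UI, not an exact SDK call.

```python
# Illustrative model of a vSAN storage policy rule set; the rule keys below
# are assumed names, not guaranteed API identifiers.
erasure_coding_policy = {
    "name": "R5-ErasureCoding-FTT1",
    "rules": {
        "VSAN.hostFailuresToTolerate": 1,
        "VSAN.replicaPreference": "RAID-5/6 (Erasure Coding) - Capacity",
    },
}

def validate_policy(policy, all_flash, fault_domains):
    """Reject policies the cluster cannot satisfy (illustrative checks only)."""
    ftt = policy["rules"]["VSAN.hostFailuresToTolerate"]
    erasure = "Erasure Coding" in policy["rules"]["VSAN.replicaPreference"]
    if erasure and not all_flash:
        raise ValueError("erasure coding requires an all-flash cluster")
    if erasure and fault_domains < (4 if ftt == 1 else 6):
        raise ValueError("not enough hosts/fault domains for RAID-5/6")
    return True

print(validate_policy(erasure_coding_policy, all_flash=True, fault_domains=4))
```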

RAID 5 or RAID 6 Design Considerations

  • RAID 5 or RAID 6 erasure coding is available only on all-flash disk groups.
  • On-disk format version 3.0 or later is required to support RAID 5 or RAID 6.
  • You must have a valid license to enable RAID 5/6 on a cluster.
  • You can achieve additional space savings by enabling deduplication and compression on the vSAN cluster.

RAID-1 (Mirroring) vs RAID-5/6 (Erasure Coding)

RAID-1 (Mirroring) in Virtual SAN employs a 2n+1 host or fault domain algorithm, where n is the number of failures to tolerate. RAID-5/6 (Erasure Coding) in Virtual SAN employs a 3+1 or 4+2 host or fault domain requirement, depending on whether 1 or 2 failures are to be tolerated, respectively. RAID-5/6 (Erasure Coding) does not support 3 failures to tolerate.
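The host and fault domain arithmetic above can be captured in a small helper, sketched below in Python (a counting rule, not a vSAN API). It also includes the optional extra host discussed later for in-place rebuilds.

```python
def min_fault_domains(ftt, method, spare_for_rebuild=False):
    """Minimum hosts/fault domains for a given failures-to-tolerate value.

    RAID-1 mirroring needs 2n+1; erasure coding needs 3+1 (RAID-5) for
    FTT=1 and 4+2 (RAID-6) for FTT=2, and does not support FTT=3.
    """
    if method == "RAID1":
        minimum = 2 * ftt + 1
    elif method == "RAID5/6":
        if ftt == 1:
            minimum = 4          # 3 data + 1 parity
        elif ftt == 2:
            minimum = 6          # 4 data + 2 parity
        else:
            raise ValueError("RAID-5/6 does not support FTT > 2")
    else:
        raise ValueError("unknown failure tolerance method")
    # One extra host allows an in-place rebuild after a host loss.
    return minimum + (1 if spare_for_rebuild else 0)

print(min_fault_domains(1, "RAID1"))                            # 3
print(min_fault_domains(1, "RAID5/6"))                          # 4
print(min_fault_domains(2, "RAID5/6", spare_for_rebuild=True))  # 7
```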

 

Erasure coding provides capacity savings over mirroring, but it requires additional overhead. As mentioned above, erasure coding is only supported in all-flash Virtual SAN configurations, where the effects on latency and IOPS are negligible due to the inherent performance of flash devices.

Overhead on Write & Rebuild Operations

The overhead of erasure coding in vSAN is not the same as that of RAID 5/6 in traditional disk arrays. When a new data block is written to vSAN, it is sliced up and distributed to each of the components along with additional parity information. Writing the data in a distributed manner along with the parity consumes more compute resources, and write latency also increases since whole objects are distributed across the hosts in the vSAN cluster.

Existing data blocks need to be read and the parity recalculated and rewritten with each new write, and a uniform distribution of data and parity is necessary for failure tolerance and the rebuild process. Writes are essentially a sequence of read and modify operations, along with recalculation and rewrite of parity. This write overhead occurs during normal operation and is also present during rebuild operations. As a result, erasure coding rebuild operations take longer and require more resources to complete than mirroring.
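To see where the extra I/O comes from, the following Python sketch walks through a read-modify-write of one fragment in an XOR-parity stripe: read the old data and parity, back the old data out of the parity, fold the new data in, and write both back. It illustrates the amplification only; vSAN's actual write path is not shown.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

def update_fragment(stripe, parity, index, new_data):
    """Read-modify-write of one fragment in an XOR-parity stripe.

    One logical write turns into two reads (old data, old parity) and
    two writes (new data, new parity): the write amplification that makes
    erasure-coded writes and rebuilds more expensive than mirroring.
    """
    old_data = stripe[index]                 # read old data fragment
    parity = xor_bytes(parity, old_data)     # back the old data out of parity
    parity = xor_bytes(parity, new_data)     # fold the new data into parity
    stripe[index] = new_data                 # write new data fragment
    return stripe, parity                    # write new parity fragment

stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])
stripe, parity = update_fragment(stripe, parity, 1, b"ZZZZ")
# The updated parity still rebuilds any single lost fragment.
assert xor_bytes(xor_bytes(stripe[0], stripe[2]), parity) == b"ZZZZ"
```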

RAID-5 & RAID-6 Conversion to/from RAID-1

To convert from a mirroring failure tolerance method, first check that the vSAN cluster meets the minimum host or fault domain requirement. The online conversion process adds additional overhead to the existing components when you apply the policy. It is always recommended to test the conversion on a few virtual machines or their objects before performing it in production; this will help you understand the impact of the process so you can plan accordingly.

Because RAID-5/6 (Erasure Coding) offers guaranteed capacity savings over RAID-1 (Mirroring), any workload will see a reduced data footprint. It is important to consider the impact of erasure coding versus mirroring, particularly on performance, and whether the space savings are worth the potential impact. You can also refer to the VMware recommendations below.

Recommendations

  • Applications that are particularly sensitive to higher latencies and/or a reduction in IOPS such as ERP systems and OLTP applications should be thoroughly tested prior to production implementation.
  • Generally, read performance will see less of an impact from erasure coding than writes. Virtual SAN will first try to fulfill a read request from the client cache, which resides in host memory. If the data is not available in the client cache, the capacity tier of Virtual SAN is queried. Reads that come from the Virtual SAN capacity tier will generate a slight amount of resource overhead as the data is recomposed.
  • Workloads such as backups, with many simultaneous reads, could see better read performance when erasure coding is used in conjunction with larger stripe count rule in place. This is due to additional read locations, combined with a larger overall combined read IOPS capability. Larger clusters with more hosts and more disk groups can also lessen the perceived overhead.
  • Ways to potentially mitigate the effects of the write overhead of erasure coding include increasing bandwidth between hosts, using faster capacity devices, and using larger or additional queue depths. Greater network throughput allows more data to be moved between hosts and removes the network as a bottleneck.
  • Faster capacity devices, capable of larger write IOPS performance, would reduce the amount of time to handle writes. Additional queue depth space through the use of controllers with larger queue depths, or using multiple controllers, would reduce the likelihood of contention within a host during these operations.
  • It is also important to consider that a cluster containing only the minimum number of hosts will not allow for in-place rebuilds during the loss of a host. To support in-place rebuilds, an additional host should be added to the minimum number of hosts.
  • It is a common practice in database workloads to mirror log disks and configure data disks for RAID 5. Because erasure coding is a storage policy, it can be applied independently to different virtual machine objects, providing simplicity and flexibility when configuring database workloads.