By Alasdair Lumsden on 7 Nov 2013
Part 3: RAID Hardware & Software (including ZFS)
Welcome to Part 3 of our series of blog posts on RAID – Redundant Array of Inexpensive Disks. In this post, we’ll be talking about hardware RAID cards and Host Bus Adapters, and finally software RAID, including ZFS.
Drive Connectors / Protocols
There are two interfaces/protocols for connecting drives: SATA (Serial ATA) and SAS (Serial Attached SCSI). SATA is the cheaper, more consumer-grade standard, whilst SAS is the enterprise standard. The connectors are almost identical, and SAS controllers can also drive SATA disks, but the reverse is not true – a SATA controller card cannot accept a SAS drive (and the connector is keyed to enforce this).
Cheaper RAID cards will utilise the host’s CPU to compute parity via the OS driver, whilst the more expensive ones will do this themselves in hardware. More expensive RAID cards may have on-board cache memory, along with a battery to preserve the contents of the cache in the event of a power failure.
Battery backed write caches can significantly improve write performance and are of great benefit to busy database servers, especially with synchronous writes. Traditionally a synchronous write can’t return until the data is safely on disk, and since disks are slow, synchronous writes can be very slow. With a battery backed cache, these writes go straight to the RAID card’s RAM and are acknowledged immediately, then flushed to disk later.
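To make “synchronous write” concrete, here is a minimal Python sketch, assuming a POSIX-like system. The function name durable_append and the file name commit.log are hypothetical, chosen only for illustration. The os.fsync call is the part that waits for the storage stack to confirm the data is safe. On a plain spinning disk that wait is dominated by the disk itself, whereas a controller with a battery backed cache can confirm as soon as the data is in its RAM.

```python
import os
import tempfile
import time


def durable_append(path, payload):
    """Append payload to path and block until the OS reports it is on stable storage.

    Returns the time spent on the whole operation, in seconds.
    (Hypothetical helper for illustration; small payloads only, since
    os.write may write fewer bytes than requested for large buffers.)
    """
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        # fsync does not return until the device (or its cache) has
        # acknowledged the write. This is the "synchronous" part.
        os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "commit.log")
        waits = [durable_append(log, b"COMMIT\n") for _ in range(100)]
        mean_ms = sum(waits) / len(waits) * 1000
        print(f"mean time per synchronous write: {mean_ms:.3f} ms")
```

Running this on different storage (a single hard disk, an SSD, an array behind a battery backed cache) gives a rough feel for how much of a database commit is simply waiting for the write to be acknowledged.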
Different RAID cards support different RAID levels, with the cheaper cards supporting RAID 0, 1 and 10. To get RAID 5, 6, 50 and 60 typically requires high end cards.
If you wish to use software RAID and have more hard drives than you have SATA ports on your motherboard (or if you wish to use SAS drives), you will need a PCIe Host Bus Adapter (HBA). An HBA is like a RAID card (and many of them can perform basic RAID functions such as RAID 0, 1 and 10), but is really designed just to pass the disks through to the operating system. These are typically cheaper.
Many RAID cards make passing disks straight through to the operating system without RAID quite difficult, for example requiring you to make each disk drive a single-disk RAID 0 array. This can be quite tedious with drive failures, where you may need to run a tool or reboot into the controller’s BIOS to handle the replacement. So when doing software RAID, opting for an HBA is strongly recommended.
Hardware RAID Card Manufacturers
There are four main manufacturers of hardware RAID cards: Adaptec, HighPoint, Areca and LSI.
Your choice of RAID card will come down to looking at the features you require. One very important consideration is driver support – some of these cards are better supported than others in UNIX/Linux operating systems.
I would say that LSI cards have the best driver support across all the operating systems, especially on Solaris derived OSes. LSI make excellent Host Bus Adapters (HBAs) whilst the other manufacturers seem to focus on hardware RAID. LSI also provide frequent firmware updates and a range of utilities for updating the firmware across different operating systems. We use LSI HBA cards extensively at EveryCity, with the LSI 9207-8i PCIe card being the one we use most.
If you’re building for the home, a good tip is to remember that SAS cards support SATA, and may cost less than a SATA RAID card. If you’re on a budget, there is a range of cards available on eBay. One thing to keep in mind, however, is that the older 3Gbps SAS cards only support drives up to 2TB, and nothing larger. If your drives are 2TB or smaller, the LSI 3081E-R cards can be had for next to nothing.
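A 2TB ceiling of this kind is commonly attributed to 32-bit sector addressing combined with 512-byte sectors. Assuming that is the cause here (the post itself doesn’t say), the arithmetic can be sketched as follows. The function and constant names are illustrative only.

```python
# Assumption: the controller addresses sectors with a 32-bit LBA
# and the drives present 512-byte sectors.
LBA_BITS = 32
SECTOR_BYTES = 512


def max_addressable_bytes(lba_bits=LBA_BITS, sector_bytes=SECTOR_BYTES):
    """Largest capacity reachable with the given address width and sector size."""
    return (2 ** lba_bits) * sector_bytes


if __name__ == "__main__":
    limit = max_addressable_bytes()
    print(f"{limit} bytes")
    print(f"{limit / 2**40:.1f} TiB, or about {limit / 1e12:.1f} TB as printed on drive labels")
```

Under those assumptions the limit is exactly 2 TiB, which is why drives sold as 3TB or larger can’t be fully addressed by such a card.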
Although software RAID can’t match, like for like, the performance of an expensive hardware RAID card with a battery backed write cache, there are many reasons why you might want to consider it.
Other than cost, one main reason that springs to mind is safety. Each RAID card uses its own proprietary on-disk format. If you need to transport your drives from one server to another, and that other server uses a different RAID card, you may not be able to read the drives. Moving drives between controllers can also be error prone. The controller card software/BIOS typically has few options for recovering data in the event of an incident.
With software RAID, you have no such problem. There is no RAID firmware interfering with your data, so if you have a nasty failure you have a better chance of recovering data. Plus of course moving drives between machines becomes trivial.
RAID with ZFS
Another core reason is if you’re using an advanced modern filesystem such as ZFS. If you’ve not encountered ZFS before I’d recommend giving the Wikipedia article on ZFS a read.
One of ZFS’s main features is preventing data corruption. With a traditional hardware RAID controller, if data becomes corrupt on disk (due to cosmic radiation, manufacturing faults, or with SSDs due to cell damage), the controller has no clue, and just passes the bad data back to you. Most filesystems don’t checksum data either, so you’ll end up with corrupt filesystems and no idea why.
ZFS is different. It stores a checksum for every block on disk. If ZFS reads a block and the checksum doesn’t match, it knows the block is corrupt. If the data is on a traditional hardware RAID controller, there’s nothing ZFS can do to fix it – all it can do is let you know your data is bad. But if ZFS is given control over the RAID as well, then not only can ZFS detect the corruption, it can repair it, by reconstructing the block from a mirror copy or from the RAID parity and writing the good data back out over the bad.
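The detect-then-repair idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how ZFS lays data out on disk. It uses a single XOR parity block per stripe and keeps SHA-256 checksums alongside the stripe, and the names write_stripe and read_block are hypothetical.

```python
import hashlib
from functools import reduce


def checksum(block):
    """Checksum stored for each block (SHA-256 chosen for the sketch)."""
    return hashlib.sha256(block).digest()


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks):
    """Store data blocks plus one parity block and a checksum per data block."""
    return {
        "data": list(data_blocks),
        "parity": xor_blocks(data_blocks),
        "sums": [checksum(b) for b in data_blocks],
    }


def read_block(stripe, index):
    """Return one block, repairing it from parity if its checksum fails."""
    block = stripe["data"][index]
    if checksum(block) == stripe["sums"][index]:
        return block
    # Checksum mismatch: rebuild the block from the other data blocks
    # and the parity block, then verify the rebuilt copy as well.
    others = [b for i, b in enumerate(stripe["data"]) if i != index]
    rebuilt = xor_blocks(others + [stripe["parity"]])
    if checksum(rebuilt) != stripe["sums"][index]:
        raise IOError("block is corrupt and could not be reconstructed")
    stripe["data"][index] = rebuilt  # write the good data back over the bad
    return rebuilt


if __name__ == "__main__":
    stripe = write_stripe([b"AAAA", b"BBBB", b"CCCC"])
    stripe["data"][1] = b"BxBB"  # simulate silent corruption on one disk
    print(read_block(stripe, 1))
```

The key point is that the checksum tells the reader which copy is wrong. A hardware controller with parity but no checksum can notice that a stripe is inconsistent, but can’t tell which block to trust.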
Further, with traditional hardware RAID, if you replace a disk, the RAID controller has no clue what’s empty space and what’s data, so it has to blindly rebuild the whole disk. ZFS knows exactly what’s data and what isn’t, so it only has to copy across the data and not the free space, saving time.
With disks becoming increasingly huge, and systems scaling to hundreds of disks, these kinds of features are essential. Indeed, ZFS’s checksums and parity have saved us from data corruption on countless occasions – at cloud scale, hardware fails. With ZFS, we’ve never had to tell a customer their data is toast.
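A back-of-the-envelope calculation shows why this matters. The figures below (a 4TB disk that is 30% full, rebuilt at a sustained 100MB/s) are purely illustrative assumptions, not measurements, and a real rebuild is rarely limited by sequential throughput alone.

```python
def rebuild_hours(bytes_to_copy, bytes_per_second):
    """Time to copy the given amount of data at a steady rate, in hours."""
    return bytes_to_copy / bytes_per_second / 3600


if __name__ == "__main__":
    # Illustrative assumptions only.
    disk_bytes = 4e12          # 4TB replacement disk
    fraction_used = 0.30       # pool is 30% full
    throughput = 100e6         # 100MB/s sustained

    whole_disk = rebuild_hours(disk_bytes, throughput)
    data_only = rebuild_hours(disk_bytes * fraction_used, throughput)
    print(f"whole-disk rebuild: {whole_disk:.1f} hours")
    print(f"data-only rebuild:  {data_only:.1f} hours")
```

With these assumed numbers the whole-disk rebuild takes about 11 hours and the data-only rebuild a little over 3, and the gap widens as disks get bigger or pools get emptier.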
Hybrid Storage Pools with ZFS
ZFS also supports a unique concept known as a Hybrid Storage Pool. With traditional RAID, you’ll use either all spinning disks, or all SSDs.
With ZFS, you can augment your spinning disk storage array by adding SSDs to boost performance. ZFS supports this in two ways – via SSD read caches, and SSD write caches.
The ZFS read cache is known as the L2ARC. The ARC is the adaptive replacement cache, an incredibly efficient algorithm that balances most frequently used with most recently used to provide extremely high hit rates. ZFS stores a primary ARC in memory, but data that falls out of the ARC gets stored on the L2ARC SSDs. If that data is needed again, instead of reading it from spinning disks, you can grab it straight from the L2ARC.
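The read path described above can be sketched as a two-tier cache. This is a deliberately simplified model. A plain least-recently-used policy stands in for the real ARC algorithm, blocks move to the SSD tier at the moment they are evicted from RAM, and the class name TwoTierCache and its parameters are hypothetical.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy read cache: a small RAM tier backed by a larger SSD tier."""

    def __init__(self, backing_store, ram_slots, ssd_slots):
        self.backing_store = backing_store   # stands in for the spinning disks
        self.ram = OrderedDict()             # stands in for the ARC
        self.ssd = OrderedDict()             # stands in for the L2ARC
        self.ram_slots = ram_slots
        self.ssd_slots = ssd_slots
        self.hits = {"ram": 0, "ssd": 0, "disk": 0}

    @staticmethod
    def _insert(tier, slots, key, value):
        """Add an entry as most recent; return the evicted (key, value) or None."""
        tier[key] = value
        tier.move_to_end(key)
        if len(tier) > slots:
            return tier.popitem(last=False)  # drop the least recently used
        return None

    def read(self, key):
        if key in self.ram:
            self.hits["ram"] += 1
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.ssd:
            self.hits["ssd"] += 1
            value = self.ssd.pop(key)        # promote back into RAM
        else:
            self.hits["disk"] += 1
            value = self.backing_store[key]  # slow path: go to disk
        evicted = self._insert(self.ram, self.ram_slots, key, value)
        if evicted is not None:
            # Data falling out of RAM lands on the SSD tier instead of vanishing.
            self._insert(self.ssd, self.ssd_slots, *evicted)
        return value


if __name__ == "__main__":
    disk = {n: f"block-{n}" for n in range(5)}
    cache = TwoTierCache(disk, ram_slots=2, ssd_slots=2)
    for n in (0, 1, 2, 0, 0):
        cache.read(n)
    print(cache.hits)
```

Even in this toy version, a block that has been pushed out of RAM is served from the SSD tier on its next read rather than from the spinning disks.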
The write cache is known as the ZIL, or ZFS Intent Log. The ZIL absorbs synchronous writes, and by putting it on a dedicated low latency SSD (a separate log device, often called a SLOG) which supports high numbers of IOPS, you can dramatically improve the synchronous write performance of your array. This is great for high performance transactional databases.
ZFS has many other incredible features, such as snapshots, clones, remote replication and in-line compression (which can improve performance, since less data has to be read from and written to disk). We use ZFS extensively at EveryCity and all our Cloud VMs are stored on it. If you’ve not encountered ZFS before it’s definitely worth investigating – in our experience nothing comes close to its performance, safety, scalability and feature set.
In our next post, we’ll go through the results of our extensive RAID Performance Benchmarks. Check it out!