A few points to consider when making technology choices for your infrastructure.
keywords: computing for finance, ubuntu, RAID, windows, ZFS, RAID1, RAID0, RAID10, RAID5, RAIDZ
1. There is a 100% chance your RAID volume will fail
This post is allegedly about surprising facts. But this fact should not surprise you. Still, some folks invest in RAID with the expectation that it removes the burden of backing up their files. Don’t make that mistake.
Instead, you should think about RAID as the first tier in your data protection strategy. When you consider which RAID configuration you might use, think about the coming day when the entire volume will be corrupted. I’m not talking about just one disk. I mean the whole shebang.
What’s your data recovery plan? How often are you willing to put up with the hassle of restoring your data? Some RAID implementations are aimed at making failure an infrequent event. Other configurations significantly increase the chance of failure.
Make sure you have a backup plan, then consider your RAID options in that context.
Fact 1 may not have been so surprising. Here are a few more that might actually surprise you:
2. Software RAID is almost always a better choice than hardware RAID
Software RAID has advanced significantly in the last few years (as of 2012). Hardware RAID still has the three key vulnerabilities it has always had: First, it is expensive. Second, if your RAID card fails, your RAID volume fails; it is a single point of failure. Third, if your RAID card fails, you must find an exact replacement for that card to recover your data.
On the other hand, software RAID costs nothing, and if your controller card or motherboard fails, you can just move your disks to another machine and set up the appropriate software to read them.
Yes, hardware RAID can be faster than software RAID, but that gap is closing, and the flexibility and reliability offered by software RAID outweigh that single advantage. The only case where hardware RAID is the right choice is when absolute speed is the only priority, and you’re willing to take risks with your data.
There are some articles on the web that compare hardware versus software RAID. They are good reading, but in some cases the information they contain is old. Be sure to base your decisions on the current state of the art. As of 2012 there are a number of new capabilities offered by software RAID that make it worth considering:
- Hot swapping works with software RAID. SATA 3G and SATA 6G made that possible. If a disk goes bad, swap it out, no down time.
- Software RAID consumes only a small slice of CPU cycles. In my tests with mdadm on Ubuntu I saw only 2% to 4% of one CPU dedicated to RAID. On a multi-core machine this is nothing.
- Software RAID works with SSD caching. The most used data migrates to a very fast cache.
- Software RAID supports variable size volumes that can be extended by adding more disks (specifically ZFS supports this, maybe others do as well).
3. Some “RAID cards” aren’t hardware RAID
Over the last few years SATA disk controller cards and motherboards have come out that claim to offer hardware RAID. They are really just disk controllers with BIOS that implements RAID in software.
How can you detect these cards and motherboards? Usually price is the giveaway. A $20.00 card is not likely to implement true hardware RAID. Also, these cards usually offer Windows-only support. Here’s a good writeup.
4. On-disk data compression can make your RAID volume faster
That seems counter-intuitive because it takes computing power and time to compress data. Here’s why it can make your disk performance faster: The bottleneck in disk IO is bandwidth to the disk (your SATA pipe). If the data is compressed before writing, there’s less of it to write, so it moves more quickly to the disk.
ZFS offers compressed volumes. I’m sure other software RAID implementations offer it as well. Here’s some discussion of this topic.
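To make that reasoning concrete, here is a rough back-of-the-envelope sketch in Python. It is only a model, not a benchmark: the function name `effective_write_mb_s` and all the numbers are made up for illustration, and it assumes the only two limits are the bandwidth to the disk and the speed at which the CPU can compress.

```python
def effective_write_mb_s(link_mb_s, compression_ratio, compressor_mb_s):
    """Estimate logical write throughput (MB/s) for a compressed volume.

    link_mb_s:         raw bandwidth of the path to the disk
    compression_ratio: uncompressed size / compressed size (2.0 halves the data)
    compressor_mb_s:   how fast the CPU can compress uncompressed data
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    # Only compressed bytes cross the link, so each MB on the wire
    # carries compression_ratio MB of the original data.
    link_limited = link_mb_s * compression_ratio
    # The CPU still has to compress every byte, so it caps the result.
    return min(link_limited, compressor_mb_s)


if __name__ == "__main__":
    # All figures below are invented illustration values, not measurements.
    print(effective_write_mb_s(100, 1.0, 500))  # no compression
    print(effective_write_mb_s(100, 2.0, 500))  # 2:1 compression, fast CPU
    print(effective_write_mb_s(100, 2.0, 150))  # 2:1 compression, slow CPU
```

The third case shows the catch: compression only helps while the CPU can compress faster than the disk link can accept data. Data that is already compressed (ratio near 1.0) gains nothing.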
5. Maybe you don’t need RAID at all
Consider an SSD instead. Yes, SSDs are expensive, but a single SSD is less expensive than the multiple disks you’d need to build a comparably fast RAID volume. For example, as of this writing (May 2012) an Intel 250GB SSD prices in at $350 and it’s faster than many RAID configurations built with spinning disks. See one of my other posts for details on SSD speeds compared with RAID on SATA 3G.
SSDs can also be used as cache for a RAID volume. For our next server I’m contemplating a two disk mirror for reliability, augmented with an SSD cache for speed. This can be done easily with ZFS.
And yes, for ultra crazy speed, you can build a RAID volume out of SSDs.
6. The hottest new RAID tech comes from Oracle, and it is open source!
Nick Black of sprezzatech.com pointed me towards ZFS. I’ve looked at it deeply and decided it’s the way to go for our data. ZFS’s designers prioritized reliability and scalability, and it beats nearly all existing filesystems and RAID implementations on those points. ZFS was built by Sun Microsystems for their Solaris OS. They released it under an open source license and it is now available for Windows, Mac OS and Linux.
Oracle acquired ZFS through their acquisition of Sun. ZFS’s features are touted by Oracle for their hardware solutions. I’ll bet Oracle hates that Sun open sourced ZFS, but that’s a story for another blog.
Stay tuned for a blog from me on ZFS. For now, some rules of thumb for choosing RAID levels:
7. If speed is the only priority, choose RAID0
In RAID0 the data is split or “striped” across the multiple disks and written (or read) in parallel. With N disks, speed up is up to N-times for reading and N-times for writing. Here’s the downside though: Total failure of your RAID volume is roughly N-times more likely, because losing any one disk loses the whole volume. You should assume that RAID volume failure is an absolute certainty.
Choose RAID0 only if you can easily rebuild your RAID volume. Make sure you have a strong backup workflow.
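If you want to put a number on that risk, here is a minimal sketch in Python. It assumes each disk fails independently with the same probability over whatever period you care about, which is a simplification (disks from the same batch often fail together). The function name and the 5% figure are mine, picked for illustration.

```python
def raid0_failure_probability(p_disk, n_disks):
    """Chance that a RAID0 stripe of n_disks is lost.

    p_disk is the (assumed independent) probability that one disk fails
    during the period of interest. The stripe dies if ANY disk dies.
    """
    if not 0.0 <= p_disk <= 1.0:
        raise ValueError("p_disk must be between 0 and 1")
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    # Probability that every disk survives, subtracted from 1.
    return 1.0 - (1.0 - p_disk) ** n_disks


if __name__ == "__main__":
    p = 0.05  # hypothetical per-disk failure probability
    for n in (1, 2, 4, 8):
        print(n, "disks:", round(raid0_failure_probability(p, n), 4))
```

For small per-disk probabilities the result is close to N times the single-disk figure, which is where the “N-times more likely” rule of thumb comes from.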
8. If reliability is the only priority, choose RAID1
RAID1 is called “mirroring.” The data is fully duplicated on two (or more) disks. If one disk fails everything is OK; the RAID volume will continue operating, and it can be rebuilt when you replace the defective disk. RAID1 also offers up to N-times speed up for reads, but no speed up for writes.
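For a feel of how much mirroring helps, here is a small Python sketch. It assumes disks fail independently and it ignores the rebuild window (the time between one disk failing and its replacement being resynced), so treat the output as optimistic. The function name and the 5% figure are illustrative assumptions.

```python
def raid1_failure_probability(p_disk, n_disks):
    """Chance that a RAID1 mirror of n_disks loses its data.

    The mirror is lost only if EVERY copy fails, assuming each disk
    fails independently with probability p_disk.
    """
    if not 0.0 <= p_disk <= 1.0:
        raise ValueError("p_disk must be between 0 and 1")
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    return p_disk ** n_disks


if __name__ == "__main__":
    p = 0.05  # hypothetical per-disk failure probability
    for n in (1, 2, 3):
        print(n, "disks:", raid1_failure_probability(p, n))
```

Even under these generous assumptions the probability never reaches zero, which is why the backup plan from fact 1 still applies.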
9. In nearly all other cases, RAID10 is the way to go
In RAID10, pairs of disks are mirrored to create reliable volumes (RAID1), then those reliable volumes are combined via RAID0 for speed. Four disks combined in this way offer 2 times speedup over a single disk for reads and writes, yet they always survive the loss of one disk, and they survive the loss of two disks as long as the two are not in the same mirrored pair. You get both speed and reliability.
Many folks would consider RAID5 for these applications. I think it’s a bad choice nowadays in comparison to RAID10 because RAID5 is subject to very slow write speeds; sometimes slower than writing to a single disk. See my tests here. Also RAID5 can only survive loss of one disk. In a 4 disk setup RAID10 can sustain two disk losses, provided they fall in different mirrored pairs.
The main advantage of RAID5 is that it offers more total storage than RAID10. But the speed and reliability of RAID10 more than offset that advantage.
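Here is a short Python sketch that compares the two layouts on a 4 disk setup by brute force: it tries every possible combination of failed disks and counts how many each RAID level survives. The disk numbering and the `MIRROR_PAIRS` layout are assumptions I made for the example; a real array may pair its disks differently.

```python
from itertools import combinations

# Hypothetical layout: disks 0+1 form one mirror, disks 2+3 the other.
MIRROR_PAIRS = [(0, 1), (2, 3)]


def raid10_survives(failed, mirror_pairs=MIRROR_PAIRS):
    """RAID10 survives unless both disks of some mirrored pair have failed."""
    failed = set(failed)
    return all(not failed.issuperset(pair) for pair in mirror_pairs)


def raid5_survives(failed):
    """RAID5 has single parity, so it survives at most one failed disk."""
    return len(set(failed)) <= 1


def usable_disks(level, n_disks):
    """Number of disks' worth of usable space for equal-sized disks."""
    if level == "raid10":
        if n_disks < 4 or n_disks % 2:
            raise ValueError("this sketch expects an even number of disks, 4 or more")
        return n_disks // 2  # half the disks hold mirror copies
    if level == "raid5":
        if n_disks < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        return n_disks - 1  # one disk's worth of space goes to parity
    raise ValueError("unknown level: " + level)


def count_survivable(n_disks, n_failed, survives):
    """Count the failure combinations of size n_failed that the array survives."""
    return sum(
        1 for failed in combinations(range(n_disks), n_failed) if survives(failed)
    )


if __name__ == "__main__":
    for name, check in (("raid10", raid10_survives), ("raid5", raid5_survives)):
        print(
            name,
            "usable disks:", usable_disks(name, 4),
            "| survivable 2-disk failures:", count_survivable(4, 2, check), "of 6",
        )
```

With this layout RAID10 survives 4 of the 6 possible two-disk failures and RAID5 survives none, while RAID5 gives you three disks of usable space against two for RAID10.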
What about the other RAID levels? The three I mention above cover 99% of RAID use cases. The situations in which other RAID levels are useful are limited. If you want to dive in though, here’s a good starting point.
10. And the tenth surprising fact about RAID: There are only 9 surprising facts about RAID!
Thanks for reading :-)