
May 4th, 2007

Comprehensive RAID performance report

Posted by George Ou @ 5:30 am

Why RAID10 (or 0+1) superiority is a myth

As much as we love raw sequential throughput, it’s almost worthless for most database applications and mail servers. Those applications have to deal with hundreds or even thousands of tiny requests per second that are accessed at random on the hard drives. The drives spend more than 95% of the time seeking data rather than actually reading or writing it, which means that even infinite sequential throughput would solve only 5% of the performance problem at best. The kind of storage performance that matters most for these applications is I/O (Input/Output) transactions per second, and the workload heavily favors reads over writes at a typical ratio of 80:20.
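The 5% ceiling is an Amdahl's-law style bound: speeding up only the transfer portion of each request cannot save more time than that portion takes. Here is a minimal Python sketch of the arithmetic, taking the 95% seek share at face value; the function name `service_time_remaining` is a hypothetical helper used only for illustration.

```python
def service_time_remaining(seek_fraction: float, transfer_speedup: float) -> float:
    """Fraction of the original per-request time left when only the
    transfer (read/write) portion is sped up by `transfer_speedup`."""
    transfer_fraction = 1.0 - seek_fraction
    return seek_fraction + transfer_fraction / transfer_speedup


# Seek share of 95% is the article's figure; the speedups are illustrative.
for speedup in (2, 10, float("inf")):
    remaining = service_time_remaining(0.95, speedup)
    saved = (1.0 - remaining) * 100
    print(f"{speedup}x sequential throughput -> {saved:.1f}% less time per request")
```

Even with an infinite transfer rate, the seek time remains, so the saving never exceeds the transfer share of the request.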

The widely accepted assumption in the storage world has been that RAID10 (or 0+1) is the undisputed king of the hill when it comes to I/O performance (setting aside RAID0 write I/O performance, because RAID0 offers no redundancy), and anyone questioning that assumption is considered almost a heretic within many IT circles. This is all based on the assumption that applications are incapable of using more than one storage volume at a time or that it shouldn’t be done.

In my last career before I became an IT blogger last year with ZDNet and TechRepublic, I was an IT consultant, and storage engineering was part of my job.  I worked with a Fortune 100 company that used SAP with Microsoft SQL Server 2000 on the backend. The SQL transaction times were getting so slow that they even considered building a whole new database server. I looked at the performance data and saw that the CPU never went above 10% utilization, and memory was nowhere near capacity. The choke point was the storage subsystem, which is almost always the culprit in database applications.

The storage subsystem was a high-performance SAN using a 20-drive RAID10 array comprising 10K RPM native Fibre Channel (FC) hard drives on a 1-gigabit FC SAN. The knee-jerk assumption was that an upgrade to 2-gigabit would solve the performance problem (the storage industry now pushes 4-gigabit FC SAN because that’s the easy number to market), but I offered an unconventional solution. I knew that even during peak loads during the day, the raw throughput on the database server never exceeded 200 Mbps, let alone one gigabit. The problem was the use of RAID10. I suggested using independent sets of RAID1, which was hard for the team to swallow, and it took some time for me to convince them to try it. It went against all conventional wisdom, but I was lucky to have a boss who trusted me enough to try it, and it was my neck on the line.

I replaced the massive 20-drive 10K RAID10 array with 8 RAID1 pairs built from 16 15K RPM drives. The new 15K RPM drives had roughly a 10% I/O performance advantage over the 10K RPM drives they were replacing, but there were 20% fewer of the newer drives, which meant that drive speed was more or less a wash. The biggest difference would be the RAID configuration. Microsoft SQL Server fortunately permits seamless distribution of its data tables among any number of volumes you can throw at it. The use of extent-level striping in MS SQL Server allows even distribution of data and workload across any number of volumes, and I used that simple mechanism to distribute the database over the 8 pairs of RAID1.
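The "more or less a wash" estimate can be checked with a quick back-of-the-envelope calculation. This is a sketch in Python that assumes raw I/O capacity scales linearly with drive count and per-drive speed, which is a simplification; the helper name `relative_aggregate_io` is hypothetical.

```python
OLD_DRIVES = 20        # 10K RPM drives in the original RAID10 array
NEW_DRIVES = 16        # 15K RPM drives in the eight RAID1 pairs
PER_DRIVE_GAIN = 1.10  # 15K vs. 10K per-drive I/O rate, per the article


def relative_aggregate_io(drives: int, per_drive_rate: float,
                          baseline_drives: int, baseline_rate: float = 1.0) -> float:
    """Raw aggregate I/O capacity relative to the baseline array,
    assuming capacity is simply drives * per-drive rate."""
    return (drives * per_drive_rate) / (baseline_drives * baseline_rate)


ratio = relative_aggregate_io(NEW_DRIVES, PER_DRIVE_GAIN, OLD_DRIVES)
print(f"New array raw I/O capacity relative to old: {ratio:.2f}")
```

Under these figures the new hardware had, if anything, slightly less raw I/O capacity than the old array, so the hardware swap alone would not be expected to produce a large gain.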

[Update 7/28/2007 - Microsoft has corrected me that it's extent-level striping instead of row-level striping. An extent is a 64KB block of data (eight 8KB pages) and is the basic unit in which SQL Server allocates space.]
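To illustrate why extent-level striping spreads a random workload evenly, here is a simplified Python simulation. It assumes plain round-robin placement of 64KB extents across eight equal volumes. SQL Server's real allocator uses a proportional-fill strategy across the files in a filegroup, so this is only an approximation, and the names `volume_for_extent` and `simulate_random_reads` are hypothetical.

```python
import random
from collections import Counter

EXTENT_KB = 64  # extent size from the update above
VOLUMES = 8     # the eight RAID1 pairs


def volume_for_extent(extent_index: int, volumes: int = VOLUMES) -> int:
    """Simplified round-robin placement: extent i lives on volume i mod N."""
    return extent_index % volumes


def simulate_random_reads(table_kb: int, reads: int, seed: int = 0) -> Counter:
    """Issue uniformly random extent reads and count the hits per volume."""
    rng = random.Random(seed)
    extents = table_kb // EXTENT_KB
    hits = Counter()
    for _ in range(reads):
        hits[volume_for_extent(rng.randrange(extents))] += 1
    return hits


# Illustrative sizes: a 1 GB table and 80,000 random reads.
hits = simulate_random_reads(table_kb=1_048_576, reads=80_000)
for vol in range(VOLUMES):
    print(f"volume {vol}: {hits[vol]} reads")
```

Because consecutive extents land on different volumes, uniformly random requests end up spread almost equally, with no manual placement of tables required.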

As a result of this change in RAID configuration, the queue depth (the number of outstanding I/O transactions backed up due to storage congestion) dropped a whopping 97%! This in turn resulted in a massive reduction in SQL transaction times, from a painfully slow 600ms per transaction to 200ms per transaction. The result was so dramatic that it was hard for everyone to believe it during the first week. They had thought that perhaps there was some mistake or anomaly and that this might be due to a drop in usage, but doubts subsided as time went on and the performance kept up. Even the engineer from the storage vendor who initially doubted the effectiveness of this solution became a believer after he ran some tests to verify that the load was evenly distributed across the 8 RAID1 pairs.

But even with this success under my belt, it was always an uphill battle to convince other companies and their IT staff to consider using independent RAID1 volumes over RAID10. A big part of the problem was that Oracle lacked the ability to seamlessly split a table evenly over multiple volumes. It was still possible to divide up the location of the hundreds of tables that made up a typical database, but it required manual load measurements and manual distribution, which is something that many DBAs (database administrators) refused to do. It also required annual maintenance and a redistribution of tables because workloads change over time.  Without extent-level striping, it becomes a question of whether the DBA and IT department want to deal with the additional management overhead.

For something like a Microsoft Exchange Server, you’re required to have multiple mail stores anyway, so having multiple RAID1 volumes fits naturally into an Exchange environment. Too many Exchange administrators follow the RAID10 advice, and it results in a massive waste of performance.

The other major obstacle I had to overcome was the fact that most storage consultants believe in RAID10 (or even RAID5, which is horrible on write I/O performance) because this was conventional wisdom and they weren’t in the mood to listen to my heresy. So instead of trying to argue with them, I’ll just present the following quantitative analysis comparing the various types of RAID.

George Ou is Technical Director of ZDNet. See his full profile and disclosure of his industry affiliations.
