RAID Optimization for Database Operations

Presented March 2, 2000 at the Unix Users Association of Southern California, Los Angeles Chapter by Brian Mann

Please note: These are outline notes only, and are not intended to be a stand-alone tutorial or lesson. You had to be there. If enough interest is expressed, I will consider refining and expanding this into a complete tutorial.

What is RAID

Defined by Patterson, Gibson & Katz at UCB in 1987
"A Case for Redundant Arrays of Inexpensive Disks (RAID)"
Need driven by the expense and performance limits of SLED (Single Large Expensive Disk) and the failure rate of JBOD (Just a Bunch Of Disks)

Three main objectives

Originally 5 levels
RAID 1 - Mirroring
RAID 2 - Byte striping with ECC
RAID 3 - Byte striping with parity drive
RAID 4 - Block striping with parity drive
RAID 5 - Block striping with striped parity
Additional levels
RAID 0 - Block striping, no parity
RAID 0+1 (1+0, 10) - Hybrid of striping and mirroring
RAID 0+3 (53) - Striped RAID 3 segments

RAID 0+5 (5+0,50) - Striped RAID 5 segments

RAID 6 - Block striping with striped redundant parity
RAID 7 - Asynchronous I/O w/ embedded RTOS
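The parity used in levels 3 through 5 (and doubled in level 6) is a byte-wise XOR across the data drives, which is why the contents of any single failed drive can be rebuilt from the survivors plus parity. A minimal sketch of the idea (illustrative only, not tied to any particular controller):

```python
from functools import reduce

def parity(blocks):
    """XOR the corresponding bytes of each block to form the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, group) for group in zip(*blocks))

# Four data blocks, as on the data drives of a 5-drive RAID 5 array.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
p = parity(data)

# If the third drive fails, XORing the parity with the surviving
# blocks reproduces the lost block.
survivors = [data[0], data[1], data[3]]
rebuilt = parity(survivors + [p])
assert rebuilt == data[2]
```

The same property explains the write penalty: every small write must read old data and old parity, recompute, and write both back.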

Implementing RAID

Hardware vs. software
Internal vs. external
Choosing a level

Comparison of RAID Levels
RAID 0
  Advantages:    Fastest I/O; no overhead for parity; simple design, easily implemented
  Disadvantages: Not really RAID; one drive failure destroys all data; not for mission-critical deployment

RAID 1
  Advantages:    All drives usable for data reads; can be implemented w/ 2 drives
  Disadvantages: Greatest storage overhead (100%); highest cost/capacity ratio

RAID 3
  Advantages:    High transfer rates; degraded mode still fast
  Disadvantages: Requires spindle synchronization; can't do overlapped I/O

RAID 4
  Advantages:    High read transfer rates; efficient use of capacity
  Disadvantages: Poor write rates; parity drive can be a bottleneck

RAID 5
  Advantages:    Very high read rate; efficient use of capacity
  Disadvantages: Slower write rates; slow rebuild times

RAID 6
  Advantages:    Allows failure of multiple drives
  Disadvantages: Very poor write performance; proprietary solution, rare

RAID 7
  Advantages:    Supposed to be fastest
  Disadvantages: Proprietary, very expensive

RAID 1+0
  Advantages:    Very high reads and writes
  Disadvantages: Most expensive
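The storage-overhead figures above follow directly from how many drives hold redundancy rather than data. A quick sketch of usable drives out of an n-drive array (my own summary of the common cases, not from the talk):

```python
def usable_drives(level, n):
    """Data drives available out of n total, by RAID level."""
    if level == 0:
        return n          # no redundancy at all
    if level == 1:
        return n // 2     # every drive mirrored: 100% overhead
    if level in (3, 4, 5):
        return n - 1      # one drive's worth of parity
    if level == 6:
        return n - 2      # two drives' worth of parity
    raise ValueError("level not covered here")

# For a 5-drive array:
# RAID 0 -> 5 usable, RAID 5 -> 4 usable, RAID 6 -> 3 usable
```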

Optimizing Database Performance

When using striping, match the full stripe width (stripe size times number of data drives) to the data block size used by the application. To match Oracle's 8K data blocks, use a 2K stripe on a 5-drive array (4 data drives x 2K = 8K), or a 4K stripe on a 3-drive array (2 data drives x 4K = 8K); don't forget that one drive's worth of capacity goes to parity.
Other databases may have different requirements (Pick = 2K, others ?). Remember the block size is always a power of 2, so keep the data-drive count a power of 2 as well. Don't use 7 drives! That gives 6 data drives, and 6 is not a power of 2, so stripes can't match the block size.
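The stripe-matching rule above is simple arithmetic: stripe size = application block size divided by the number of data drives. A hedged sketch (the helper name and the single-parity-drive default are my own, for illustration):

```python
def stripe_size(block_size, total_drives, parity_drives=1):
    """Stripe size so one full stripe of data drives equals the app block size.
    Raises if the data drives do not divide the block size evenly."""
    data_drives = total_drives - parity_drives
    if block_size % data_drives:
        raise ValueError(f"{data_drives} data drives cannot evenly stripe "
                         f"a {block_size}-byte block")
    return block_size // data_drives

print(stripe_size(8192, 5))   # 2048: 4 data drives x 2K = 8K
print(stripe_size(8192, 3))   # 4096: 2 data drives x 4K = 8K
# stripe_size(8192, 7) raises: 6 data drives can't evenly divide 8K
```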

Rather than one big array, try to use multiple smaller ones.

Databases typically update indexes along with data, so put indexes and data on different arrays. The same goes for log files, the OS, swap, etc.

Don't run any other apps on the database server!!! Their data block sizes often won't match, which will seriously degrade write performance.

Remember all tuning is a compromise among cost, performance, and reliability. I am assuming reliability is not negotiable, therefore the tradeoff is price/performance.
Spend your cash on cache: the more cache you have, the better the performance, period! This also means using hardware RAID. More disks mean more performance (keeping the power-of-2 rule in mind). For ultimate performance, use striping with mirrored drives (level 0+1). It is the most expensive per megabyte, but it works fastest in degraded mode and rebuilds fastest.

Use multiple controllers or a multi-channel controller to get fastest sustained transfer rates. Multiple cards also give redundancy from card failure, if configured properly.
Case Study - CHA

Began when server upgrade didn't result in expected performance.
Developed performance tests matching profile of typical database usage. Performed reads and writes to files in same ratio as production environment.
Tested different controller cards using RAID levels 1 and 5, using identically configured machines.
Concluded DPT controller / level 1 best for us.

Test Results

Controller   RAID Level   Elapsed Time (min)   %usr   %sys   %wio   %idle
Mylex        1            211                  10     5      84     0
DPT          1            171                  18     7      75     0
Mylex        5            224                  8      5      86     0
DPT          5            190                  13     6      81     0

(CPU utilization columns taken from sar output.)

SmartRAID User's Manual, by DPT