Performance of RAID Arrays on Windows Azure: an Alternative to Horizontal Scaling
While working with several NoSQL databases under heavy write load, we ran into a situation where the hard drive became the bottleneck. Scaling the cluster horizontally could easily solve this kind of problem, but it would also increase the monthly bill. This is why we decided to look at other options.
The first thing that comes to mind when a database starts experiencing disk performance issues is to combine several virtual drives into a RAID array. But how well does this work on the Windows Azure virtual infrastructure? To find out, we compared the performance of a single virtual drive and of RAID arrays at levels 0, 1, 4, 5, and 6 using Bonnie++, a benchmarking tool for the hard drive subsystem.
Below you will find the test results and step-by-step instructions on how to configure a RAID array on your own.
Test 1: RAID performance under Write/Read/Re-write workloads
In the first test, we measured the performance of different RAID arrays for simple read/write operations:
sudo bonnie++ -d /raid1/ -m 'raid1' -u root -n 100:8192:16384:20 -x10 -s 16g -f > raid1.csv
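Bonnie++ was run 10 times (-x10). Each run worked with files of 8-16 KB in size spread across 20 subdirectories (-n 100:8192:16384:20; note that Bonnie++ counts files in multiples of 1,024) and with 16 GB of data for the sequential tests (-s 16g). Since a large Windows Azure instance has 7 GB of RAM, the data set was more than twice the size of memory, which kept the page cache from distorting the results.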
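The sizing logic can be checked before starting a long benchmark. The sketch below is only an illustration: the helper names are hypothetical, and the "at least twice the RAM" threshold is a common rule of thumb that we assume here, not something measured in our tests.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the test data set is at least twice
# the size of RAM. Both arguments are whole numbers in megabytes.
big_enough_to_defeat_cache() {
    data_mb=$1
    ram_mb=$2
    [ "$data_mb" -ge $((ram_mb * 2)) ]
}

# Total RAM of the current machine in MB, read from /proc/meminfo (Linux only).
ram_mb_of_this_host() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' /proc/meminfo
}

# Values from this article: 16 GB of data on an instance with 7 GB of RAM.
if big_enough_to_defeat_cache $((16 * 1024)) $((7 * 1024)); then
    echo "16 GB data set is at least twice 7 GB of RAM"
fi
```

On your own VM, you could pass the output of ram_mb_of_this_host as the second argument instead of the hard-coded value.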
You can see the first test results below. The x-axis stands for megabytes per second, the y-axis indicates repetitions (we ran each test 10 times).
Write test results:
Read test results:
Re-write test results:
Input/output operations per second:
According to these results, RAID 0 demonstrated the best performance in the write test and almost the same results in the read test. The IOPS values ranged from 350 to 450 for all RAID types.
Test 2: Performance of 2, 4, and 8 drives in a RAID array
Here, we wanted to see how the number of virtual drives affected RAID performance. For this test, we started two more large instances. One of them was used to create a level 0 RAID array of four drives and the other one for a level 0 RAID array of eight virtual drives:
- four drives:
sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdc /dev/sdd /dev/sde /dev/sdf
- eight drives:
sudo mdadm --create /dev/md0 --level=0 --raid-devices=8 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
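Since the two commands above differ only in the device list, they can be generated from it. The following is a minimal sketch with a hypothetical helper name; it only prints the command so that you can review it before running it as root, and it assumes the array is always created as /dev/md0.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the mdadm command that creates a
# level 0 array from the devices given as arguments. "$#" is the number of
# devices, so --raid-devices always matches the list.
raid0_create_cmd() {
    echo "sudo mdadm --create /dev/md0 --level=0 --raid-devices=$# $*"
}

# Example with the four drives used in this test.
raid0_create_cmd /dev/sdc /dev/sdd /dev/sde /dev/sdf
```

Printing instead of executing is a deliberate choice here: creating an array on the wrong device destroys the data on it.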
We ran the same test on both VMs:
sudo bonnie++ -d /raid0/ -m 'raid0' -u root -n 100:16384:16384:20 -x10 -s 16g -f > raid0.csv
You can see the results on the diagrams below:
Test 3: Stability check
The last test checked how stable Azure storage is. Here, we tested RAID performance continuously under a workload for 8+ hours.
Given the results of all three tests, we can make the following conclusions:
- Windows Azure virtual drives work faster when combined in a RAID array. A level 0 array is the fastest storage.
- The number of virtual drives has the biggest effect on write operations. A RAID 0 array of eight drives works 4.5 times faster than a single virtual drive.
- When performing read operations, an array of eight drives works two times faster than a single drive.
- Virtual disks get faster after they have been in use for a while.
Red and blue lines represent the servers that had already been used in Test 1. Green and yellow lines stand for the new servers that we started for Test 2. Every time a new server is created and starts using its drives heavily, it takes ~30 minutes for it to reach maximum speed.
How to configure a RAID array
The following steps will help you to set up a RAID array for your own project:
- Attach empty drives to a VM using a CLI or a Web interface. The number of disks that can be attached depends on the VM’s size.
- Use the following command to find the names of the disks:
sudo lshw -class disk
- Combine them into a RAID array (all the disks from step one must be listed in this command):
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdd /dev/sde
- Create a file system on the new RAID device and mount it using the following commands:
sudo mkfs.ext3 /dev/md0
sudo mkdir -p /raid0
sudo mount /dev/md0 /raid0
That’s it. Now you can start using your new disk. Still, it is worth reading the “Saving your RAID configuration” section here.
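As a rough illustration of what saving the configuration usually involves, the sketch below prints two commands: one appends the array definition to the mdadm configuration file and the other adds a mount entry to /etc/fstab. Several things here are assumptions rather than part of our tested setup: the fstab_line helper is hypothetical, the /etc/mdadm/mdadm.conf path is the Debian/Ubuntu location (other distributions may use /etc/mdadm.conf), and the mount options are only a reasonable starting point.

```shell
#!/bin/sh
# Hypothetical helper: build an /etc/fstab line for the array.
# "nofail" lets the VM boot even if the array is not available.
fstab_line() {
    device=$1
    mount_point=$2
    fs_type=$3
    printf '%s %s %s defaults,nofail 0 2\n' "$device" "$mount_point" "$fs_type"
}

# The commands are printed, not executed; review them and run them as root.
echo "sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf"
echo "echo '$(fstab_line /dev/md0 /raid0 ext3)' | sudo tee -a /etc/fstab"
```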
More RAID performance results
We’re also planning to run other tests for RAID arrays on Windows Azure. In particular:
- Re-test RAID 0 and 1 in the multi-thread mode (we expect the results to be different)
- Try to improve read performance by changing the “read ahead” parameter in RAID configuration
- Test how a real database will perform on RAID disks
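For the read-ahead experiment, one possible approach is the blockdev utility, which reads and sets the read-ahead value of a block device in 512-byte sectors. The sketch below is not something we have benchmarked yet: the 4096 KB target is an arbitrary example value and kb_to_sectors is a hypothetical helper.

```shell
#!/bin/sh
# blockdev expresses read-ahead in 512-byte sectors, so convert from KB first.
kb_to_sectors() {
    echo $(( $1 * 1024 / 512 ))
}

# Example target (an assumption, not a recommendation): 4096 KB on /dev/md0.
# The commands are printed, not executed; run them manually as root.
echo "sudo blockdev --getra /dev/md0"
echo "sudo blockdev --setra $(kb_to_sectors 4096) /dev/md0"
```

Checking the current value with --getra first makes it easy to restore the original setting after the test.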
So, what do you think about our findings on Windows Azure’s RAID disks? Feel free to leave your comments below or follow @altoros to keep up with the latest updates.