
Performance of RAID Arrays on Windows Azure: an Alternative to Horizontal Scaling

Sergey Balashevich

While working with several NoSQL databases under heavy write load, we ran into a situation where the hard drive became the bottleneck. Scaling the cluster horizontally would solve this kind of problem, but it would also increase the monthly bill. This is why we decided to look at other options.

The first thing that comes to mind when a DB starts experiencing HDD performance issues is to combine several virtual drives into a RAID array. But how well does that work on the Windows Azure virtual infrastructure? To find out, we compared the performance of a single virtual drive with RAID arrays of levels 0, 1, 4, 5, and 6, using the Bonnie++ disk benchmarking tool.

Below you will find the test results and step-by-step instructions on how to configure a RAID array on your own.

 

Test 1: RAID performance under Write/Read/Re-write workloads

In the first test, we measured the performance of different RAID arrays for simple read/write operations:

sudo bonnie++ -d /raid1/ -m 'raid1' -u root -n 100:8192:16384:20 -x10 -s 16g -f > raid1.csv

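Bonnie++ was run 10 times (-x10). The -n option describes the file-creation test: files between 8 KB and 16 KB in size, spread across 20 subdirectories, with the file count given in multiples of 1,024 (so -n 100 means 102,400 files). The -s 16g option sets the amount of data for the read/write tests to 16 GB per iteration, and -f skips the slow per-character tests. Since a large Windows Azure instance has 7 GB of RAM, a 16 GB data set cannot fit in memory, which kept caching from distorting the results.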
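The reasoning about RAM and test size can be written down as a small helper. This is only a sketch: the function names are made up for illustration, and the "twice the RAM" factor is a common rule of thumb for disk benchmarks, not something measured in this article.

```shell
# Suggest a bonnie++ -s value (in MiB) that is twice the RAM size,
# so that the page cache cannot hold the whole test data set.
# bonnie_size_mib and ram_mib are hypothetical helper names.
bonnie_size_mib() {
    # $1: RAM size in MiB
    echo $(( $1 * 2 ))
}

# Read the RAM size of the current machine in MiB (Linux only).
ram_mib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' /proc/meminfo
}

# Example for the 7 GB instance used in the tests:
bonnie_size_mib 7168    # prints 14336
```

The 16 GB used in the tests sits above this value, so the chosen size is on the safe side. On the VM itself, `bonnie_size_mib "$(ram_mib)"` gives the figure for the local machine.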

You can see the first test results below. The x-axis stands for megabytes per second, the y-axis indicates repetitions (we ran each test 10 times).

Write test results:

Write_test_x2


Read test results:

Read_test_x2

Re-write test results:

Rewrite_test_x2

Input/output operations per second:

iops.jpg

According to these results, RAID 0 demonstrated the best performance in the write test and almost the same results in the read test. The IOPS values ranged from 350 to 450 for all RAID types.
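To compare configurations, the ten CSV rows that each run produces can be reduced to one average per metric. The sketch below assumes a comma-separated file with one row per repetition. The function name is hypothetical, and the column number of a given metric depends on the Bonnie++ version, so check it against the CSV layout of your build before trusting the number.

```shell
# Print the mean of one numeric column of a CSV file, to one decimal place.
# Rows where the column is not a plain number (for example "+++++") are skipped.
# csv_column_mean is a hypothetical helper name.
csv_column_mean() {
    # $1: CSV file, $2: 1-based column number
    awk -F, -v col="$2" '
        $col ~ /^[0-9]+(\.[0-9]+)?$/ { sum += $col; n++ }
        END { if (n) printf "%.1f\n", sum / n }
    ' "$1"
}

# Usage (WRITE_COL must be set to the block-write column of your bonnie++ version):
#   csv_column_mean raid1.csv "$WRITE_COL"
```

Passing the column as a parameter keeps the helper independent of any particular CSV layout.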

 

Test 2: Performance of 2, 4, and 8 drives in a RAID array

Here, we wanted to see how the number of virtual drives affected RAID performance. For this test, we started two more large instances. One of them was used to create a level 0 RAID array of four drives and the other one for a level 0 RAID array of eight virtual drives:

  • four drives:

    sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdc /dev/sdd /dev/sde /dev/sdf

  • eight drives:

    sudo mdadm --create /dev/md0 --level=0 --raid-devices=8 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
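With longer device lists it is easy to let --raid-devices drift out of sync with the number of disks actually listed. A small wrapper can derive the count from its arguments. This is a sketch with a hypothetical function name; it only prints the command, so it can be reviewed before being run.

```shell
# Print an mdadm create command for /dev/md0 whose --raid-devices value
# always matches the number of devices passed in.
# mdadm_create_cmd is a hypothetical helper name.
mdadm_create_cmd() {
    # $1: RAID level, remaining arguments: member devices
    mc_level=$1
    shift
    echo "sudo mdadm --create /dev/md0 --level=${mc_level} --raid-devices=$# $*"
}

# Reproduces the four-drive command shown above:
mdadm_create_cmd 0 /dev/sdc /dev/sdd /dev/sde /dev/sdf
```

Once the printed line looks right, it can be pasted into the terminal as is.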

We ran the same test on both VMs:

sudo bonnie++ -d /raid0/ -m 'raid0' -u root -n 100:16384:16384:20 -x10 -s 16g -f > raid0.csv

You can see the results in the diagrams below:

RAID_Read_Write_test_x2

 

Test 3: Stability check

The last test checked how stable Azure storage is. Here, we tested RAID performance under a continuous workload for more than eight hours.

RAID_0-40_Iterations_3
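The article does not describe how the long run was driven. One possible way to keep a benchmark running for a fixed period is a time-boxed loop like the sketch below. The function name and the output file in the usage comment are hypothetical.

```shell
# Run a command repeatedly until at least the given number of seconds has
# passed, then print how many runs completed. The command always runs at
# least once; the loop stops early if the command fails.
# run_for_seconds is a hypothetical helper name.
run_for_seconds() {
    # $1: duration in seconds, remaining arguments: command to repeat
    rfs_end=$(( $(date +%s) + $1 ))
    shift
    rfs_runs=0
    while :; do
        "$@" || return 1
        rfs_runs=$(( rfs_runs + 1 ))
        if [ "$(date +%s)" -ge "$rfs_end" ]; then
            break
        fi
    done
    echo "$rfs_runs"
}

# Usage for an eight-hour run (28800 seconds), appending CSV rows to a file:
#   run_for_seconds 28800 sudo sh -c \
#     "bonnie++ -d /raid0/ -m raid0 -u root -n 100:16384:16384:20 -s 16g -f >> raid0_long.csv"

run_for_seconds 0 true    # prints 1
```

Because each row is appended as soon as an iteration finishes, partial results survive even if the run is interrupted.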

 

Summary

Given the results of all three tests, we can draw the following conclusions:

  1. Windows Azure virtual drives work faster when combined in a RAID array. A level 0 array is the fastest storage.
  2. The number of virtual drives has the biggest effect on write operations. A RAID 0 array of eight drives writes 4.5 times faster than a single virtual drive.
  3. For read operations, an array of eight drives is twice as fast as a single drive.
  4. Virtual disks get faster once they have been under load for a while.

Nameless_Diagram
 
Red and blue lines represent the servers that had already been used in Test 1. Green and yellow lines stand for the new servers that we started for Test 2. Every time a new server is created and starts using its disks heavily, it takes about 30 minutes to reach maximum speed.

 

How to configure a RAID array

The following steps will help you to set up a RAID array for your own project:

  1. Attach empty drives to a VM using a CLI or a Web interface. The number of disks that can be attached depends on the VM’s size.
  2. Use the following commands to find the names of the disks:

     a) CentOS:

    cat /proc/partitions

     b) Ubuntu:

    sudo lshw -class disk

  3. Combine them into a RAID array (all the disks from step one must be listed in this command):

    sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdd /dev/sde

  4. Create a file system on the new RAID device, create the mount point if it does not exist yet, and mount the device:

    sudo mkfs.ext3 /dev/md0
    sudo mkdir -p /raid0
    sudo mount /dev/md0 /raid0

That’s it. Now you can start using your new disk. Still, it is worth reading the “Saving your RAID configuration” section here.
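The section referred to is not reproduced in this post. As a rough sketch of what saving the configuration usually involves, the array definition is recorded in the mdadm configuration file and an /etc/fstab entry is added so the file system is mounted at boot. The helper name below is hypothetical, the `nofail` option is a choice made for this example, and the configuration file path shown is the Debian/Ubuntu default (CentOS typically uses /etc/mdadm.conf).

```shell
# Print an /etc/fstab line that mounts a file system by UUID.
# Mounting by UUID avoids problems if the md device gets a different
# name after a reboot. fstab_entry is a hypothetical helper name.
fstab_entry() {
    # $1: file system UUID, $2: mount point, $3: file system type
    echo "UUID=$1 $2 $3 defaults,nofail 0 2"
}

# Typical use on the VM (not executed here):
#   sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
#   fstab_entry "$(sudo blkid -s UUID -o value /dev/md0)" /raid0 ext3 | sudo tee -a /etc/fstab
```

Review the generated line before appending it, since a broken /etc/fstab entry can prevent the VM from booting cleanly.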

 

More RAID performance results

We’re also planning to run other tests for RAID arrays on Windows Azure. In particular:

  1. Re-test RAID 0 and 1 in the multi-thread mode (we expect the results to be different)
  2. Try to improve read performance by changing the “read ahead” parameter in RAID configuration
  3. Test how a real database will perform on RAID disks
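For the read-ahead item, the setting can be inspected and changed with `blockdev`, which counts read-ahead in 512-byte sectors. The sketch below only converts that unit to KiB to make values easier to reason about. The function name is hypothetical, and 4096 is an arbitrary example value, not a recommendation from these tests.

```shell
# Convert a read-ahead value in 512-byte sectors (the unit used by
# blockdev --getra / --setra) to KiB.
# ra_sectors_to_kib is a hypothetical helper name.
ra_sectors_to_kib() {
    # $1: read-ahead in 512-byte sectors
    echo $(( $1 * 512 / 1024 ))
}

# On the VM (not executed here):
#   sudo blockdev --getra /dev/md0          # show the current read-ahead
#   sudo blockdev --setra 4096 /dev/md0     # set it to 4096 sectors

ra_sectors_to_kib 4096    # prints 2048
```

A value set with `--setra` does not survive a reboot, so it would need to be reapplied at startup for longer experiments.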

So, what do you think about our findings on Windows Azure’s RAID disks? Feel free to leave your comments below or follow @altoros to keep up with the latest updates.
