PTS on AWS with m1.medium EC2 and EBS with Provisioned IOPS

More results running PTS with PIOPS 1000 and 2000

I took a look at AWS m1.medium EC2 instances with Provisioned IOPS. This post is light on instructions and is basically just graphs of performance results.

test configurations & results

  • medium-attached-ebs: m1.medium EC2 instance, standard EBS volume (ext4)

  • medium-attached-ebs-piops1000: m1.medium EC2 instance, EBS volume (ext4) provisioned for 1000 IOPS

  • medium-attached-ebs-piops2000: m1.medium EC2 instance, EBS volume (ext4) provisioned for 2000 IOPS

  • small-attached-ebs: m1.small EC2 instance, standard EBS volume (ext4)

  • small-attached-ebs-piops1000: m1.small EC2 instance, EBS volume (ext4) provisioned for 1000 IOPS

  • small-attached-ebs-piops2000: m1.small EC2 instance, EBS volume (ext4) provisioned for 2000 IOPS
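
Each “piops” configuration above used an EBS volume created with a provisioned IOPS rate and attached to the instance before being formatted as ext4. As a rough sketch of that setup (using boto3 with a placeholder region, availability zone, instance ID and device name — not the exact values or tooling used for these tests):

    # Sketch: create and attach a Provisioned IOPS (io1) EBS volume with boto3.
    # Region, AZ, instance ID and device name below are placeholders.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")   # assumed region

    # 100 GiB io1 volume provisioned for 1000 IOPS (size must support the IOPS rate).
    volume = ec2.create_volume(
        AvailabilityZone="us-east-1a",    # must match the instance's AZ
        Size=100,
        VolumeType="io1",
        Iops=1000,
    )

    # Wait for the volume to become available, then attach it to the instance.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])
    ec2.attach_volume(
        VolumeId=volume["VolumeId"],
        InstanceId="i-0123456789abcdef0",  # placeholder instance ID
        Device="/dev/sdf",
    )
    # On the instance: mkfs.ext4 the new device and mount it before running PTS.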

comparison

Note: each test is actually run several times; the scores below are the average across those runs. The “error” column is the maximum deviation of any single run from that average.
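
For a given configuration, each table entry therefore boils down to something like the following (a sketch of the arithmetic only, not PTS’s own reporting code):

    # Score = mean of the runs; error = largest deviation of any run from that mean.
    def summarise(runs):
        score = sum(runs) / len(runs)
        error = max(abs(r - score) for r in runs)
        return score, error

    # Hypothetical example: three runs of one benchmark on one configuration.
    print(summarise([54.5, 55.0, 55.5]))   # -> (55.0, 0.5)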

AIO-STRESS
AIO-Stress v0.21: random write (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             55.19     ± 1.1
medium-attached-ebs-piops1000   53.41     ± 0.4
medium-attached-ebs-piops2000   62.16     ± 0.67
small-attached-ebs              15.87     ± 0.51
small-attached-ebs-piops1000    35.9      ± 0.47
small-attached-ebs-piops2000    40.95     ± 0.25

FLEXIBLE IO TESTER
Flexible IO Tester v1.57: Intel IOMeter File Server Access Pattern (seconds, less is better)
configuration                   score     error
medium-attached-ebs             3190.45   ± 78.95
medium-attached-ebs-piops1000   1469.41   ± 2.32
medium-attached-ebs-piops2000   802.17    ± 0.47
small-attached-ebs              1234.69   ± 286.56
small-attached-ebs-piops1000    1469.18   ± 2.64
small-attached-ebs-piops2000    802.62    ± 1.14

This is one of the tests where m1.medium without PIOPS fared dramatically worse than m1.small without PIOPS. I re-ran these tests with the same results. If your application is similar to this benchmark, you may want to consider sticking with m1.small.

SQLITE
SQLite v3.7.3: 12,500 INSERTs (seconds, less is better)
configuration                   score     error
medium-attached-ebs             140.8     ± 4.61
medium-attached-ebs-piops1000   131.95    ± 1.29
medium-attached-ebs-piops2000   103.78    ± 6.34
small-attached-ebs              156.35    ± 3.04
small-attached-ebs-piops1000    75.76     ± 0.52
small-attached-ebs-piops2000    89.05     ± 2.02

I didn’t expect these results, but PIOPS 1000 and 2000 appear to test better on the m1.small than they do on the m1.medium.
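
For context, this workload is insert-heavy and sync-bound: many small transactions, each of which must reach the EBS volume before the next one starts. A rough Python analogue of that pattern (not the actual PTS test profile; the database path is a placeholder for a directory on the attached volume):

    # Rough analogue of the SQLite insert benchmark: 12,500 single-row INSERTs,
    # each committed in its own transaction, so commit/fsync latency dominates.
    import sqlite3
    import time

    conn = sqlite3.connect("/mnt/piops/insert-test.db")   # placeholder path on the EBS volume
    conn.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY, payload TEXT)")

    start = time.time()
    for i in range(12500):
        conn.execute("INSERT INTO t (payload) VALUES (?)", (f"row {i}",))
        conn.commit()                                      # one transaction per insert
    print(f"elapsed: {time.time() - start:.2f}s")
    conn.close()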

FS-MARK
FS-Mark v3.3: 1000 Files, 1MB Size (files/s, more is better)
configuration                   score     error
medium-attached-ebs             21.63     ± 0.22
medium-attached-ebs-piops1000   31.67     ± 0.09
medium-attached-ebs-piops2000   33.23     ± 0.09
small-attached-ebs              27.1      ± 1.12
small-attached-ebs-piops1000    31.7      ± 0.06
small-attached-ebs-piops2000    33.27     ± 0.09

Here’s another benchmark where m1.small no-PIOPS beats m1.medium no-PIOPS. If your application deals with loads of small files, you’d be better off sticking with m1.small. PIOPS 1000 and PIOPS 2000 perform roughly the same, regardless of instance size.
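
As a rough analogue of what this test exercises (not FS-Mark itself; the target directory is a placeholder for a mount point on the EBS volume): write 1000 files of 1MB each, fsync each one, and report files per second.

    # Rough FS-Mark-style workload: 1000 files of 1MB each, fsync'd individually.
    import os
    import time

    target = "/mnt/piops/fsmark-test"    # placeholder path on the EBS volume
    os.makedirs(target, exist_ok=True)
    payload = b"\0" * (1024 * 1024)      # 1MB of data per file

    start = time.time()
    for i in range(1000):
        with open(os.path.join(target, f"file-{i:04d}"), "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
    print(f"{1000 / (time.time() - start):.2f} files/s")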

DBENCH

The combination of the m1.medium’s CPU and Provisioned IOPS allows the 128-client test to produce the best results here. Without either of these, the 48-client and 12-client tests tend to be the sweet spot.

Dbench v4.0: 1 Client (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             42.94     ± 0.68
medium-attached-ebs-piops1000   50.4      ± 1.58
medium-attached-ebs-piops2000   55.49     ± 0.77
small-attached-ebs              43.72     ± 0.46
small-attached-ebs-piops1000    82.46     ± 2.45
small-attached-ebs-piops2000    69.69     ± 1.49

Dbench v4.0: 12 Clients (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             146.55    ± 0.95
medium-attached-ebs-piops1000   160.23    ± 0.25
medium-attached-ebs-piops2000   175.21    ± 0.39
small-attached-ebs              128.37    ± 1.16
small-attached-ebs-piops1000    119.3     ± 0.03
small-attached-ebs-piops2000    126.7     ± 0.58

Dbench v4.0: 48 Clients (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             167.18    ± 1.9
medium-attached-ebs-piops1000   189.05    ± 0.39
medium-attached-ebs-piops2000   215.73    ± 0.65
small-attached-ebs              129.76    ± 1.18
small-attached-ebs-piops1000    120.72    ± 0.63
small-attached-ebs-piops2000    129.03    ± 0.97

Dbench v4.0: 128 Clients (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             161.12    ± 7.73
medium-attached-ebs-piops1000   194.07    ± 0.84
medium-attached-ebs-piops2000   224.12    ± 3.05
small-attached-ebs              115.54    ± 0.67
small-attached-ebs-piops1000    108.84    ± 0.25
small-attached-ebs-piops2000    123.01    ± 0.66

The m1.small results are fairly similar regardless of PIOPS. Meanwhile, the m1.medium results show performance scaling linearly with the level of PIOPS. This is a benchmark where PIOPS didn’t really shine on the m1.small, but clearly does on the m1.medium.

As usual, the 1-client results are very different from the multi-client tests. As such, it’s merely a curiosity that the m1.small beats the m1.medium (with matching PIOPS).

IOZONE

The behaviour of m1.medium without PIOPS is very suspicious here, but I did run the PTS twice on different m1.medium EC2 instances with similar results. Of course, I provisioned them one after the other, so there is a chance I got the same aberrant hardware twice in a row. :P

IOzone v3.405: 8GB Read Performance (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             80.87     ± 1.61
medium-attached-ebs-piops1000   40.59     ± 0.0
medium-attached-ebs-piops2000   40.85     ± 0.0
small-attached-ebs              34.89     ± 0.02
small-attached-ebs-piops1000    34.86     ± 0.2
small-attached-ebs-piops2000    34.73     ± 0.01

The CPU difference gives m1.medium the advantage in the read test.

IOzone v3.405: 8GB Write Performance (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             20.92     ± 0.27
medium-attached-ebs-piops1000   33.55     ± 0.02
medium-attached-ebs-piops2000   33.85     ± 0.02
small-attached-ebs              33.73     ± 0.29
small-attached-ebs-piops1000    33.66     ± 0.0
small-attached-ebs-piops2000    34.26     ± 0.17

All combinations perform the write test about equally well here, with the exception of m1.medium without PIOPS, which loses by a large margin.

THREADED I/O TESTER
Threaded I/O Tester v0.3.3: 64MB Random Read - 32 Threads (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             584.97    ± 1.28
medium-attached-ebs-piops1000   487.95    ± 0.89
medium-attached-ebs-piops2000   585.0     ± 0.2
small-attached-ebs              122.08    ± 3.14
small-attached-ebs-piops1000    11.54     ± 0.04
small-attached-ebs-piops2000    23.64     ± 0.15

The m1.medium’s CPU destroys the m1.small on the 32-thread read test.

Threaded I/O Tester v0.3.3: 64MB Random Write - 32 Threads (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             6.39      ± 0.27
medium-attached-ebs-piops1000   4.4       ± 0.01
medium-attached-ebs-piops2000   8.87      ± 0.01
small-attached-ebs              13.03     ± 0.41
small-attached-ebs-piops1000    4.4       ± 0.0
small-attached-ebs-piops2000    8.88      ± 0.02

Writing data is a different story: m1.small with no-PIOPS actually beats m1.medium by a decent margin.

COMPILE BENCH

For these tests, PIOPS 2000 doesn’t seem to have much advantage over PIOPS 1000. Predictably, m1.medium with PIOPS rules the roost here.

Compile Bench v0.6: Test: Compile (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             24.33     ± 0.72
medium-attached-ebs-piops1000   48.63     ± 0.77
medium-attached-ebs-piops2000   50.76     ± 0.52
small-attached-ebs              27.87     ± 0.96
small-attached-ebs-piops1000    35.36     ± 0.04
small-attached-ebs-piops2000    35.35     ± 0.0

Compile Bench v0.6: Test: Initial Create (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             25.31     ± 0.58
medium-attached-ebs-piops1000   36.59     ± 0.2
medium-attached-ebs-piops2000   41.61     ± 0.49
small-attached-ebs              24.17     ± 1.13
small-attached-ebs-piops1000    28.68     ± 0.04
small-attached-ebs-piops2000    28.75     ± 0.15

For the compile and create tests, m1.small matches m1.medium. That is, until you turn on PIOPS.

Compile Bench v0.6: Test: Read Compiled Tree (MB/s, more is better)
configuration                   score     error
medium-attached-ebs             226.7     ± 0.5
medium-attached-ebs-piops1000   199.67    ± 0.35
medium-attached-ebs-piops2000   231.08    ± 0.76
small-attached-ebs              46.82     ± 5.14
small-attached-ebs-piops1000    47.89     ± 0.64
small-attached-ebs-piops2000    49.1      ± 0.54

UNPACKING THE LINUX KERNEL
Unpacking The Linux Kernel: linux-2.6.32.tar.bz2 (seconds, less is better)
configuration                   score     error
medium-attached-ebs             25.0      ± 0.79
medium-attached-ebs-piops1000   24.8      ± 0.32
medium-attached-ebs-piops2000   24.4      ± 0.3
small-attached-ebs              48.8      ± 1.11
small-attached-ebs-piops1000    51.0      ± 0.82
small-attached-ebs-piops2000    46.48     ± 0.35

POSTMARK
PostMark v1.51: Disk Transaction Performance (TPS, more is better)
configuration                   score     error
medium-attached-ebs             1503      ± 10.82
medium-attached-ebs-piops1000   1320      ± 2.33
medium-attached-ebs-piops2000   1521      ± 3.0
small-attached-ebs              614       ± 5.36
small-attached-ebs-piops1000    606       ± 1.45
small-attached-ebs-piops2000    634       ± 2.33

GZIP COMPRESSION
Gzip Compression: 2GB File Compression (seconds, less is better)
configuration                   score     error
medium-attached-ebs             28.11     ± 0.32
medium-attached-ebs-piops1000   28.42     ± 0.34
medium-attached-ebs-piops2000   28.2      ± 0.23
small-attached-ebs              55.39     ± 1.36
small-attached-ebs-piops1000    60.36     ± 1.16
small-attached-ebs-piops2000    56.85     ± 1.1

POSTGRESQL PGBENCH
PostgreSQL pgbench v8.4.11: TPC-B Transactions Per Second (TPS, more is better)
configuration                   score     error
medium-attached-ebs             259.82    ± 3.55
medium-attached-ebs-piops1000   290.75    ± 1.35
medium-attached-ebs-piops2000   319.63    ± 3.26
small-attached-ebs              250.39    ± 3.3
small-attached-ebs-piops1000    301.47    ± 4.62
small-attached-ebs-piops2000    335.26    ± 17.91

APACHE BENCHMARK
Apache Benchmark v2.4.3: Static Web Page Serving (Requests/s, more is better)
configuration                   score     error
medium-attached-ebs             2317.53   ± 33.94
medium-attached-ebs-piops1000   2065.23   ± 12.76
medium-attached-ebs-piops2000   2285.03   ± 6.83
small-attached-ebs              1233.63   ± 2.23
small-attached-ebs-piops1000    1135.76   ± 4.5
small-attached-ebs-piops2000    1233.82   ± 9.55

observations

Now that I have some results from both m1.small and m1.medium EC2 instances, it’s clear exactly which tests are affected by CPU performance (although CPUs are not the focus of this series of articles).

These are the tests that were very CPU-bound, with little variation in results when only PIOPS differed:

  • Compile Bench: Read Compiled Tree
  • Unpacking the Linux Kernel
  • PostMark
  • Gzip Compression
  • Apache Benchmark

There were a few tests where performance scaled with PIOPS rather than CPU horsepower:

  • Flexible IO Tester
  • FS-Mark
  • IOzone
  • PostgreSQL pgbench

PIOPS 2000 was over 10% better than PIOPS 1000 in the following cases:

  • Flexible IO Tester
  • SQLite
  • Threaded I/O Tester
  • Compile Bench: Create & Read
  • PostMark

It’s difficult to make further observations on the value of PIOPS because the following benchmarks produced anomalous results on m1.medium without PIOPS (i.e. m1.small was faster):

  • Flexible IO Tester
  • FS-Mark
  • Dbench: 1-client
  • IOzone: write
  • Threaded I/O Tester: write

conclusions

It seems as though m1.medium EC2 instances suffer a regression (compared to m1.small) in cases where many small writes are issued to EBS without PIOPS. Superficially, this seems illogical: m1.medium instances enjoy a better network interface, more RAM and a faster CPU.

Perhaps basic EBS becomes a major write bottleneck with a faster CPU?

In many cases, PIOPS 1000 was faster than no-PIOPS, particularly in the cases with strange no-PIOPS results. If you are using m1.medium EC2 instances with applications that favour storage IO, you really ought to consider at least PIOPS 1000.

series

  1. Phoronix Test Suite on AWS
  2. PTS Results on Provisioned IOPS on AWS EBS
  3. PTS Results on Provisioned IOPS on AWS EBS, Part 2
  4. PTS on AWS with m1.medium EC2 and EBS with Provisioned IOPS
