Scaleway - ARMv8
Martin Rusev at Anom.cx wrote a great analysis of the Scaleway ARMv7 servers and DigitalOcean droplets some time ago (I'm not sure exactly when, but around 2015 judging by the GitHub comment). Yesterday I received an email from Scaleway announcing their ARMv8 machines, so I thought I would give one a go. Martin Rusev noted that the ARMv7 was certainly slower than a DO droplet, so let's see how the v8 gets on.
I booted up a 2C ARMv8 with 2GB RAM as well as a 4C ARMv8 with 4GB RAM. It took around 10 minutes to get a console on one of these in the Paris DC. I would imagine that since the Amsterdam DC is out of stock of a lot of these they are getting quite a lot of demand.
CPU Test
To begin with I ran the single core test, along with 2C and 4C (because 4C was run in Martin Rusev's test too).
ARMv8 2Core 2GB RAM
Single Core
[email protected]:~# sysbench --test=cpu --cpu-max-prime=20000 run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 1
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 20000
Test execution summary:
total time: 29.9082s
total number of events: 10000
total time taken by event execution: 29.9016
per-request statistics:
min: 2.94ms
avg: 2.99ms
max: 10.56ms
approx. 95 percentile: 3.02ms
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 29.9016/0.00
Dual Core
[email protected]:~# sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=2
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 2
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 20000
Test execution summary:
total time: 14.9265s
total number of events: 10000
total time taken by event execution: 29.8405
per-request statistics:
min: 2.94ms
avg: 2.98ms
max: 3.42ms
approx. 95 percentile: 3.00ms
Threads fairness:
events (avg/stddev): 5000.0000/0.00
execution time (avg/stddev): 14.9203/0.00
Quad Core
[email protected]:~# sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 4
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 20000
Test execution summary:
total time: 14.9632s
total number of events: 10000
total time taken by event execution: 59.8098
per-request statistics:
min: 2.94ms
avg: 5.98ms
max: 20.55ms
approx. 95 percentile: 11.01ms
Threads fairness:
events (avg/stddev): 2500.0000/1.22
execution time (avg/stddev): 14.9525/0.01
ARMv8 4Core 4GB RAM
Running the test with following options:
Number of threads: 1
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 20000
Test execution summary:
total time: 29.8729s
total number of events: 10000
total time taken by event execution: 29.8657
per-request statistics:
min: 2.96ms
avg: 2.99ms
max: 8.31ms
approx. 95 percentile: 3.00ms
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 29.8657/0.00
[email protected]:~# sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 4
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 20000
Test execution summary:
total time: 7.5198s
total number of events: 10000
total time taken by event execution: 30.0429
per-request statistics:
min: 2.94ms
avg: 3.00ms
max: 15.02ms
approx. 95 percentile: 3.01ms
Threads fairness:
events (avg/stddev): 2500.0000/5.79
execution time (avg/stddev): 7.5107/0.00
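The total times above can be turned into speedup and efficiency figures. The following is a minimal sketch that uses only the total times reported in the runs above; the helper name `scaling` and the dictionary layout are made up for illustration, and measuring efficiency against the number of physical cores (rather than threads) is my assumption about how to read the 4-thread run on the 2-core machine.

```python
# Total times (seconds) copied from the sysbench CPU runs above.
runs = {
    "2C 2GB": {"cores": 2, "times": {1: 29.9082, 2: 14.9265, 4: 14.9632}},
    "4C 4GB": {"cores": 4, "times": {1: 29.8729, 4: 7.5198}},
}

def scaling(cores, times):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run."""
    base = times[1]
    result = {}
    for threads, total in sorted(times.items()):
        speedup = base / total
        # Only as many threads as there are cores can truly run in parallel.
        usable = min(threads, cores)
        result[threads] = (speedup, speedup / usable)
    return result

for machine, data in runs.items():
    for threads, (speedup, eff) in scaling(data["cores"], data["times"]).items():
        print(f"{machine} {threads} thread(s): speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

Both machines scale close to linearly up to their core count, and running 4 threads on the 2-core machine gives no gain over 2 threads.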
I/O
ARMv8 2Core 2GB RAM
[email protected]:~# sysbench --test=fileio --file-total-size=6G prepare
[email protected]:~# sysbench --test=fileio --file-total-size=6G --file-test-mode=rndrw --max-time=300 --max-requests=0 --file-extra-flags=direct run
Operations performed: 336060 Read, 224040 Write, 716822 Other = 1276922 Total
Read 5.1279Gb Written 3.4186Gb Total transferred 8.5464Gb (29.172Mb/sec)
1866.99 Requests/sec executed
Test execution summary:
total time: 300.0018s
total number of events: 560100
total time taken by event execution: 156.6031
per-request statistics:
min: 0.21ms
avg: 0.28ms
max: 10.57ms
approx. 95 percentile: 0.32ms
Threads fairness:
events (avg/stddev): 560100.0000/0.00
execution time (avg/stddev): 156.6031/0.00
ARMv8 4Core 4GB RAM
Operations performed: 327880 Read, 218587 Write, 699392 Other = 1245859 Total
Read 5.0031Gb Written 3.3354Gb Total transferred 8.3384Gb (28.462Mb/sec)
1821.55 Requests/sec executed
Test execution summary:
total time: 300.0013s
total number of events: 546467
total time taken by event execution: 152.3080
per-request statistics:
min: 0.22ms
avg: 0.28ms
max: 17.58ms
approx. 95 percentile: 0.34ms
Threads fairness:
events (avg/stddev): 546467.0000/0.00
execution time (avg/stddev): 152.3080/0.00
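The throughput figure in each run can be cross-checked from the request rate. This is a small sketch that assumes sysbench's default 16 KiB block size (the commands above do not pass --file-block-size and the output shown does not print the block size, so this is an assumption) and that "Mb" in the output means MiB. The function name `throughput_mib` is just for illustration.

```python
BLOCK_SIZE = 16 * 1024  # bytes; assumed sysbench fileio default, not shown in the output

def throughput_mib(requests_per_sec, block_size=BLOCK_SIZE):
    """Convert a request rate into MiB/s for a fixed block size."""
    return requests_per_sec * block_size / (1024 * 1024)

# Requests/sec copied from the two runs above.
for machine, rps in [("2C 2GB", 1866.99), ("4C 4GB", 1821.55)]:
    print(f"{machine}: {throughput_mib(rps):.3f} MiB/s")
```

Under that assumption the computed values agree with the reported 29.172 and 28.462 Mb/sec, and the two machines are within a few percent of each other on random read/write.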
MySQL
mysql Ver 14.14 Distrib 5.7.18, for Linux (aarch64) using EditLine wrapper
ARMv8 2Core 2GB RAM
[email protected]:~# sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=benchmark --mysql-user=root --mysql-password=password123 prepare
[email protected]:~# sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=benchmark --mysql-user=root --mysql-password=password123 --max-time=60 --oltp-read-only=on --max-requests=0 --num-threads=8 run
sysbench 0.4.12: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 8
Doing OLTP test.
Running mixed OLTP test
Doing read-only test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 7 times)
Done.
OLTP test statistics:
queries performed:
read: 276500
write: 0
other: 39500
total: 316000
transactions: 19750 (329.05 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 276500 (4606.75 per sec.)
other operations: 39500 (658.11 per sec.)
Test execution summary:
total time: 60.0206s
total number of events: 19750
total time taken by event execution: 479.8237
per-request statistics:
min: 16.70ms
avg: 24.29ms
max: 84.53ms
approx. 95 percentile: 26.47ms
Threads fairness:
events (avg/stddev): 2468.7500/37.75
execution time (avg/stddev): 59.9780/0.01
ARMv8 4Core 4GB RAM
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 8
Doing OLTP test.
Running mixed OLTP test
Doing read-only test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 7 times)
Done.
OLTP test statistics:
queries performed:
read: 537012
write: 0
other: 76716
total: 613728
transactions: 38358 (639.13 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 537012 (8947.79 per sec.)
other operations: 76716 (1278.26 per sec.)
Test execution summary:
total time: 60.0161s
total number of events: 38358
total time taken by event execution: 479.5496
per-request statistics:
min: 5.95ms
avg: 12.50ms
max: 33.54ms
approx. 95 percentile: 13.75ms
Threads fairness:
events (avg/stddev): 4794.7500/46.71
execution time (avg/stddev): 59.9437/0.00
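The average latency in these runs follows from the throughput. With a fixed number of client threads, each running one transaction at a time, average latency is roughly threads divided by transactions per second (Little's law). A quick sketch using only the figures from the two runs above; the function name `expected_latency_ms` is made up for illustration.

```python
def expected_latency_ms(threads, transactions, total_time_s):
    """Estimate average per-transaction latency via Little's law.

    With `threads` clients each running one transaction at a time,
    latency is roughly threads / throughput.
    """
    tps = transactions / total_time_s
    return 1000.0 * threads / tps

# (transactions, total time in s, reported avg latency in ms) from the runs above.
runs = {
    "2C 2GB": (19750, 60.0206, 24.29),
    "4C 4GB": (38358, 60.0161, 12.50),
}

for machine, (txns, total, reported) in runs.items():
    est = expected_latency_ms(8, txns, total)
    print(f"{machine}: estimated {est:.2f} ms, reported {reported:.2f} ms")
```

The estimates land within a few hundredths of a millisecond of the reported averages, so the halved latency on the 4-core machine is simply the doubled throughput seen from the client side.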
Results ARMv8 against published ARMv7
Machine | CPU Single Core Total Time (s) | CPU Quad Core Total Time (s) | Disk I/O Mb/sec | MySQL Transactions per sec | MySQL read/write requests per sec |
ARMv7 | 685.512 | 171.3962 | 16.347 | 467.07 | 6539.00 |
ARMv8 2C 2GB RAM | 29.9082 | 14.9632 | 29.172 | 329.05 | 4606.75 |
ARMv8 4C 4GB RAM | 29.8729 | 7.5198 | 28.462 | 639.13 | 8947.79 |
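So the CPU and Disk I/O show performance increases: the single core prime test finishes in about 30 seconds against 685 seconds on the ARMv7, roughly 23 times faster, and disk throughput rises from 16.3 to around 29 Mb/sec. (The ARMv8 CPU also performs the prime calculations faster than the DO droplet Martin Rusev used.) However, as you can see, the MySQL results of the 2 core ARMv8 are lower than the ARMv7's, with about 30% fewer transactions per second (329.05 vs 467.07), while the 4 core ARMv8 is about 37% ahead. I would imagine this is because the four cores of the ARMv7 give it an advantage in this test over the 2 core ARMv8, which has a faster clock speed but fewer cores.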
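To make the comparison easier to read, the table can be reduced to ratios against the published ARMv7 figures. A minimal sketch with the values copied from the results table above; the key names and the `improvement` helper are made up for illustration. For the CPU times lower is better, so the ratio is inverted for those.

```python
# Values copied from the results table above.
armv7 = {"cpu_1t_time": 685.512, "cpu_4t_time": 171.3962, "io_rate": 16.347, "tps": 467.07}
armv8 = {
    "2C 2GB": {"cpu_1t_time": 29.9082, "cpu_4t_time": 14.9632, "io_rate": 29.172, "tps": 329.05},
    "4C 4GB": {"cpu_1t_time": 29.8729, "cpu_4t_time": 7.5198, "io_rate": 28.462, "tps": 639.13},
}

# Durations: a smaller number is a better result.
LOWER_IS_BETTER = {"cpu_1t_time", "cpu_4t_time"}

def improvement(metric, baseline, value):
    """Return how many times better `value` is than `baseline` (1.0 = equal)."""
    if metric in LOWER_IS_BETTER:
        return baseline / value
    return value / baseline

for machine, results in armv8.items():
    for metric, value in results.items():
        ratio = improvement(metric, armv7[metric], value)
        print(f"{machine} {metric}: {ratio:.2f}x the ARMv7 result")
```

A ratio below 1.0 means the ARMv8 machine did worse than the ARMv7, which only happens for the MySQL transactions on the 2 core machine.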
Conclusion
Well, after spending a week last month migrating my production webserver from a 1GB Single-Core DO droplet to a 2GB Single-Core Linode VM, I am now wondering if it would be beneficial to go for one of Scaleway's 4C ARMv8 with 8GB RAM, which costs the same as the Linode VM.
My only real concern for production is support and stability, as I'm new to Scaleway and they don't have the best reviews yet. My own experience hasn't been smooth sailing in the brief interactions I have had with them. They locked an account I'd set up a few years ago and faffed around telling me I couldn't use that email address anymore and that I'd have to use a new one. After escalating the support thread to a manager it was resolved. I'm sure these customer service type issues will iron themselves out in the future.
In the meantime I am going to stick with this small ARMv8 and play around with it. Personally I really hope these micro dedicated servers become the norm. No more noisy neighbours and great performance for the price.