Scaleway - ARMv8

 

Martin Rusev at Amon.cx wrote a great analysis of the Scaleway ARMv7 and Digital Ocean droplets some time ago (I'm not sure exactly when, but around 2015 judging by the GitHub comment). Yesterday I received an email from Scaleway announcing their ARMv8 machines, so I thought I would give one a go. Martin Rusev noted a clear drop in performance with the ARMv7 compared to a DO droplet, so let's see how the v8 gets on.

I booted up a 2C ARMv8 with 2GB RAM as well as a 4C ARMv8 with 4GB RAM. It took around 10 minutes to get a console on one of these in the Paris DC. I would imagine that, since the Amsterdam DC is out of stock of a lot of these, they are getting quite a lot of demand.

CPU Test

To begin with, I ran the single-core test on the 2C machine, along with two- and four-thread runs (Martin Rusev's test used four cores too).

Single Core

[email protected]:~# sysbench --test=cpu --cpu-max-prime=20000 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000

Test execution summary:
    total time:                          29.9082s
    total number of events:              10000
    total time taken by event execution: 29.9016
    per-request statistics:
         min:                                  2.94ms
         avg:                                  2.99ms
         max:                                 10.56ms
         approx.  95 percentile:               3.02ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   29.9016/0.00

Dual Core

[email protected]:~# sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=2

sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 2

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000

Test execution summary:
    total time:                          14.9265s
    total number of events:              10000
    total time taken by event execution: 29.8405
    per-request statistics:
         min:                                  2.94ms
         avg:                                  2.98ms
         max:                                  3.42ms
         approx.  95 percentile:               3.00ms

Threads fairness:
    events (avg/stddev):           5000.0000/0.00
    execution time (avg/stddev):   14.9203/0.00

Quad Core

[email protected]:~# sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          14.9632s
    total number of events:              10000
    total time taken by event execution: 59.8098
    per-request statistics:
         min:                                  2.94ms
         avg:                                  5.98ms
         max:                                 20.55ms
         approx.  95 percentile:              11.01ms

Threads fairness:
    events (avg/stddev):           2500.0000/1.22
    execution time (avg/stddev):   14.9525/0.01


ARMv8 4Core 4GB RAM

 

Running the test with following options:
Number of threads: 1

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          29.8729s
    total number of events:              10000
    total time taken by event execution: 29.8657
    per-request statistics:
         min:                                  2.96ms
         avg:                                  2.99ms
         max:                                  8.31ms
         approx.  95 percentile:               3.00ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   29.8657/0.00

 

[email protected]:~# sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run
sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          7.5198s
    total number of events:              10000
    total time taken by event execution: 30.0429
    per-request statistics:
         min:                                  2.94ms
         avg:                                  3.00ms
         max:                                 15.02ms
         approx.  95 percentile:               3.01ms

Threads fairness:
    events (avg/stddev):           2500.0000/5.79
    execution time (avg/stddev):   7.5107/0.00

 

I/O

ARMv8 2Core 2GB RAM

[email protected]:~# sysbench --test=fileio --file-total-size=6G prepare
[email protected]:~# sysbench --test=fileio --file-total-size=6G --file-test-mode=rndrw --max-time=300 --max-requests=0 --file-extra-flags=direct run

Operations performed:  336060 Read, 224040 Write, 716822 Other = 1276922 Total
Read 5.1279Gb  Written 3.4186Gb  Total transferred 8.5464Gb  (29.172Mb/sec)
 1866.99 Requests/sec executed

Test execution summary:
    total time:                          300.0018s
    total number of events:              560100
    total time taken by event execution: 156.6031
    per-request statistics:
         min:                                  0.21ms
         avg:                                  0.28ms
         max:                                 10.57ms
         approx.  95 percentile:               0.32ms

Threads fairness:
    events (avg/stddev):           560100.0000/0.00
    execution time (avg/stddev):   156.6031/0.00

ARMv8 4Core 4GB RAM

Operations performed:  327880 Read, 218587 Write, 699392 Other = 1245859 Total
Read 5.0031Gb  Written 3.3354Gb  Total transferred 8.3384Gb  (28.462Mb/sec)
 1821.55 Requests/sec executed

Test execution summary:
    total time:                          300.0013s
    total number of events:              546467
    total time taken by event execution: 152.3080
    per-request statistics:
         min:                                  0.22ms
         avg:                                  0.28ms
         max:                                 17.58ms
         approx.  95 percentile:               0.34ms

Threads fairness:
    events (avg/stddev):           546467.0000/0.00
    execution time (avg/stddev):   152.3080/0.00


MySQL

mysql Ver 14.14 Distrib 5.7.18, for Linux (aarch64) using EditLine wrapper 

ARMv8 2Core 2GB RAM

[email protected]:~# sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=benchmark --mysql-user=root --mysql-password=password123 prepare

[email protected]:~# sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=benchmark --mysql-user=root --mysql-password=password123 --max-time=60 --oltp-read-only=on --max-requests=0 --num-threads=8 run

sysbench 0.4.12: multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 8

Doing OLTP test.
Running mixed OLTP test
Doing read-only test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 7 times)
Done.

OLTP test statistics:
    queries performed:
        read:                            276500
        write:                           0
        other:                           39500
        total:                           316000
    transactions:                        19750  (329.05 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 276500 (4606.75 per sec.)
    other operations:                    39500  (658.11 per sec.)

Test execution summary:
    total time:                          60.0206s
    total number of events:              19750
    total time taken by event execution: 479.8237
    per-request statistics:
         min:                                 16.70ms
         avg:                                 24.29ms
         max:                                 84.53ms
         approx.  95 percentile:              26.47ms

Threads fairness:
    events (avg/stddev):           2468.7500/37.75
    execution time (avg/stddev):   59.9780/0.01

ARMv8 4Core 4GB RAM

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 8

Doing OLTP test.
Running mixed OLTP test
Doing read-only test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 7 times)
Done.

OLTP test statistics:
    queries performed:
        read:                            537012
        write:                           0
        other:                           76716
        total:                           613728
    transactions:                        38358  (639.13 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 537012 (8947.79 per sec.)
    other operations:                    76716  (1278.26 per sec.)

Test execution summary:
    total time:                          60.0161s
    total number of events:              38358
    total time taken by event execution: 479.5496
    per-request statistics:
         min:                                  5.95ms
         avg:                                 12.50ms
         max:                                 33.54ms
         approx.  95 percentile:              13.75ms

Threads fairness:
    events (avg/stddev):           4794.7500/46.71
    execution time (avg/stddev):   59.9437/0.00

 

Results ARMv8 against published ARMv7

                    CPU Single Core   CPU Quad Core    Disk I/O   MySQL Transactions   MySQL read/write
                    Total Time (s)    Total Time (s)   Mb/sec     per sec              requests per sec

ARMv7               685.512           171.3962         16.347     467.07               6539.00
ARMv8 2C 2GB RAM    29.9082           14.9632          29.172     329.05               4606.75
ARMv8 4C 4GB RAM    29.8729           7.5198           28.462     639.13               8947.79


So the CPU and Disk I/O results show clear performance increases (the ARMv8 CPU even completes the prime calculations faster than the DO droplet Martin Rusev used). However, as you can see, the 2-core ARMv8 shows a performance decrease against the ARMv7 in the MySQL results. I would imagine this is because the four cores of the ARMv7 give it an advantage under the 8-thread OLTP load that the ARMv8 2C's faster but fewer cores can't make up.
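For a rough sense of scale, it's worth turning the results table into ratios. This is just a sanity-check calculation on the numbers above, not part of the benchmark run itself;

```shell
# Ratios computed from the results table, with the ARMv7 as the baseline.
awk 'BEGIN {
    printf "CPU single core: %.1fx faster\n", 685.512  / 29.9082   # ARMv8 2C vs ARMv7
    printf "CPU quad core:   %.1fx faster\n", 171.3962 / 7.5198    # ARMv8 4C vs ARMv7
    printf "Disk I/O:        %.1fx the throughput\n", 29.172 / 16.347
}'
```

So the prime calculation is over 20x faster, while disk throughput is not quite doubled.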

Conclusion

Well, after spending a week last month migrating my production webserver from a 1GB single-core DO droplet to a 2GB single-core Linode VM, I am now wondering if it would be beneficial to go for one of Scaleway's 4C ARMv8 machines with 8GB RAM, which costs the same as the Linode VM.

My only real concern for production is support and stability, as I'm new to Scaleway and they don't have the best reviews yet. My own brief interactions with them haven't been smooth sailing either: they locked an account I'd set up a few years ago and faffed around telling me I couldn't use that email address anymore and would have to use a new one. After escalating the support thread to a manager it was resolved. I'm sure these customer-service issues will iron themselves out in the future.

In the meantime I am going to stick with this small ARMv8 and play around with it. Personally I really hope these micro dedicated servers become the norm. No more noisy neighbours and great performance for the price. 

Installing bcl2fastq from source without bonkers symlinking or copying libraries to other directories.

 

During the final year of my PhD (when I was also working as the IBERS and IMAPS HPC SysAdmin), my life consisted of installing software into a module-type system. HPC environments have multiple pieces of software installed, often the same software in several versions, and you fundamentally do not use a package manager for user software as it'll cause you a world of pain. One annoyance I always found was that when you look for help with a piece of software that fails to compile, you get an answer that boils down to;

yum install PACKAGE 

or

apt-get install PACKAGE

And this happened today. bcl2fastq, the Illumina demultiplexing software, is all kinds of fun and games (USING MAKE FILES TO DEMULTIPLEX RAW DATA???? WHY?????). The install instructions are your usual ./configure and make... so you do your ./configure and all is well until you get the following error...

boost-1_44_0 installed successfully
-- Successfuly built boost 1.44.0 from the distribution package...
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Using unsigned short
-- Check if the system is big endian - little endian
-- Looking for floorf
-- Looking for floorf - found
-- Looking for round
-- Looking for round - found
-- Looking for roundf
-- Looking for roundf - found
-- Looking for powf
-- Looking for powf - found
-- Looking for erf
-- Looking for erf - found
-- Looking for erf
-- Looking for erf - found
-- Looking for erfc
-- Looking for erfc - found
-- Looking for erfc
-- Looking for erfc - found
CMake Error at cmake/cxxConfigure.cmake:74 (message):
  No support for gzip compression
Call Stack (most recent call first):
  c++/CMakeLists.txt:33 (include)


-- Configuring incomplete, errors occurred!
Couldn't configure the project:

/software/testing/bcl2fastq/1.8.4/build/bootstrap/bin/cmake -H"/software/testing/bcl2fastq/1.8.4/src/bcl2fastq/src" -B"/software/testing/bcl2fastq/1.8.4/build" -G"Unix Makefiles"  -DCASAVA_PREFIX:PATH=/software/testing/bcl2fastq/1.8.4/x86_64 -DCASAVA_EXEC_PREFIX:PATH= -DCMAKE_INSTALL_PREFIX:PATH=/software/testing/bcl2fastq/1.8.4/x86_64 -DCASAVA_BINDIR:PATH= -DCASAVA_LIBDIR:PATH= -DCASAVA_LIBEXECDIR:PATH= -DCASAVA_INCLUDEDIR:PATH= -DCASAVA_DATADIR:PATH= -DCASAVA_DOCDIR:PATH= -DCASAVA_MANDIR:PATH= -DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo

Moving CMakeCache.txt to CMakeCache.txt.removed

My first thought was to quickly check whether we have zlib installed on the HPC as a module, and we don't. Fair enough. I then wondered why there wasn't a libz library already installed by the OS; there is, but it seems to differ between the software node (a node dedicated to installing software so as not to annoy folk on the login node) and the compute nodes. So pointing to /lib64 would probably not work (it might if bcl2fastq builds a static binary, I've not checked).

[[email protected]]$ locate libz
/lib64/libz.so.1
/lib64/libz.so.1.2.3
[[email protected]]$ locate libz
-bash: locate: command not found
[[email protected]]$ echo "grrr"
grrr
[[email protected]]$ ls -lath /lib64/libz*
lrwxrwxrwx 1 root root  13 Dec 13 14:09 /lib64/libz.so.1 -> libz.so.1.2.7
-rwxr-xr-x 1 root root 89K Nov  5 18:09 /lib64/libz.so.1.2.7

Okay so after a quick google I get the same old hacky responses;

https://biogist.wordpress.com/2012/10/23/casava-1-8-2-installation/

https://www.biostars.org/p/11202/

http://seqanswers.com/forums/showthread.php?t=11106

My next thought was: how about I just build zlib from source? So,

[[email protected]]$ source git-1.8.1.2
[[email protected]]$ which git
[[email protected]]$ git clone https://github.com/madler/zlib.git

YES, EVEN GIT HAS VERSIONS!!! yum/apt isn't the answer to everything!

[[email protected] ]$ cd zlib/
[[email protected] ]$ ./configure
[[email protected] ]$ make -j4
[[email protected] ]$ ls -lath libz*s*
lrwxrwx--- 1 martin JIC_c1 14 Jan 27 16:11 libz.so.1 -> libz.so.1.2.11
lrwxrwx--- 1 martin JIC_c1 14 Jan 27 16:11 libz.so -> libz.so.1.2.11
-rwxrwx--x 1 martin JIC_c1 103K Jan 27 16:11 libz.so.1.2.11

And that's great. We now have our libraries compiled; I just need to let my bash shell know where they are;

export LIBRARY_PATH=/software/testing/bcl2fastq/1.8.4/lib/zlib

and then back into my bcl2fastq build directory, rerun ./configure --prefix=/where/the/bins/go and it compiled.
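One caveat worth noting (an assumption on my part — I haven't checked how bcl2fastq links against libz): LIBRARY_PATH only affects gcc/ld at build time. If the resulting binaries end up dynamically linked against this libz, the runtime loader will need LD_LIBRARY_PATH pointing at the same directory;

```shell
# Link-time search path for gcc/ld:
export LIBRARY_PATH=/software/testing/bcl2fastq/1.8.4/lib/zlib
# Run-time search path for ld.so, needed if libz is dynamically linked:
export LD_LIBRARY_PATH=/software/testing/bcl2fastq/1.8.4/lib/zlib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
```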

All done without a yummy apt....it's Friday and I need to go home.

Weird awk error when messing around with making a GFF from a TXT file

 

A strange error when messing around with a text file created from STATA on Windows. When using awk to create the 9-column GFF file for use with SignalMap, awk goes a little weird. The original data looks like this;

CHR1	101	.17989999
CHR1	151	.083400011
CHR1	301	-.125
CHR1	451	0
CHR1	501	.16670001
CHR1	601	.69999999
CHR1	651	.33329999
CHR1	751	.75
CHR1	801	0
CHR1	901	.25099999

And when you try to create a GFF file using awk, it goes weird like this;

[[email protected]]$ head Sample.txt |awk '{if($3>=0) print $1"\t.\tSAMPLE\t"$2"\t"$2+49"\t"$3"\t.\t.\t."}'
CHR1	.	.AMPLE	.01	150	.17989999
CHR1	.	.AMPLE	.51	200	.083400011
CHR1	.	.AMPLE	.51	500	0
CHR1	.	.AMPLE	.01	550	.16670001
CHR1	.	.AMPLE	.01	650	.69999999
CHR1	.	.AMPLE	.51	700	.33329999
CHR1	.	.AMPLE	.51	800	.75
CHR1	.	.AMPLE	.01	850	0
CHR1	.	.AMPLE	.01	950	.25099999

After 10 minutes of banging my head on the table I realised it was probably something to do with Windows/Unix line endings: the file has \r\n line endings, so awk leaves a carriage return on the end of $3, and when that \r is printed mid-line the cursor jumps back to column 1 and the trailing "\t." fields overwrite the start of the line (hence ".AMPLE" and ".01"). So this solved it;

[[email protected]]$ dos2unix -n sample.txt sample_new.txt
[[email protected]]$ head sample_new.txt |awk '{if($3>=0) print $1"\t.\tSAMPLE\t"$2"\t"$2+49"\t"$3"\t.\t.\t."}'
CHR1	.	SAMPLE	101	150	.17989999	.	.	.
CHR1	.	SAMPLE	151	200	.083400011	.	.	.
CHR1	.	SAMPLE	451	500	0	.	.	.
CHR1	.	SAMPLE	501	550	.16670001	.	.	.
CHR1	.	SAMPLE	601	650	.69999999	.	.	.
CHR1	.	SAMPLE	651	700	.33329999	.	.	.
CHR1	.	SAMPLE	751	800	.75	.	.	.
CHR1	.	SAMPLE	801	850	0	.	.	.
CHR1	.	SAMPLE	901	950	.25099999	.	.	.
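As an aside, you don't strictly need dos2unix for this; awk can strip the stray carriage return itself before the record is split into fields. A minimal sketch (the single input line is typed in here for illustration);

```shell
# Strip the trailing \r from each record, then build the GFF line as before.
printf 'CHR1\t101\t.17989999\r\n' |
awk '{ sub(/\r$/, "") }
     $3 >= 0 { print $1"\t.\tSAMPLE\t"$2"\t"$2+49"\t"$3"\t.\t.\t." }'
```

The sub() on $0 triggers awk to re-split the line into fields, so $3 no longer carries the carriage return.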

Getting SignalMap GFF files into IGV

You can already load a GFF file in IGV


This will allow you to load your file into IGV; however, it will be slow, especially with many tracks. So you may wish to convert the GFF to a bedgraph file, and then from there create a TDF file.

Convert GFF into a bedgraph file

cat input.gff | awk '{print $1"\t"$4-1"\t"$5"\t"$6}' > output.bed
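The $4-1 is there because GFF coordinates are 1-based and inclusive, while BED/bedgraph starts are 0-based and half-open. A quick illustration on a single made-up GFF record;

```shell
# The 1-based GFF interval [101,150] becomes the 0-based bedgraph interval [100,150).
printf 'CHR1\t.\tSAMPLE\t101\t150\t.17989999\t.\t.\t.\n' |
awk '{print $1"\t"$4-1"\t"$5"\t"$6}'
# → CHR1    100     150     .17989999
```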

and then convert this bed file to a TDF file, which can be done using igvtools.

In IGV, go to the 'Tools' -> 'Run igvtools' menu at the top and you will get the following box;

[screenshot: the 'Run igvtools' dialog box]

Ensure the 'Command' is set to "toTDF", set the 'Input File' to your bedfile, the 'Output File' will be filled in automagically (unless you want to change it) and then set the 'Zoom Levels' to 10.

This will create a new tdf file which is much smaller than the bedfile and contains the histogram you would have had in SignalMap.

NOTE: Still to write up

  • Give visuals of signal map/IGV
  • Command-line method of doing this

 
