Benchmarking on the Helios4
2021-11-08 · Helios4
If you don’t know about the Helios4, you can read about it here.
First of all, a lot of things can go wrong during benchmarking (I really recommend watching these 5 minutes of a lightning talk about benchmarking). Hence, take the results with a grain of salt. While I’m not doing any fancy calculations here, I am still not an expert. During the tests the Helios4 was always idle, or at least I did not do anything else on it. For instance, a cronjob could have run during a benchmark, but I don’t think that would have a big impact. Furthermore, I ran the tests (with bonnie++) multiple times.
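If you want to rule out background load more rigorously than I did, watching the machine during a run helps. A minimal sketch (iostat is part of the sysstat package):
# Check that nothing else is keeping the machine busy:
uptime
# Watch per-device utilization in a second terminal while the benchmark runs:
iostat -x 5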
The Helios4 supports four drives, but I currently use it with two 4 TB drives.
At the end of this post, you’ll find a table summarizing the results.
Running OMV with EXT4 on LVM on RAID 1
For the first few tests, I actually installed OpenMediaVault (Helios4 Wiki), as I had never used it before and wanted to try it out. I literally followed the instructions in the official Kobol wiki, which meant:
- setting up my RAID 1 (mirror)
- installing the LVM plugin and creating the volumes
- creating the EXT4 filesystem
I installed and activated SMB as well, and after I was finally able to connect to it (permission things…), I used dd to push some data over the network to get a feeling for the speed.
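The prompts below show a gvfs path, i.e. the share was mounted through GNOME’s gvfs; a rough command-line equivalent (the helios4 host and data share names match my setup):
# Mount the SMB share via gvfs:
gio mount smb://helios4/data
# The mount then appears under /run/user/$UID/gvfs/, as in the prompts below.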
It’s important to note that I did not change any configuration or try any optimizations. This was run with the default settings.
SMB Performance
As you can see in the command, I originally wanted to copy 10 GB, but aborted at 3.1 GB, which took 160 s at a rate of about 20 MB/s. Using the same command to read the written data back resulted in about the same speed.
w=20MB/s, r=20MB/s
╭─max@host /run/user/1000/gvfs/smb-share:server=helios4,share=data/test
╰─$ dd if=/dev/zero of=test.img bs=1M count=10000 status=progress
3283091456 bytes (3,3 GB, 3,1 GiB) copied, 160 s, 20,5 MB/s^C
3132+0 records in
3132+0 records out
3284140032 bytes (3,3 GB, 3,1 GiB) copied, 160,051 s, 20,5 MB/s
╭─max@host /run/user/1000/gvfs/smb-share:server=helios4,share=data/test
╰─$ dd if=test.img of=/dev/null bs=1M status=progress
3275751424 bytes (3,3 GB, 3,1 GiB) copied, 166 s, 19,7 MB/s
3132+0 records in
3132+0 records out
3284140032 bytes (3,3 GB, 3,1 GiB) copied, 166,45 s, 19,7 MB/s
Copying a 15 GB video file from my computer onto the Helios4 took almost exactly 13 minutes, which matches the rate shown by dd: 15000 MB / (13*60 s) = 19.2 MB/s.
I was a little confused about this speed. The Helios4 has a gigabit Ethernet port, and so do my computer and all other involved components. Theoretically, this means a throughput of about 125 MB/s. I did not expect to reach this speed in practice, but only 20 MB/s seemed rather low.
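In hindsight, a raw network measurement would have helped to separate the network from the protocol overhead. A sketch with iperf3 (I did not record such a run for this post):
# On the Helios4:
iperf3 -s
# On the client, measuring plain TCP throughput with no filesystem involved:
iperf3 -c helios4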
SSHFS Performance
Next, I tried SSHFS. You don’t need anything special; the only requirement is a running SSH daemon, and ssh is running on all my servers anyway.
Again, I did not change any configuration or make any optimizations. This was run with the default settings you get when you mount a folder with SSHFS.
The usage of sshfs is sshfs [user@]host:[dir] mountpoint [options]. The directory on the Helios4 is mounted to ~/test locally, which you can achieve with:
sshfs helios4:/srv/dev-disk-by-uuid-dfb36876-37ed-4860-9a86-ecf608a3d986/data/test ~/test/
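To unmount it again later (the binary is called fusermount3 with FUSE 3; older systems ship fusermount):
fusermount -u ~/test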
I used the same dd command but with count=1000 so that 1 GB is transferred, because I did not want to wait for 10 GB.
w=29MB/s, r=31MB/s
╭─max@host ~/test
╰─$ dd if=/dev/zero of=test.img bs=1M count=1000 status=progress
1044381696 bytes (1,0 GB, 996 MiB) copied, 35 s, 29,6 MB/s
1000+0 records in
1000+0 records out
1048576000 bytes (1,0 GB, 1000 MiB) copied, 35,5097 s, 29,5 MB/s
╭─max@host ~/test
╰─$ dd if=test.img of=/dev/null bs=1M status=progress
1023410176 bytes (1,0 GB, 976 MiB) copied, 33 s, 31,0 MB/s
1000+0 records in
1000+0 records out
1048576000 bytes (1,0 GB, 1000 MiB) copied, 33,8286 s, 31,0 MB/s
I also transferred the same 15 GB video file as above, which took about 8 minutes: 15000 MB / (8*60 s) = 31 MB/s. That’s definitely better than 19.2 MB/s.
So roughly, SSHFS seems to be about 10 MB/s faster than SMB. Interesting. This piqued my interest to actually do some benchmarking on the Helios4 itself. Up until now, I only wanted to get a feel for the transfer speeds I could expect.
Test on the Helios4 locally
I connected to the Helios4 with ssh and ran basically the same dd test as before, this time with a 10 GB file (count=10000). This shows the write and read speeds of the disks when no network is involved.
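One caveat: data that was just written may still sit in the page cache, so a subsequent read can be served from RAM. The file is much larger than the 2 GB of RAM, so the effect should be small here, but to be safe the caches can be dropped between the write and the read test (a sketch, not part of my original runs):
sync
# Drop the page cache, dentries and inodes so the read really hits the disks:
echo 3 | sudo tee /proc/sys/vm/drop_caches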
w=158MB/s, r=212MB/s
╭─max@helios4 /srv/dev-disk-by-uuid-dfb36876-37ed-4860-9a86-ecf608a3d986/data/test
╰─$ dd if=/dev/zero of=test.img bs=1M count=10000 status=progress
5071962112 bytes (5.1 GB, 4.7 GiB) copied, 32 s, 158 MB/s^C
4944+0 records in
4944+0 records out
5184159744 bytes (5.2 GB, 4.8 GiB) copied, 32.7079 s, 158 MB/s
╭─max@helios4 /srv/dev-disk-by-uuid-dfb36876-37ed-4860-9a86-ecf608a3d986/data/test
╰─$ dd if=test.img of=/dev/null bs=1M status=progress
5084545024 bytes (5.1 GB, 4.7 GiB) copied, 24 s, 212 MB/s
4944+0 records in
4944+0 records out
5184159744 bytes (5.2 GB, 4.8 GiB) copied, 24.5122 s, 211 MB/s
Great! Writing happens at about 150 MB/s and reading at about 210 MB/s. This shows that disk I/O is definitely not the limiting factor. Now, let’s do a test with a proper tool: bonnie++.
Test with Bonnie++
With bonnie++ (website) you can test the performance of your filesystem and hard drives. The manpage tells you everything you need to know, but let’s quickly look at the options used:
- -d sets the directory for the test
- -c 1 sets the level of concurrency
- -s 4024 sets the size of the file(s) for the IO performance measures in megabytes; 4024 is twice the RAM of the Helios4 (2 GB)
- -n 1 sets the number of files for the file creation test (measured in multiples of 1024 files)
- -f specified without a parameter, this skips the per-char IO tests
- -b disables write buffering, so fsync() is called after every write
I really like the result:
w=146MB/s, rw=78MB/s, r=177MB/s
/usr/sbin/bonnie++ -d /srv/dev-disk-by-uuid-dfb36876-37ed-4860-9a86-ecf608a3d986/data/test/perform -c 1 -s 4024 -n 1 -f -b
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 146m 48 78.1m 34 177m 45 147.1 5
Latency 185ms 383ms 104ms 435ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 1672561147 0 +++++ +++ -1046580777 0 -1081091648 0 +++++ +++ -898267616 0
Latency 177ms 276us 90292us 59943us 24us 103ms
EXT4 on RAID 1
Next, I removed OpenMediaVault and built my own setup. First, I created an EXT4 filesystem directly on the RAID 1 to compare the result with the previous test. The difference is that no LVM is used anymore. Both tests were run without encryption.
The RAID 1 can be accessed via the device /dev/md0.
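Before benchmarking, it is worth verifying that the array is assembled and both mirrors are in sync, because a resync running in the background would skew the numbers:
cat /proc/mdstat
sudo mdadm --detail /dev/md0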
The following command creates the EXT4 filesystem. By default, a block size of 4096 is used. This is not relevant now, but it will be later when a LUKS container is used for encryption.
sudo mkfs.ext4 /dev/md0
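If you prefer not to rely on the defaults, the block size can be spelled out explicitly; on these drives this should be equivalent to the command above:
sudo mkfs.ext4 -b 4096 /dev/md0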
Mount and run the test.
sudo mount /dev/md0 /mnt/md0
/usr/sbin/bonnie++ -d /mnt/md0/ -c 1 -s 4024 -n 1 -f -b
Result: w=160MB/s, rw=96MB/s, r=160MB/s
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 160m 54 96.6m 39 160m 37 145.0 4
Latency 170ms 287ms 194ms 471ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 -569212263 0 +++++ +++ 1545298178 0 1958692144 0 +++++ +++ 1912692157 0
Latency 199ms 264us 116ms 119ms 40us 91260us
Second run: w=164MB/s, rw=95MB/s, r=167MB/s
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 164m 53 95.9m 38 167m 39 146.9 4
Latency 176ms 517ms 95391us 400ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 -222359118 0 +++++ +++ 1549587127 0 -1213966720 0 +++++ +++ -2058643209 0
Latency 194ms 268us 100ms 105ms 17us 89368us
Interesting! While reading seems to be a bit slower, the write and rewrite values are slightly higher than with the OMV setup.
EXT4 on LUKS on RAID 1
Instead of creating the EXT4 filesystem directly on the RAID 1 device, we will use a LUKS container to provide encryption and create an EXT4 filesystem on top of it.
It’s important to note that these results cannot be compared to the previous speed measurements because no encryption was used there. As you will see, the additional computing power for the encryption takes its performance toll.
First, let’s briefly look at the two 4 TB drives. Both /dev/sda and /dev/sdb report a physical sector size of 4096 bytes and a logical sector size of 512 bytes. Hence, it should be ensured that the LUKS container and the filesystem use a block (sector) size of 4096 to use the drives efficiently.
helios4:~:% sudo hdparm -I /dev/sda | grep 'Sector size'
Logical Sector size: 512 bytes
Physical Sector size: 4096 bytes
helios4:~:% sudo hdparm -I /dev/sdb | grep 'Sector size'
Logical Sector size: 512 bytes
Physical Sector size: 4096 bytes
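As an alternative to hdparm, lsblk reports the same information for several drives at once:
lsblk -o NAME,PHY-SEC,LOG-SEC /dev/sda /dev/sdb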
Unaligned LUKS Sector Size and EXT4 Block Size
The LUKS container was created with default options:
sudo cryptsetup luksFormat /dev/md0
By default, LUKS uses a sector size of 512 bytes. Information about a LUKS container can be read with the cryptsetup command as follows.
helios4:md0:% sudo cryptsetup status /dev/mapper/cryptroot
/dev/mapper/cryptroot is active and is in use.
type: LUKS2
cipher: aes-xts-plain64
keysize: 512 bits
key location: keyring
device: /dev/md0
sector size: 512
offset: 32768 sectors
size: 7813740160 sectors
mode: read/write
The LUKS container can be opened with sudo cryptsetup open /dev/md0 cryptroot. Thereafter, the EXT4 filesystem was created with default options, too:
sudo mkfs.ext4 /dev/mapper/cryptroot
The default block size is 4096, as can be seen with the dumpe2fs command:
helios4:md0:% sudo dumpe2fs /dev/mapper/cryptroot | grep 'Block size'
dumpe2fs 1.44.5 (15-Dec-2018)
Block size: 4096
The sector size of the LUKS container and the block size of the filesystem are not identical. This is not recommended and will likely result in a performance loss. But I wanted to know how big the difference between aligned and unaligned sector/block sizes actually is.
A friendly reminder about the speed with an unencrypted EXT4 filesystem on the RAID 1:
w=164MB/s, rw=95MB/s, r=167MB/s
Bonnie++ is run with /usr/sbin/bonnie++ -d /mnt/md0/ -c 1 -s 4024 -n 1 -f -b.
Result: w=64MB/s, rw=34MB/s, r=53MB/s
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 64.1m 20 34.1m 13 53.8m 12 150.5 5
Latency 1146ms 1184ms 73276us 361ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 -1499258492 0 +++++ +++ 1469852849 0 162904477 0 +++++ +++ -1983072567 0
Latency 193ms 281us 171ms 167ms 20us 210ms
Second run: w=64MB/s, rw=30MB/s, r=53MB/s
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 64.2m 20 30.6m 12 53.1m 12 147.4 4
Latency 1044ms 922ms 155ms 312ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 693104966 0 +++++ +++ -1357348910 0 1029287914 0 +++++ +++ 1121994707 0
Latency 230ms 270us 108ms 162ms 39us 142ms
I did a few more runs but they all yielded the same results. As you can see, the speed drops significantly.
Aligned LUKS Sector Size and EXT4 Block Size
The following setup uses the correct sector size of 4096 for the LUKS container, which slightly improves the benchmark results.
The commands to create the setup:
sudo cryptsetup luksFormat --sector-size 4096 /dev/md0
sudo cryptsetup open /dev/md0 cryptroot
Now, the sector size is 4096:
helios4:~:% sudo cryptsetup status /dev/mapper/cryptroot
/dev/mapper/cryptroot is active.
type: LUKS2
cipher: aes-xts-plain64
keysize: 512 bits
key location: keyring
device: /dev/md0
sector size: 4096
offset: 32768 sectors
size: 7813740160 sectors
mode: read/write
Create the EXT4 filesystem.
sudo mkfs.ext4 /dev/mapper/cryptroot
sudo dumpe2fs /dev/mapper/cryptroot | grep 'Block size'
The block size of EXT4 is again 4096 (same as last time).
sudo mount /dev/mapper/cryptroot /mnt/md0
/usr/sbin/bonnie++ -d /mnt/md0/ -c 1 -s 4024 -n 1 -f -b
Running the test with bonnie++ returns the following result, which is indeed a few megabytes per second better than previously. In my opinion, the encryption is worth it, and I’ll happily take the performance trade-off.
w=73MB/s, rw=37MB/s, r=59MB/s
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 73.3m 23 37.4m 14 59.4m 14 143.5 4
Latency 728ms 748ms 97800us 387ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 1961043105 0 +++++ +++ -1124638686 0 -110386393 0 +++++ +++ 1338972754 0
Latency 244ms 288us 148ms 210ms 25us 186ms
Second run: w=73MB/s, rw=32MB/s, r=59MB/s
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 73.4m 23 32.2m 13 59.5m 14 152.3 3
Latency 694ms 618ms 78881us 397ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 289626538 0 +++++ +++ -1096656038 0 1782974649 0 +++++ +++ 665618027 0
Latency 173ms 260us 110ms 170ms 24us 102ms
Encryption Offloaded to the CESA Unit
Then I remembered that the Helios4 has a CESA unit (Cryptographic Engines and Security Accelerator), which can be used to offload encryption. I thought that using the CESA unit would improve the benchmark results further, but the results were quite disappointing.
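Before touching the disks at all, cryptsetup’s built-in benchmark can compare the ciphers in isolation (these numbers measure only the crypto layer in memory, not the full storage stack):
# The XTS default versus the CBC cipher the CESA unit can offload:
sudo cryptsetup benchmark --cipher aes-xts-plain64 --key-size 512
sudo cryptsetup benchmark --cipher aes-cbc-essiv:sha256 --key-size 256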
While creating the LUKS container, the cipher aes-cbc-essiv:sha256 must be specified to use the unit. This can be achieved with the following command. We keep the correct sector size, too.
sudo cryptsetup -c aes-cbc-essiv:sha256 luksFormat --sector-size 4096 /dev/md0
Open the container.
sudo cryptsetup open /dev/md0 cryptroot
Verify cipher and sector size.
helios4:~:% sudo cryptsetup status /dev/mapper/cryptroot
/dev/mapper/cryptroot is active.
type: LUKS2
cipher: aes-cbc-essiv:sha256
keysize: 256 bits
key location: keyring
device: /dev/md0
sector size: 4096
offset: 32768 sectors
size: 7813740160 sectors
mode: read/write
Create filesystem.
sudo mkfs.ext4 /dev/mapper/cryptroot
Check block size.
helios4:~:% sudo dumpe2fs /dev/mapper/cryptroot | grep 'Block size'
dumpe2fs 1.44.5 (15-Dec-2018)
Block size: 4096
Let’s mount the partition and start the test!
sudo mount /dev/mapper/cryptroot /mnt/md0
/usr/sbin/bonnie++ -d /mnt/md0/ -c 1 -s 4024 -n 1 -f -b
w=48MB/s, rw=29MB/s, r=60MB/s
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 48.8m 16 29.9m 12 60.7m 14 145.8 5
Latency 1150ms 1852ms 58338us 347ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 -2068442525 0 +++++ +++ 1658309631 0 -747844300 0 +++++ +++ 614890684 0
Latency 171ms 262us 155ms 173ms 27us 139ms
Second run: w=48MB/s, rw=27MB/s, r=59MB/s
Version 1.98 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
helios4 4024M 48.6m 16 27.5m 11 59.4m 13 147.9 5
Latency 1193ms 1148ms 93477us 356ms
Version 1.98 ------Sequential Create------ --------Random Create--------
helios4 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
1 -932939759 0 +++++ +++ 827095864 0 1905523812 0 +++++ +++ -1495813491 0
Latency 181ms 270us 214ms 238ms 21us 140ms
Well, the result is not very convincing. With the encryption offloaded to the CESA unit, writing and rewriting are noticeably slower than with aes-xts-plain64 on the CPU, while reading stays about the same.
Table of Results with Bonnie++
All setups use a RAID 1 and the EXT4 filesystem with a block size of 4096.
The exception is the EXT4 filesystem created by OMV. I’m not sure which block size was used, as I created it with the OMV web GUI and didn’t check it in the terminal. However, I assume that the default block size of 4096 was used.
Nonetheless, if a block size other than 4096 was used, this might explain the worse performance of the OMV setup (EXT4 on LVM on RAID 1) compared to plain EXT4 on the RAID 1; but perhaps it is just the additional indirection layer of LVM?
Setup | LUKS sector size | encryption | write MB/s | read-write MB/s | read MB/s
---|---|---|---|---|---
OMV - EXT4 on LVM | no LUKS | none | 146 | 78 | 177
EXT4 | no LUKS | none | 160 | 96 | 160
EXT4 on LUKS | 512 | aes-xts-plain64 | 64 | 34 | 53
EXT4 on LUKS | 4096 | aes-xts-plain64 | 73 | 37 | 59
EXT4 on LUKS (CESA) | 4096 | aes-cbc-essiv:sha256 | 48 | 29 | 60
Regarding the CESA unit, there is an HTTPS benchmark in the official wiki.
Out of Scope
There are many aspects that affect the measurement itself as well as how you evaluate the results. For instance, I did not look at energy usage or CPU usage, whereas the HTTPS benchmark takes CPU usage into account.
Personally, I settled on the RAID 1 with an EXT4 filesystem on a LUKS container (with the correct sector size, of course). This provides encryption, which is a must-have for me, and reasonable speed. Furthermore, no software other than ssh is required, and that is already running anyway.
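For completeness, a sketch of how this setup can be made persistent across reboots; the names and mount point match the ones used above, and the none keyfile field means the passphrase is asked for interactively at boot:
# /etc/crypttab (target  source  keyfile  options):
cryptroot /dev/md0 none luks
# /etc/fstab:
/dev/mapper/cryptroot /mnt/md0 ext4 defaults 0 2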