Filesystem performance on SSD

(07.Nov.2011)


Introduction

I have been using nilfs2 for more than a year on the SSD in my notebook - I'm very happy with it and have had zero problems, even after switching off the notebook without a proper shutdown.

I don't want (or have to) squeeze every single byte out of the FS, so even if the performance is not extreme (it makes no big difference whether the throughput is 150 or 200MB/s when starting the browser), I am more than happy to be able to use nilfs2's continuous snapshots, which have already helped me more than once to recover files deleted by mistake.
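
For the curious: such a recovery basically means mounting an old checkpoint of the filesystem next to the live one. A minimal sketch using the standard nilfs-utils commands (device, checkpoint number and paths are just placeholders):

lscp /dev/sdb1                                        # list the available checkpoints/snapshots
chcp ss /dev/sdb1 1234                                # turn the interesting checkpoint into a snapshot
mount -t nilfs2 -o ro,cp=1234 /dev/sdb1 /mnt/snapshot # mount it read-only next to the live filesystem
cp /mnt/snapshot/path/to/deleted-file ~/              # copy the lost file back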

In any case, as I was wondering what the current situation was among the most important filesystems supported by Linux, I did a simple benchmark of all of them.

The final objectives of my tests were to find out...
1) if any filesystem has major problems with my SSD and...
2) if the performance of nilfs2 was still acceptable compared to the other filesystems.

The scope was not to do something exhaustive, but just to cover the usual cases I am confronted with when dealing with Gentoo: having to deal with a lot of small files (e.g. when refreshing the Portage tree, which is Gentoo's main package repository, or when packages are uncompressed before compilation) and having to deal with big files (e.g. when I create a file to be used as a filesystem image for a virtual OS).
I assume that the read performance is at least as good as the write performance, so I benchmarked only writes.
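
If you want a rough sanity check of the raw read side anyway, hdparm gives a quick number without any benchmarking setup (just a sketch, this was not part of my tests):

hdparm -t /dev/sdb   # buffered sequential read timing; repeat a few times for a stable value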

Please don't take this as a general example of how these filesystems perform - these were simple tests done on an SSD intended for general usage.
I do know that the results look very different when e.g. changing from an SSD to a normal HDD, a USB stick or a CompactFlash card, or when focusing on storage used by databases. E.g. have a look here for results of such tests performed on a USB stick.
Even the method of writing the small files to the SSD from a zip file can be seen as flawed, as the unzipping itself probably slows down the write performance.

And again, as mentioned here: in my case these simple tests are the ones that count the most - the ones described below give me a quite good baseline for understanding how a filesystem performs. I don't have to benchmark fancy databases queried by millions of users, block sizes I don't want to use, etc.


Hardware

The base hardware I used to perform these tests is quite old:

  • Motherboard: Asus P5Q (ICH10 controller), SATA-2
  • CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
  • RAM: 4GB
  • SSD: Crucial m4 SSD 128GB, SATA-3

As you can see I only have a SATA-2 controller, so the results might look different when using a SATA-3 controller.

Why did I choose an SSD with a Marvell controller instead of the super-famous Sandforce?
Well, it's not the first time I've bought an SSD - the first was about a year ago, and both times I kept reading about people complaining that their Sandforce-2 drives had stopped working.
We can debate whether the Sandforce-2 community is just more vocal about problems, but in any case I didn't feel like giving it a try, so in the end I always bought products with a different controller (a "Corsair Nova V128 SSD MLC" using an Indilinx controller and now the "Crucial m4 SSD" using a Marvell controller).


Software

The Linux kernel I used was version 3.1.0 on a Gentoo Linux distribution:

Linux ssdtest 3.1.0-gentoo #1 SMP Sun Nov 6 14:23:27 CET 2011 x86_64 Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz GenuineIntel GNU/Linux

The CPU was kept at its maximum of 2.40GHz the whole time with the following:

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor

The versions of the filesystem utilities that I used were:

sys-fs/e2fsprogs-1.41.14
sys-fs/xfsprogs-3.1.4
sys-fs/nilfs-utils-2.0.21
sys-fs/jfsutils-1.1.15
sys-fs/reiserfsprogs-3.6.21
sys-fs/btrfs-progs-0.19-r3

For the test using the big file I prepared 2GB of data coming from an image of the openSUSE installation DVD:

dd if=openSUSE-11.0-DVD-x86_64.iso of=/root/openSUSE-11.0-DVD-x86_64.iso bs=16k count=131072

For the test using small files I prepared a zip file containing 22683 directories and 126229 small files using Gentoo's package repository ("Portage"):

zip -r /root/dacanc/zipped.zip *

The OS was installed on a normal HDD and the SSD partition was empty, excluding the files I was writing for the tests.

The 128GB SSD was partitioned using only the first 100GB, leaving the remaining space free for whatever the controller wants to do to reduce cell wear:

Disk /dev/sdb: 128.0 GB, 128035676160 bytes
76 heads, 13 sectors/track, 253106 cylinders, total 250069680 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xd91f7e36

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        2048   209717247   104857600   83  Linux
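
For reference, a layout like the one above (a single ~100GiB partition, rest left unpartitioned) can also be created non-interactively, e.g. with parted - just a sketch, not necessarily the way this partition was created:

parted -s /dev/sdb mklabel msdos              # new MBR partition table
parted -s /dev/sdb mkpart primary 1MiB 100GiB # one ~100GiB partition starting at 1MiB
parted -s /dev/sdb set 1 boot on              # boot flag as shown in the listing above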


Test scripts

The tests I performed were very simple; a combined sketch of one full run follows the list.

  • Write the 2GB file
    • cat openSUSE-11.0-DVD-x86_64.iso > /dev/null
(this caches the file in the 4GB of RAM and therefore prevents the slower HDD acting as source from making the results worse)
    • time cp -v openSUSE-11.0-DVD-x86_64.iso /mnt/memstick/ && time sync
      The results of the two "time" commands were added up.
      For the "Performance" results I added up the result of the "real" line returned by the "time command.
      For the "CPU" results I added up the results of the "user" and "sys" lines.
  • Overwrite the 2GB file
    Same as above.
  • Write the 22683 dirs and 126229 small files
    • cat zipped.zip > /dev/null
      Again, to cache the source file.
    • time unzip -q -o zipped.zip -d /mnt/memstick/ && time sync
You can see that I'm using the "-q" flag so that the slow console output while unpacking does not drag down the performance.
      Otherwise the same as when running the test with the 2GB-file.
  • Overwrite the 22683 dirs and 126229 small files
    Same as above
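
Put together, one run against a freshly formatted and mounted filesystem looked more or less like the sketch below (paths and mount point as above; the overwrite tests are simply a second run of the same commands):

#!/bin/bash
# warm the page cache so the slower source HDD does not influence the results
cat /root/openSUSE-11.0-DVD-x86_64.iso > /dev/null
cat /root/dacanc/zipped.zip > /dev/null

# big-file test: the "real" times of cp and sync are added up for the
# performance result, the "user"+"sys" times for the CPU result
time cp -v /root/openSUSE-11.0-DVD-x86_64.iso /mnt/memstick/ && time sync

# small-files test: -q keeps the console output from slowing down the run
time unzip -q -o /root/dacanc/zipped.zip -d /mnt/memstick/ && time sync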

Format and mount options

  • ext2:     format: default;     mount: noatime,async
  • ext3:     format: -E discard;  mount: noatime,async,barrier=0,data=writeback
  • ext4:     format: -E discard;  mount: noatime,async,nobarrier,discard
  • btrfs:    format: default;     mount: noatime,async,ssd
  • jfs:      format: default;     mount: noatime,async
  • nilfs:    format: default;     mount: noatime,async,discard
  • reiserfs: format: default;     mount: noatime,async,barrier=none
  • xfs:      format: default;     mount: noatime,async,nobarrier,discard
  • ntfs:     format: default;     mount: default
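
As a concrete example, the ext4 runs were therefore prepared more or less like this (a sketch based on the options above, with /dev/sdb1 and /mnt/memstick as in the rest of the article):

mkfs.ext4 -E discard /dev/sdb1
mount -o noatime,async,nobarrier,discard /dev/sdb1 /mnt/memstick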

Performance benchmark (seconds)

Test/Filesystem                               ext2  ext3  ext4  btrfs  jfs  nilfs  reiserfs  xfs  ntfs
Write 2GB file                                  12    11    11     12   11     16        12   12    40
Overwrite 2GB file                              12    12    12     13   11     18        12   12    40
Write 22683 dirs and 126229 small files         11    12    12     17   13     13        15   13    78
Overwrite 22683 dirs and 126229 small files      9    12    13     20   14     14        17  139    86

All the Ext-filesystems performed well.

Sadly, Btrfs still did not surprise me with anything and was a bit slow when overwriting the dirs & small files (this happened every time I repeated the test).
You'll have to need one of its special features to have a reason to pick it.

JFS showed no special flaws.
Unfortunately it doesn't seem to have any special features either, so you might end up using it only if required by some special application or hardware.

NILFS2 was a bit slow when writing and overwriting the 2GB file.

ReiserFS was alright.

XFS had some major problems when overwriting the dirs & small files.
I reran the test again and again, also with different mount options, but I just didn't manage to improve the result.
Once I was done with the tests I deleted all the files manually instead of reformatting the drive for the next filesystem - the deletion was extremely slow as well, and I had to wait for more or less the same amount of time.
No clue why it behaves like this... .

NTFS:
Ok, I was curious and wanted to check whether my subjective feeling of Windows being slow was correct (what shall I say :oP ... ), so I transferred the zip file and the 2GB file to a Win7 PC (same motherboard + RAM + CPU, but clocked at 2.8GHz instead of 2.4), re-partitioned and formatted the SSD with Windows' default settings and ran the tests.
To run the tests I used "xcopy" for both the small files (I unpacked the zip file to a HDD in advance - unpacking it directly with WinZip was slower) and the 2GB file - I didn't find any faster method.
I reran the tests several times (after deleting the files on the SSD and clearing the recycle bin) but the performance was always a disaster.
Windows-pros: you can blame me, but only if you tell me how to make this faster.


CPU time (seconds)

Test/Filesystem                               ext2   ext3   ext4   btrfs   jfs    nilfs  reiserfs  xfs     ntfs
Write 2GB file                                2.513  4.091  2.629  2.4     2.604  3.512   4.737    2.163   n/a
Overwrite 2GB file                            2.679  4.495  3.042  2.55    2.864  4.609   5.385    2.404   n/a
Write 22683 dirs and 126229 small files       7.351  8.445  8.303  11.117  8.485  7.789  12.838    9.948   n/a
Overwrite 22683 dirs and 126229 small files   7.303  8.95   9.424  15.546  9.314  8.346  16.086    13.253  n/a

Everything OK, with the exception of Btrfs and ReiserFS - you might think twice before using these two filesystems in combination with a weak CPU.

No clue what Btrfs is computing.

ReiserFS is the one that uses the most CPU when dealing with small files.
The reason might be the small amount of storage the small files end up using - see the next results.


Used space (GBs)

Test/Filesystem                                              ext2   ext3   ext4   btrfs  jfs    nilfs  reiserfs  xfs    ntfs
Initial, just after format                                   0.058  0.183  0.183  0      0.012  0.015  0.032     0.032  n/a
After writing 2GB file                                       2.1    2.2    2.2    2.1    2.1    2.1    2.1       2.1    n/a
After writing 2GB file + 22683 dirs and 126229 small files   2.8    2.9    2.9    2.7    2.7    2.8    2.3       2.7    n/a

No big differences here, with the exception of ReiserFS which is the clear winner when dealing with small files.

Quoting the man-page: "By default, reiserfs stores small files and `file tails' directly into  its tree."
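
If you want to reproduce numbers like the ones in the table above, a plain df before and after writing the test data is enough - just one way of measuring it, the exact values will of course differ:

df -BM /mnt/memstick                                          # right after mounting
unzip -q -o /root/dacanc/zipped.zip -d /mnt/memstick/ && sync
df -BM /mnt/memstick                                          # after writing the small files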


Conclusions

  • I can still stick with Nilfs2 as it did not show any major problems.
    I like its feature of the continuous snapshots and the speed it writes to the SSD is more than enough for general usage.
    It's the only filesystem that writes across the whole storage space of the partition in a round-robin way, deleting the oldest data when space runs out, and is therefore perfect for reducing cell wear on the SSD (hoping that, combined with what the SSD's controller does, it doesn't achieve exactly the opposite :o) )
  • XFS has serious problems when overwriting and deleting many small directories and/or files but otherwise it's OK.
    To be used only under special circumstances - e.g. I use it on a 2TB sw-raid5 in combination with GlusterFS.
  • Btrfs is not outstanding in any discipline.
    To be used only under special circumstances - e.g. when wanting to use its raid0/1 functionality without having to use mdadm.
  • JFS seems to be OK.
  • The ext-filesystems are all very good.
    I wouldn't want to use ext2 (and 3?) on a big HDD as running fsck takes ages.
    Perhaps ext4?
  • ReiserFS is great at saving space on the device when dealing with small files.

Nilfs2 seems to still be alright - I will therefore continue to use it for my notebook's SSD.