VirtualBox dynamically allocated vs fixed size VDI drive performance benchmark
Is there really a difference in speed?
I’ve been working more and more with VirtualBox these days. One of the first dilemmas you face is whether to create a fixed size or a dynamically allocated virtual drive for storage. A fixed size drive wastes quite a lot of space on the host if that space is not actually in use in the virtual machine, but it’s supposed to be a tad faster. A dynamically allocated drive only takes up as much space as is actually in use in the virtual machine. The maximum size can be anything that fits on your hardware (maybe even bigger, I haven’t tested that). Take for example these two virtual drives:
50 GB fixed size => takes up 50 GB of space on your actual drive
50 GB dynamically allocated, of which 6 GB is effectively in use by a Linux OS => roughly 6 GB will be the size on your actual drive
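For reference, here is a minimal Python sketch of how the two example drives above could be created from the command line instead of the GUI. It only builds and prints the `VBoxManage createmedium` commands, it does not run them. The file names are hypothetical, and older VirtualBox releases use `createhd` instead of `createmedium`, so treat this as an assumption to check against your installed version.

```python
import shlex


def createmedium_cmd(filename: str, size_mb: int, fixed: bool) -> list[str]:
    """Build a VBoxManage command line that creates a VDI disk image."""
    # "Standard" is VirtualBox's name for a dynamically allocated image.
    variant = "Fixed" if fixed else "Standard"
    return [
        "VBoxManage", "createmedium", "disk",
        "--filename", filename,      # hypothetical file name
        "--size", str(size_mb),      # size in megabytes
        "--format", "VDI",
        "--variant", variant,
    ]


# 50 GB expressed in megabytes, matching the example drives in the text.
fixed_cmd = createmedium_cmd("fixed-50g.vdi", 50 * 1024, fixed=True)
dynamic_cmd = createmedium_cmd("dynamic-50g.vdi", 50 * 1024, fixed=False)

print(shlex.join(fixed_cmd))
print(shlex.join(dynamic_cmd))
```

The only difference between the two commands is the `--variant` value, which is the same choice the VirtualBox wizard offers when you create a new disk.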
On the internet you can find people stating that dynamically allocated virtual drives tend to be a bit slower than fixed size drives. I felt challenged to find out if that’s actually true! So I went ahead and ran some tests to see whether, in my case, there is a significant difference in benchmark speed.
The hardware and the benchmark
I’m using a Samsung 850 EVO 256 GB solid state drive as the hardware drive of my Antergos Linux operating system. The virtual machine will be run in that same Linux OS on that drive. I’ll also perform tests on a Hitachi Ultrastar A7K3000 2TB hard disk drive, which is used as a secondary data disk, so no operating system or virtual machine is running on it during the tests. Both drives are formatted with the BTRFS filesystem. The virtual drives use the standard VirtualBox VDI format.
The benchmarks are just simple tests performed using the gnome-disks tool in an Antergos Live CD session. The standard settings are used: 10 MB sample size, 100 samples for read and write, 1000 samples for access time.
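To give an idea of what such a throughput test boils down to, here is a rough Python sketch that writes and reads 10 MB samples and reports MB/s. This is not how gnome-disks works internally (it benchmarks the block device directly, while this sketch uses a temporary file), and the function names and the reduced sample count are my own assumptions for illustration.

```python
import os
import tempfile
import time

SAMPLE_BYTES = 10 * 1000 * 1000  # 10 MB per sample, as in the benchmark settings
NUM_SAMPLES = 5                  # the benchmark used 100; kept small for a quick run


def write_throughput(path: str, sample_bytes: int, samples: int) -> float:
    """Return the average write throughput in MB/s, syncing every sample."""
    payload = os.urandom(sample_bytes)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(samples):
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the drive
    elapsed = time.perf_counter() - start
    return sample_bytes * samples / 1_000_000 / elapsed


def read_throughput(path: str, sample_bytes: int) -> float:
    """Return the average read throughput in MB/s (may be served from cache)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(sample_bytes):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / 1_000_000 / elapsed


with tempfile.TemporaryDirectory() as tmp:
    test_file = os.path.join(tmp, "bench.bin")
    write_mbps = write_throughput(test_file, SAMPLE_BYTES, NUM_SAMPLES)
    read_mbps = read_throughput(test_file, SAMPLE_BYTES)

print(f"write: {write_mbps:.1f} MB/s, read: {read_mbps:.1f} MB/s")
```

Because the file was just written, the read pass will most likely come straight from the page cache and show numbers well above what the drive can do, which is the same caching effect that shows up in the results further down.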
In the virtual machine’s SATA controller settings, the host I/O cache option is switched on and off, and two different benchmarks are run to see the difference between both, because this setting has a big influence on performance due to caching.
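The same toggle can also be flipped from the command line with `VBoxManage storagectl`. The Python sketch below only builds and prints the two commands. The VM name `benchmark-vm` and the controller name `SATA` are placeholders I made up, so replace them with the names shown in your own VM settings.

```python
import shlex


def hostiocache_cmd(vm_name: str, controller: str, enabled: bool) -> list[str]:
    """Build a VBoxManage command that switches the host I/O cache on or off."""
    return [
        "VBoxManage", "storagectl", vm_name,
        "--name", controller,                        # storage controller name
        "--hostiocache", "on" if enabled else "off",
    ]


# Hypothetical VM and controller names, for illustration only.
cache_on_cmd = hostiocache_cmd("benchmark-vm", "SATA", enabled=True)
cache_off_cmd = hostiocache_cmd("benchmark-vm", "SATA", enabled=False)

print(shlex.join(cache_on_cmd))
print(shlex.join(cache_off_cmd))
```

The VM normally has to be powered off before a controller setting like this can be changed.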
Results on the primary SSD
This is the result, running the benchmark on my main SSD:
With host I/O cache disabled, the results are:
This is a big difference! When caching isn’t enabled, the fixed size drive performance drops dramatically. Please note that even if the results with caching enabled look more promising, they’re not always valid for a real world scenario, because caches have limits: file size limits, item count limits, etc. There can be many cases where I/O can’t hit the cache because the cache can’t handle that data for some reason. So realistically, the second set of results should be closer to the real world. Also note that VirtualBox VMs have read caching automatically enabled, which is why you still see very high throughput for the read benchmark of the dynamically allocated drive. I can’t immediately say why that’s not the case for the fixed size virtual drive.
But those numbers are still good, no? With an SSD, storage isn’t much of a bottleneck. Let’s quickly compare with a bare metal benchmark to get a better feel for the difference. Because this drive is my primary drive, where my host Antergos Linux is running, I can’t perform a write benchmark with the gnome-disks tool. Still, here is the read result: 556 MB/s. That’s roughly 50 MB/s higher on bare metal. It’s higher, but still a very acceptable decrease to get all the advantages of a VM, don’t you agree?
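To put that gap in perspective, here is the arithmetic as a tiny Python sketch. The 556 MB/s and the roughly 50 MB/s difference are the figures quoted above, and the VM read speed is simply derived from them, so it is an approximation rather than a separately measured value.

```python
bare_metal_mbps = 556.0  # bare metal read speed quoted above
gap_mbps = 50.0          # approximate difference quoted above

# Derived, not measured: the VM read speed implied by the two figures.
vm_mbps = bare_metal_mbps - gap_mbps
overhead_pct = gap_mbps / bare_metal_mbps * 100

print(f"VM read: ~{vm_mbps:.0f} MB/s, overhead: ~{overhead_pct:.0f}%")
# prints "VM read: ~506 MB/s, overhead: ~9%"
```

So the read penalty of running inside the VM comes down to about 9% on this SSD.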
Results on the secondary data HDD
SSDs are awesome, we all get that! But I wanted to see results for when the virtual drive is located on a hard disk drive. Note that no RAID or anything special is used, just plain “simple” BTRFS as a filesystem. If you know a bit about that filesystem and know that it’s a Copy On Write filesystem, you understand that the write performance is a bit slower than when the virtual drives are located on, for example, EXT4 or XFS host filesystems. So you could expect these results to be a bit higher on other filesystems.
Talking about HDDs now, here are the first results with host caching enabled:
Looking sexy, right?! We like big numbers!!!! Still, keep in mind that these speeds are practical in some cases, but not all!
Without host caching:
Seeing these results makes me realise that these numbers are beyond what my hard disk can physically deliver, so even without host caching enabled, the virtual machines must also have caching enabled in their own environment! That’s something I had overlooked before.
These results are interesting: the dynamically allocated drives outperform the fixed ones in these tests. But as said before, bear in mind that these are tests with small data (10 MB), and that the host or VM caching mechanism can kick in to improve read and write performance for these tests. If you have scenarios where many gigabytes of data are transferred in the VM, the speeds could be completely different, as the caching mechanisms can’t handle that much data due to RAM size limitations. That could be a reason to give your VMs as much RAM as possible, so be sure to test that to see the difference.
But still, none of these tests really prove that fixed size disks are the way to go for performance. So to avoid wasted space, I’d recommend considering dynamically allocated drives first.
You can give me feedback below in the comments.
I made use of this tool to make the charts, to make the results more appealing: