I recently moved my files to a new ZFS pool and took the chance to properly configure my datasets.
This led me to discover ZFS deduplication.
Most of my storage is used by my Jellyfin library (~7-8 TB), which is mostly uncompressed Blu-ray rips, so I thought I might be able to save some storage by using deduplication in addition to compression.
Has anyone here used that for similar files before? What was your experience with it?
I am not too worried about performance. The dataset in question rarely changes - basically only when I add more media every couple of months. I also overshot my CPU target when originally configuring my server, so there is plenty of headroom there. I have 32 GB of RAM, which is not fully utilized either (and I would not mind upgrading to 64 GB too much).
My main concern is whether it is actually useful. I suspect that, just because of the amount of data and the similarity in file type, there would statistically be a fair amount of block-level duplication, but I could not find any real-world data or experiences on that.
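From what I have read, dedup savings can at least be estimated before enabling anything, e.g. with a zdb dry run - something like this (the pool name tank is just a placeholder for mine):

# Simulate deduplication on the existing pool and print a DDT
# histogram plus an estimated dedup/compress ratio, without
# actually enabling dedup on any dataset.
zdb -S tank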
Just adjust it if you actually need the RAM and it isn’t relinquishing quickly enough.
Put

options zfs zfs_arc_max=17179869184

in /etc/modprobe.d/zfs.conf, run update-initramfs -u, then reboot - this will limit ZFS ARC to 16 GiB. Run arc_summary to see what it's using now.

As for using a simple fs on LVM, do you not care about data integrity?
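If you want to try the limit without rebooting first, on most OpenZFS-on-Linux setups the module parameter can also be changed at runtime (same 16 GiB value as above; the ARC may take a little while to actually shrink):

# Apply the 16 GiB ARC cap immediately; this does not persist
# across reboots, so keep the modprobe.d entry as well.
echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max

Then check arc_summary again to confirm the new maximum.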