Reporting correct space usage for TrueNAS Samba shares
TrueNAS is a great and mature piece of software. By leveraging the ZFS filesystem and providing access to the data over various protocols along with a simple, easy-to-use web interface, it has become the de facto standard for self-built SOHO NAS systems. By default, the Windows shares exposed by Samba on TrueNAS show a correct but unintuitive space estimate: shares report different volume sizes when mounted on a client, even if they belong to the same pool.
This is because in ZFS the pool which holds all data is usually split into multiple filesystems, and ZFS behaves a bit differently from normal filesystems when it comes to calculating free space. ZFS does not directly expose the total size of a filesystem. Instead it provides the space used by every object inside the specified filesystem as well as the remaining free space, which is the same for all filesystems on a single pool unless quotas are applied. When Samba asks the filesystem for a usage report, it calculates the total volume size by adding up those two numbers and reports the result back to the client.
The benefit of this approach is that an empty volume will always show as empty on the client, which makes the reporting more accurate. The downside is that volumes shrink in size for no apparent reason when data is added to other filesystems on the same pool. Usually the expected behaviour is for all shares on the same pool to show the same usage and total size, and for the total size to stay the same unless the pool configuration is changed.
Digging through the configuration manual for Samba I stumbled across the dfree command option. This option can be used to replace the internal routine for calculating the total disk space with an external command. The program is invoked with a path inside the filesystem being queried and should output two integers, which represent the total disk space and the remaining free disk space for the given path in blocks. The default block size is 1024 bytes, but the script can optionally return the block size it used as a third value. A quick test showed that the option works on TrueNAS and directly changes the values returned to a client.
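To make the output format concrete, here is a minimal sketch of a stand-in for such a quick test. It ignores its argument and prints fixed, made-up numbers; the function name dfree_stub and the values are assumptions chosen purely for illustration.

```shell
#!/bin/sh
# Stand-in dfree script: ignores the path Samba passes in and
# always reports the same made-up values.
dfree_stub() {
    # Format: <total blocks> <free blocks> [<block size in bytes>]
    # 1048576 blocks of 1024 bytes = 1 GiB total,
    #  524288 blocks of 1024 bytes = 512 MiB free.
    echo "1048576 524288 1024"
}

dfree_stub "$@"
```

If a script like this were configured as the dfree command, every share should show up on the client as a 1 GiB volume that is half full.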
To use this for reporting the pool-wide values, we need a program that finds the name of the pool containing the passed directory, reads the pool's used and available values, converts both into blocks and echoes them to standard output. I opted to write a short shell script for the task and ended up with the following.
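#!/bin/sh
# Samba "dfree command" helper: report pool-wide total and free space.
ZFS=/usr/local/sbin/zfs
# Resolve the path passed by Samba to an absolute path.
CUR_PATH=$(realpath "$1")
# The dataset name up to the first slash is the pool name.
POOL=$("$ZFS" get -o name -Hp used "$CUR_PATH" | sed 's|/.*||')
# With -p zfs prints exact byte counts; convert them to 1024-byte blocks.
USED=$(( $("$ZFS" get -o value -Hp used "$POOL") / 1024 ))
AVAIL=$(( $("$ZFS" get -o value -Hp available "$POOL") / 1024 ))
TOTAL=$(( USED + AVAIL ))
echo "$TOTAL $AVAIL 1024"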
The script first issues a zfs command to get the name of the filesystem at the path provided by Samba and strips away everything after the first slash, which leaves the pool name. This is needed because filesystems can be mounted at any path, so we can't derive the pool from the provided path itself. Once it knows the pool, it queries zfs again for the pool's used and available values, which zfs returns in bytes, divides both by the block size and adds them up to get the total pool size. Note that unlike the values shown by zpool list, these do not include parity data, which you usually don't want in the reported size. As a last step the total, the available space and the block size are returned to Samba. Because the TrueNAS zfs command is not on the PATH that Samba passes to the script, I had to specify its full path.
Now we need to copy the script to our TrueNAS instance and mark it as executable. The location doesn't really matter as long as executables are allowed to run from it; I chose /usr/local/bin/dfree.
chmod +x /usr/local/bin/dfree
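Before wiring the script into Samba, it can help to check the pool-name extraction on its own. The sketch below runs the same sed expression against two hypothetical dataset names (tank/media/movies and backup are made up, as is the helper name pool_of) and needs no ZFS installation.

```shell
#!/bin/sh
# pool_of: reduce a ZFS dataset name to the name of its pool.
pool_of() {
    # Drop everything from the first slash onward; a bare pool
    # name contains no slash and passes through unchanged.
    printf '%s\n' "$1" | sed 's|/.*||'
}

pool_of tank/media/movies   # prints: tank
pool_of backup              # prints: backup
```

On the NAS itself, running the real script by hand with a directory inside one of your shares as its only argument should print three numbers: total blocks, available blocks and the block size.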
To enable the script, open your TrueNAS web UI, navigate to Services/SMB, switch on advanced mode and add the following lines to the Auxiliary Parameters section. I also added the dfree cache time option, which caches the returned values for the given number of seconds to prevent spawning tons of processes when the size is queried by a lot of clients.
dfree command = /usr/local/bin/dfree
dfree cache time = 60
Save the changes and restart the Samba server. Once you open a client, you should see all shares report the used and total size of their underlying pool, regardless of the content inside them.
Because we dynamically detect the parent pool, this even works correctly if the shares are on different pools, as is the case in this example.