I have avoided using NFSv4 for a long time, but ignoring a problem does not make it go away. Let's see how to survive with NFSv4 on Linux, using CentOS 8 as an example.
NFSv3 does not support extended ACLs; it controls file access using the POSIX UID/GID scheme. This creates a problem on NFS clients: local users with mismatched UIDs cannot access files on an NFS share. Worse, it opens a security hole between such users, since a user with a given UID on one client can access files owned by a different user who happens to have the same UID elsewhere. In contrast, NFSv4 supports string-based extended ACLs. A file is owned by USER@DOMAIN (similar to Windows) instead of a UID/GID combination.
It follows from the changes above that the user now has to prove being the USER@DOMAIN they claim to be. NFSv4 implements a Kerberos-based authentication scheme for this, which makes it a natural fit for Active Directory. It is possible to make v4 work without Kerberos, at the cost of losing this functionality.
Another strange ability has been added to solve a nonexistent problem: the pseudo-root filesystem. This is best described with examples, which follow below.
Stateful version 4 is marketed as an advantage over stateless version 3. This notion looks ridiculous in the age of stateless microservices, which have proven their scalability and redundancy.
Fewer processes and fewer open ports are definitely a big plus for v4. There is no need for rpcbind and the rest of the family, and the new protocol is much more firewall-friendly.
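Once the v4-only server described below is configured, this is easy to verify: only TCP port 2049 should be listening for NFS. A quick check (a sketch; the exact output will vary):

nfs-server:~ # ss -tln | grep 2049
LISTEN 0      64           0.0.0.0:2049        0.0.0.0:*
LISTEN 0      64              [::]:2049           [::]:*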
Since this POC uses CentOS 8, I took Red Hat solution 3320581 as a starting point to set up an NFSv4-only server, so as not to mix its capabilities with other NFS versions. Install the nfs-utils RPM and edit /etc/nfs.conf to explicitly enable all v4 options and disable the others:
[nfsd]
vers2=n
vers3=n
vers4=y
vers4.0=y
vers4.1=y
vers4.2=y
The rpcbind family of helper services is not needed in v4, so it should be disabled. As a side effect, this makes NFSv3 client mounts impossible from that server, but in general a server is usually not a client.
nfs-server:~ # systemctl mask --now rpc-statd.service rpcbind.service rpcbind.socket
nfs-server:~ # systemctl enable --now nfs-server
Add something to exports:
nfs-server:~ # mkdir -p /export/project1/work
nfs-server:~ # cat /etc/exports
/export/project1 -sec=sys,no_root_squash,sync 192.168.120.0/24(ro)
nfs-server:~ # exportfs -a
nfs-server:~ # exportfs
/export/project1      192.168.120.0/24
nfs-server:~ # systemctl disable --now firewalld.service
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Alternatively, instead of disabling the firewall, you can allow the NFS service on both the server and client side, for example:
# firewall-cmd --list-all
# firewall-cmd --add-service=nfs --permanent
# firewall-cmd --reload
On the client, install the same nfs-utils package and repeat the same /etc/nfs.conf configuration. Almost the same services need to be disabled/enabled:
nfs-client1:~ # yum install -y nfs-utils
nfs-client1:~ # vi /etc/nfs.conf
nfs-client1:~ # systemctl mask --now rpc-statd.service rpcbind.service rpcbind.socket
Created symlink /etc/systemd/system/rpc-statd.service → /dev/null.
Created symlink /etc/systemd/system/rpcbind.service → /dev/null.
Created symlink /etc/systemd/system/rpcbind.socket → /dev/null.
nfs-client1:~ # systemctl enable --now nfs-client.target
nfs-client1:~ # systemctl disable --now firewalld.service
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
nfs-client1:~ # mount -t nfs4 nfs-server:/export/project1 /mnt
nfs-client1:~ # mount
..
nfs-server:/export/project1 on /mnt type nfs4 (rw,relatime,vers=4.2,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.120.209,local_lock=none,addr=192.168.120.241)
NOTE: The handy "showmount" tool will not work here because it relies on the disabled RPC services.
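If you need to see what is mountable, one workaround is to mount the server's root and look around (a sketch, assuming the kernel server builds the v4 pseudo-root automatically, as it does on CentOS 8):

nfs-client1:~ # mkdir /tmp/probe
nfs-client1:~ # mount -t nfs4 nfs-server:/ /tmp/probe
nfs-client1:~ # ls -R /tmp/probe
nfs-client1:~ # umount /tmp/probe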
Now unmount the client, unexport everything, and switch to a pseudo-root layout:
nfs-client1:~ # umount /mnt

nfs-server:~ # exportfs -ua
nfs-server:~ # cat /etc/exports
/export/project1 -sec=sys,no_root_squash,sync,fsid=root 192.168.120.0/24(ro)
/export/project1/work -sec=sys,no_root_squash,sync nfs-client1(rw)
nfs-server:~ # exportfs -a
nfs-server:~ # exportfs
/export/project1/work      nfs-client1
/export/project1           192.168.120.0/24

nfs-client1:~ # mount -t nfs4 nfs-server:/ /mnt
nfs-client1:~ # df /mnt
Filesystem    Size  Used Avail Use% Mounted on
nfs-server:/  2.9G  1.2G  1.6G  42% /mnt
nfs-client1:~ # ll /mnt/
total 4
drwxr-xr-x. 2 root root 4096 Dec 29 09:50 work
What have we achieved here? The client is unaware of the server's filesystem structure and simply mounts /. This is probably safer; for example, an export path like "/vol/my_volume" would hint that the remote system is a NetApp. Let's play with this feature some more:
nfs-client1:~ # umount /mnt
nfs-client1:~ # mount -t nfs4 nfs-server:/work /mnt
nfs-client1:~ # ll /mnt
total 0
nfs-client1:~ # df /mnt
Filesystem        Size  Used Avail Use% Mounted on
nfs-server:/work  2.9G  1.2G  1.6G  42% /mnt
nfs-client1:~ # touch /mnt/client1.file
touch: cannot touch '/mnt/client1.file': Read-only file system
The last command demonstrates that the second line in /etc/exports did not work. After a few experiments, I discovered that the /etc/exports syntax does not always behave as the manual suggests. The actual configuration should always be verified with the exportfs -v command. For example:
nfs-server:~ # exportfs -v
/export/project1/work      nfs-client1(sync,wdelay,hide,no_subtree_check,sec=sys,ro,secure,no_root_squash,no_all_squash)
/export/project1           192.168.120.0/24(sync,wdelay,hide,no_subtree_check,fsid=0,sec=sys,ro,secure,no_root_squash,no_all_squash)
As you can see, nfs-client1 got ro permissions even though it was explicitly set to rw. The working setup looks like this:
nfs-server:~ # cat /etc/exports
/export/project1 -sec=sys,no_root_squash,sync,fsid=root 192.168.120.0/24(ro)
/export/project1/work -sec=sys,no_root_squash,sync,rw nfs-client1
Let's go back to the pseudo-root filesystem. Once /export/project1 has been set as the root fsid, that path is subtracted from the rest of the exports. The client cannot mount /export/project1/work directly, only /work.
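To illustrate (the exact error text may vary between nfs-utils versions), mounting the full path now fails while the pseudo path succeeds:

nfs-client1:~ # mount -t nfs4 nfs-server:/export/project1/work /mnt
mount.nfs4: mounting nfs-server:/export/project1/work failed, reason given by server: No such file or directory
nfs-client1:~ # mount -t nfs4 nfs-server:/work /mnt
nfs-client1:~ # umount /mnt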
Another bizarre behavior: if the exported "work" is a plain directory inside "/export/project1", the parent's rule is applied to it and the explicitly specified child rule is ignored. You have to mount a separate filesystem at this directory for the child rule to take effect. That is why many examples found on the Internet use the pseudo "mount -o bind" trick, as sketched below.
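For reference, the bind-mount approach looks roughly like this (a sketch; /srv/work1 is a hypothetical path holding the real data):

nfs-server:~ # mkdir -p /export/project1/work
nfs-server:~ # mount -o bind /srv/work1 /export/project1/work
nfs-server:~ # echo '/srv/work1 /export/project1/work none bind 0 0' >> /etc/fstab

Instead of bind mounts, I built the following structure of dedicated filesystems on the NFS server for my tests: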
nfs-server:~ # df | grep export
/dev/mapper/rootvg-export     976M  2.6M  907M   1% /export
/dev/mapper/rootvg-project1   976M  2.6M  907M   1% /export/project1
/dev/mapper/rootvg-work1      976M  2.6M  907M   1% /export/project1/work
/dev/mapper/rootvg-project2   976M  2.6M  907M   1% /export/project2
/dev/mapper/rootvg-work2      976M  2.6M  907M   1% /export/project2/work
And in the end, everything worked as intended:
nfs-client1:~ # mount -t nfs4 nfs-server:/ /mnt
nfs-client1:~ # mount -t nfs4 nfs-server:/work /mnt/work
nfs-client1:~ # df
..
nfs-server:/      976M  2.5M  907M   1% /mnt
nfs-server:/work  976M  2.5M  907M   1% /mnt/work
nfs-client1:~ # touch /mnt/client1.file
touch: cannot touch '/mnt/client1.file': Read-only file system
nfs-client1:~ # touch /mnt/work/client1.file
nfs-client1:~ # ls -l /mnt/work/client1.file
-rw-r--r--. 1 root root 0 Dec 29 14:37 /mnt/work/client1.file
nfs-client1:~ # umount /mnt/work
nfs-client1:~ # umount /mnt
Let's check out another use of this pseudo-root feature. Imagine that one server family belongs to project1 and another server family belongs to project2. The exports file looks like this:
nfs-server:~ # cat /etc/exports
/export/project1 -sec=sys,no_root_squash,sync,fsid=root,ro nfs-client1
/export/project1/work -sec=sys,no_root_squash,sync,rw nfs-client1
/export/project2 -sec=sys,no_root_squash,sync,fsid=root,ro nfs-client2
/export/project2/work -sec=sys,no_root_squash,sync,rw nfs-client2
There are two NFS exports marked as the root fsid. This is fine as long as they are exported to different clients. Let's check on both clients:
nfs-client1:~ # mount -t nfs4 nfs-server:/ /mnt
nfs-client1:~ # mount -t nfs4 nfs-server:/work /mnt/work
nfs-client1:~ # ll /mnt/work/
total 16
-rw-r--r--. 1 root root     0 Dec 29 14:37 client1.file
drwx------. 2 root root 16384 Dec 29 11:50 lost+found

nfs-client2:~ # mount -t nfs4 nfs-server:/ /mnt
nfs-client2:~ # mount -t nfs4 nfs-server:/work /mnt/work
nfs-client2:~ # ll /mnt/work/
total 16
drwx------. 2 root root 16384 Dec 29 11:50 lost+found
Oh! It might finally be useful. You can manage your production and test environments the same way, keeping the differences only on the NFS server side.
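For example, every client could carry identical mount entries in /etc/fstab, no matter which project it belongs to (a sketch):

nfs-client1:~ # grep nfs-server /etc/fstab
nfs-server:/      /mnt       nfs4  defaults  0 0
nfs-server:/work  /mnt/work  nfs4  defaults  0 0

The same two lines on nfs-client2 would transparently deliver the project2 data instead.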
I found some examples of how to configure NFSv4 to work with Kerberos. The most solid explanations come from Microsoft. It looks as if NFSv4 was sponsored by Microsoft to bring the UNIX universe closer.
As for the bottom line: if you want to use file ownership in the USER@DOMAIN form, as designed, you need to implement sec=krb5. You have to create service accounts in AD for both the NFS server and the client, generate keytabs for them, and deploy them accordingly. Only then can both of them talk to the KDC to verify user information.
This looks very useful when one of the NFS servers or clients is a Microsoft Windows server. But this is overkill for NFS services between two Linux servers.
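I did not build this end to end, but the configuration would take roughly the following shape (a sketch, assuming keytabs are already deployed on both machines; on CentOS 8 the server side relies on gssproxy, while the client needs rpc-gssd):

nfs-server:~ # cat /etc/exports
/export/project1 -sec=krb5,sync,fsid=root nfs-client1
nfs-server:~ # exportfs -ra

nfs-client1:~ # systemctl enable --now rpc-gssd
nfs-client1:~ # mount -t nfs4 -o sec=krb5 nfs-server:/ /mnt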
The sec=sys option, used throughout this article, returns the ability to use the UID/GID pair between the NFS server and client.
Let's investigate how it works:
nfs-server:~ # useradd -g users -u 1000 user1000
nfs-server:~ # useradd -g users -u 1001 user1001
However, on both clients, we will mix users up:
nfs-client1:~ # useradd -g users -u 1000 bob
nfs-client1:~ # useradd -g users -u 1001 alice

nfs-client2:~ # useradd -g users -u 1000 alice
nfs-client2:~ # useradd -g users -u 1001 bob
For this experiment we don't need two sets of exports; one set will be shared by both clients:
nfs-server:~ # exportfs -ua
nfs-server:~ # cat /etc/exports
/export/project1 -sec=sys,no_root_squash,sync,fsid=root,ro nfs-client1 nfs-client2
/export/project1/work -sec=sys,no_root_squash,sync,rw nfs-client1 nfs-client2
nfs-server:~ # exportfs -a
nfs-server:~ # exportfs -v
/export/project1           nfs-client1(sync,wdelay,hide,no_subtree_check,fsid=0,sec=sys,ro,secure,no_root_squash,no_all_squash)
/export/project1           nfs-client2(sync,wdelay,hide,no_subtree_check,fsid=0,sec=sys,ro,secure,no_root_squash,no_all_squash)
/export/project1/work      nfs-client1(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
/export/project1/work      nfs-client2(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
Finally, let's mount on both clients and run some tests:
nfs-client2:~ # mount -t nfs4 nfs-server:/ /mnt
nfs-client2:~ # mount -t nfs4 nfs-server:/work /mnt/work
nfs-client2:~ # ll /mnt/work/client1.file
-rw-r--r--. 1 root root 0 Dec 29 14:37 /mnt/work/client1.file

root@nfs-client1:~ # chown bob:users /mnt/work/client1.file
root@nfs-client1:~ # ll /mnt/work/client1.file
-rw-r--r--. 1 bob users 0 Dec 29 14:37 /mnt/work/client1.file

root@nfs-client2:~ # ll /mnt/work/client1.file
-rw-r--r--. 1 alice users 0 Dec 29 14:37 /mnt/work/client1.file

root@nfs-server:~ # ll /export/project1/work/client1.file
-rw-r--r--. 1 user1000 users 0 Dec 29 14:37 /export/project1/work/client1.file
As expected, switching back to the old UID/GID scheme brings back the same old issues: the UID and GID must be identical for all actors in the NFS play. And of course, there are no extended ACLs in this scheme. This is probably a limitation of the built-in Linux server, because a quick search shows that the NetApp and Ganesha NFS servers support ACLs with sec=sys. However, I have not tested this yet.
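For completeness, on servers that do support v4 ACLs, they are managed from the client with the nfs4-acl-tools package (untested in this setup; bob@example.com is a placeholder principal):

nfs-client1:~ # yum install -y nfs4-acl-tools
nfs-client1:~ # nfs4_getfacl /mnt/work/client1.file
nfs-client1:~ # nfs4_setfacl -a A::bob@example.com:rw /mnt/work/client1.file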