nfs root part 1 – Vallard's Blog

In xCAT we’ve been doing ram-root solutions since 2005. We call it ‘stateless’. Since then a lot of other ‘stateless’ solutions have appeared that don’t necessarily match our definition. Fair enough. You can call it what you like. Red Hat, for example, uses ‘stateless’ to mean an NFS root solution.

We have ignored NFS root for a while because ram-root seemed to be the best for HPC applications. The tradeoffs between NFS root and RAM root are the following:

NFS root = more network traffic

RAM root = less RAM left for your applications.

RAM root has been very good to us because these days, with Nehalem processors, we normally see servers with 24GB of RAM. Using 300MB at the most (with a bloated InfiniBand stack in it) is a bit over 1% of that, which doesn’t seem that bad.

However, if we consider a hypervisor running a slew of virtual machines, having a bunch of copies of the same thing in memory doesn’t make much sense, especially since the files on an NFS root will be read only anyway. NFS root also allows us to lock down the machine in a way that RAM root doesn’t. But that creates some complexity: with NFS root you need to say which files are read only and which ones have to be writable.
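One common way to handle the read-only versus writable split is to mount small RAM-backed filesystems over the few paths that need writes. The following is only a sketch of that idea, not how xCAT does it: `WRITABLE_PATHS`, `ROOT` and `DRY_RUN` are hypothetical names, and the list of paths is a guess that would differ per distribution.

```shell
#!/bin/sh
# Hypothetical list of paths that must stay writable on a read-only NFS root.
WRITABLE_PATHS="${WRITABLE_PATHS:-/tmp /var/log /var/run}"
ROOT="${ROOT:-/sysroot}"      # where the NFS root is assumed to be mounted
DRY_RUN="${DRY_RUN:-1}"       # 1 = print the commands instead of running them

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount a tmpfs over each writable path. This hides whatever the NFS
# export has at that path, so a real script would likely copy the
# original contents into the tmpfs afterwards.
make_writable() {
    for p in $WRITABLE_PATHS; do
        run mount -t tmpfs tmpfs "$ROOT$p"
    done
}
```

Calling `make_writable` with the defaults prints three `mount` commands, one per path. Everything not listed stays on the read-only NFS export.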

xCAT will soon have an NFS root solution. We are examining other implementations to see what we can ‘steal’.

What I have learned is that you can’t just give any standard kernel the nfsroot=<path option>. The modules needed to do NFS mounts are not all built into the kernel, so the kernel dies. That means we have to give it a ramdisk. The secret sauce of xCAT will be in the ramdisk startup file. This is where we can scale, do random waits, and mount everything nicely. We should also put an xCAT client in the initial ramdisk so that the node can tell the server where it is in the boot process. Once you have it all up and going in the ramdisk, the last step is:

exec switch_root -c /dev/console /sysroot /sbin/init

The magic is everything that happens between the kernel loading the initrd and that last step.
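To make that in-between part a little more concrete, here is a rough sketch of what a ramdisk startup script could do: load the NFS module, wait a random amount of time so that a rack of nodes doesn’t hit the server at once, mount the root read only, and then switch_root. This is not the xCAT script. `NFS_SERVER`, `NFS_PATH`, `MAX_WAIT` and `DRY_RUN` are made-up names, the address and export path are placeholders, and the sketch assumes the network is already up and that `od` in the ramdisk accepts these options. The xCAT client call is left out entirely.

```shell
#!/bin/sh
# Sketch of the NFS-root part of an initrd startup script.
# All variable names and values here are hypothetical placeholders.
NFS_SERVER="${NFS_SERVER:-192.0.2.1}"
NFS_PATH="${NFS_PATH:-/export/rootimg}"
MAX_WAIT="${MAX_WAIT:-30}"    # upper bound for the random wait, in seconds
DRY_RUN="${DRY_RUN:-1}"       # 1 = print the commands instead of running them

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print a random integer in [0, $1), or 0 if the bound is not positive.
random_wait() {
    max="$1"
    if [ "$max" -le 0 ]; then
        echo 0
        return
    fi
    # Read two random bytes as one unsigned 16-bit number.
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')
    echo $(( n % max ))
}

boot_nfsroot() {
    run modprobe nfs                       # NFS client is assumed to be a module
    run sleep "$(random_wait "$MAX_WAIT")" # spread out the mount requests
    run mkdir -p /sysroot
    run mount -t nfs -o ro,nolock "$NFS_SERVER:$NFS_PATH" /sysroot
    # Hand control to the real init on the NFS root; does not return.
    run exec switch_root -c /dev/console /sysroot /sbin/init
}
```

With `DRY_RUN=1` (the default here), calling `boot_nfsroot` only prints the five commands, which is handy for checking the sequence on a normal machine. In a real ramdisk you would set `DRY_RUN=0` and call `boot_nfsroot` as the last thing the script does.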