AWS EFS home directories

This page is old and some details are out of date. Use with caution.

Basics

Hosts which can currently access AWS EFS: nw-syd-vx1, nw-syd-vx2, vx2, vx3, vx4, vx5, vx6, vx7, vx8, zappa, williams, wagner, weber, weaver. (11nov2021)

  • Amazon’s EFS service provides an unlimited, NFS-accessible file storage service with no quotas.
  • Only NFS version 4 (4.1) is available.
  • When configured, an EFS service appears as a host (IP address) on a subnetwork.
  • The EFS “host” only responds to connections on TCP port 2049.
  • There is no separate mount daemon, no stat (lock) daemon and no rquota (quota) daemon.
  • There is no RPC service program mapper daemon (rpcbind) so running rpcinfo against the “host” will simply time out.
  • The EFS service has no access control list of its own, no support for netgroups, and accepts connections from all source ports, even unreserved ports.

    Thus, nominally, any arbitrary user process could connect to it and do any operations on the files and directories in the file system without restriction.

  • In practice, access is controlled instead by:
    1. An Amazon Security Group (“nw-syd-efs-access”) to limit which client hosts can connect to the EFS service based on their IP address, and
    2. iptables rules on each client host to prevent outgoing connections from that host to TCP port 2049 from non-root user processes (see -m owner in the iptables extensions). A sketch of such rules follows this list.
  • At time of writing there is one EFS service set up in AWS Sydney: nw-syd-efs1.
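For illustration, the client-side controls just described could be implemented with rules along the following lines (the port 111 REJECT rules are explained under General implementation below). This is a sketch only: the real rules are installed by the cfengine and conform products listed later on this page and may differ in detail.

# Drop outgoing NFS (TCP port 2049) connection attempts from non-root processes:
iptables -A OUTPUT -p tcp --dport 2049 -m owner ! --uid-owner 0 -j DROP
# Make RPC portmapper (port 111) attempts against the EFS host fail immediately:
iptables -A OUTPUT -d nw-syd-efs1 -p tcp --dport 111 -j REJECT
iptables -A OUTPUT -d nw-syd-efs1 -p udp --dport 111 -j REJECT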

General implementation

  • Rules are inserted in the netfilter (iptables) OUTPUT chain on all client hosts to:
    • DROP TCP connection attempts to port 2049 by non-root processes (see also above), and
    • REJECT any UDP or TCP connections to port 111 on the EFS server(s). Port 111 is the RPC service mapper (portmapper) port; as the EFS host runs no RPC services, these two rules ensure RPC attempts fail immediately rather than time out (see also man rpcinfo). This is particularly relevant when a CSE client host naively tries to look up disk quota information for a user.
  • As the EFS service backing storage is both homogeneous and invisible to mortals, the only real discriminator we can see is the single IP address of the EFS “host”. Thus, having separate file system IDs (1, 2, 3, A, etc.) is not terribly relevant in real terms. However, for consistency with CSE’s existing naming scheme we can create top-level directories in the single EFS file system with names such as “1”, “2”, etc., and our automounter configurations and support scripts can be coerced into making these appear on clients as /import/<efs-host>/1, /import/<efs-host>/2, etc. (see the example after this list).
  • Comparing mounts (examples of manual commands to mount on /mnt):
    • mount -t nfs -o vers=3,udp kamen:/export/kamen/1 /mnt (CSE NFS server from Linux client)
    • mount.nfs4 nw-syd-efs1:/1 /mnt -overs=4.1 (EFS server from Linux client)
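For illustration, the top-level per-file-system directories mentioned above could be created by mounting the root of the EFS file system and making directories there. This is a sketch only, assuming root access on a client host that the Security Group permits:

# mount.nfs4 nw-syd-efs1:/ /mnt -overs=4.1
# mkdir -p /mnt/1 /mnt/2
# umount /mnt

These directories then appear on clients as /import/nw-syd-efs1/1, /import/nw-syd-efs1/2, etc., via the automounter configurations described below.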

New World implementation (AWS, cfengine)

  • The New World home directory client implementation uses autofs, whereas the Old World implementation uses amd.
  • Instead of script-based heuristics keyed on the server name, /import network mounts are all static, with each possibility listed in /etc/auto.static.
  • The Old World used symbolic links from /home, /web, etc., to the appropriate paths in /import. In the New World these are separate network mounts whose mount parameters (server, type and path) are generated on the fly by the /usr/local/bin/autofs_home.sh script, using an LDAP query mapping from user name to home directory location. The script determines the server type by a heuristic and returns parameters accordingly (such as EFS mount parameters); a sketch of such a script follows the example output below.
  • The list of all home directory exports is maintained by cfengine in /etc/auto_static.cfg on each client host. This is used to [re]create /etc/auto.static (see above) each time the source file changes. See /etc/systemd/system/autofs.service.d/static.conf and /usr/local/bin/autofs_generate_static.sh on each client host. An illustrative sketch of the generated map appears at the end of this section.
  • See also autofs_mount_options.sh, a shell script fragment (the single piece of code shared by multiple other scripts) that uses heuristics to convert a server name into mount parameters.
  • Example /usr/local/bin/autofs_home.sh script output:
# /usr/local/bin/autofs_home.sh plinich
-fstype=nfs,vers=4 kamen:/export/kamen/1/plinich
# /usr/local/bin/autofs_home.sh plinich99
-fstype=nfs,vers=4.1 nw-syd-efs1:/1/plinich99
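For illustration, a script producing output like the above might look roughly as follows. This is a sketch only: the real /usr/local/bin/autofs_home.sh is maintained by cfengine, and the LDAP query, attribute name and home directory layout assumed below are illustrative rather than the actual implementation.

#!/bin/sh
# Sketch only: map a user name to autofs mount parameters for their home directory.
user="$1"

# Assumed LDAP lookup of the user's home directory location, e.g. /import/kamen/1/plinich
# (the real attribute name, LDAP server and search base are not shown on this page).
homedir=$(ldapsearch -x -LLL "(uid=$user)" homeDirectory | awk '/^homeDirectory:/ { print $2 }')

# Assumed layout: /import/<server>/<fsid>/<user>
server=$(echo "$homedir" | cut -d/ -f3)
fsid=$(echo "$homedir" | cut -d/ -f4)

case "$server" in
    *-efs*)
        # AWS EFS: NFSv4.1 only; exported paths start at the top-level directory.
        echo "-fstype=nfs,vers=4.1 ${server}:/${fsid}/${user}"
        ;;
    *)
        # CSE NFS server: conventional /export/<server>/<fsid> layout.
        echo "-fstype=nfs,vers=4 ${server}:/export/${server}/${fsid}/${user}"
        ;;
esac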
  • We can access the whole EFS file system just by mounting /. E.g.:
# mount.nfs4 nw-syd-efs1:/ /mnt -overs=4.1
  • On the cfengine server (cfengine hub) see:
    • /usr/local/warehouse/iptables.v.all/
    • /usr/local/warehouse/autofsconf.v.1/
    • /usr/local/warehouse/automounter.v.all/
    • /var/lib/cfengine3/masterfiles/iptables.inc
    • /var/lib/cfengine3/masterfiles/automounterconf.inc
  • Side note: on all Linux NFS clients, kernel NFS functionality is provided by kernel modules which are loaded when first required. NFSv3 functionality (as required for the K17 NFS servers) and NFSv4 functionality (AWS EFS servers) are implemented in separate modules. During testing of EFS a kernel upgrade was rolled out to K17 hosts, after which NFSv4 mounts would not work while NFSv3 mounts would. The hosts had not yet been rebooted, and the NFSv4 module for the [old] running kernel had been removed with the upgrade, so it could not be loaded when an NFSv4 (EFS) mount was attempted. The [old] NFSv3 module had already been loaded before the new kernel was rolled out, so NFSv3 mounts were unaffected. A reboot into the new kernel fixed the problem and NFSv4 mounts started working.
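For illustration, the generated /etc/auto.static could contain entries along the following lines, assuming the map is attached to /import in the automounter configuration and uses sun-format multi-mount entries keyed by server name. The real format and options are determined by /etc/auto_static.cfg and autofs_generate_static.sh, neither of which is reproduced here.

# Sketch only -- illustrative /etc/auto.static entries.
kamen        /1 -fstype=nfs,vers=4   kamen:/export/kamen/1 \
             /2 -fstype=nfs,vers=4   kamen:/export/kamen/2
nw-syd-efs1  /1 -fstype=nfs,vers=4.1 nw-syd-efs1:/1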

Old World implementation (K17, conform)

  • Read the first few points of New World implementation (above) for background.
  • /etc/init.d/iptables_rules_for_aws_efs installs the netfilter rules mentioned above.
  • In CSE/K17-land, /import mounts are dynamic rather than static (as they are in AWS). The mount parameters are generated on the fly by /usr/sbin/amd_import.sh on each client host when required; a sketch of such a script appears at the end of this section.
  • Mount parameters for /home, /web, et al. are generated on the fly by the /usr/sbin/amd_home binary. The only true mounts are those for /import paths; the paths mentioned in this bullet point are created as symlinks into /import.
  • Example script outputs:
# /usr/sbin/amd_home plinich
type:=link;fs:=/import/kamen/1/plinich
# /usr/sbin/amd_home plinich99
type:=link;fs:=/import/nw-syd-efs1/1/plinich99
# /usr/sbin/amd_import.sh nw-syd-efs1/1
type:=program;mount="/sbin/mount.nfs4 mount.nfs4 nw-syd-efs1:/1 ${fs} -overs=4.1"

Note the explicit mount command used with the EFS host: amd only supports NFSv2 and NFSv3, while EFS requires NFSv4.

  • conform product: Local/mount.v.3
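For illustration, the heuristic in /usr/sbin/amd_import.sh (and the shared autofs_mount_options.sh fragment mentioned under the New World implementation) might look roughly like the following sketch. Only the EFS case is taken from the example output above; the parameters for the ordinary CSE NFS server case are assumptions.

#!/bin/sh
# Sketch only: convert an /import key of the form <server>/<fsid>
# (e.g. "kamen/1" or "nw-syd-efs1/1") into amd mount parameters.
key="$1"
server=${key%%/*}
fsid=${key#*/}

case "$server" in
    *-efs*)
        # amd only speaks NFSv2/v3, so the NFSv4-only EFS service is mounted
        # via an explicit program mount using mount.nfs4 (see the note above).
        echo "type:=program;mount=\"/sbin/mount.nfs4 mount.nfs4 ${server}:/${fsid} \${fs} -overs=4.1\""
        ;;
    *)
        # Ordinary CSE NFS server: a plain amd NFS mount (parameters assumed).
        echo "type:=nfs;rhost:=${server};rfs:=/export/${server}/${fsid}"
        ;;
esac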