New World Backup System

From techdocs
Revision as of 14:38, 5 February 2024 by Zain (talk | contribs)

The NW (New World) Backup System is built around rsnapshot(1):

  • rsnapshot is a remote filesystem snapshot utility that uses rsync(1).
  • rsync is a fast and versatile remote (and local) file and directory copying tool that:
    • Handles local system snapshots directly, and
    • Handles remote system snapshots using ssh(1).

The CSE backup system is run on the New World AWS machine: nw-syd-backup1.

The scripts, logs, and archives associated with NW backups are generally found in the directory nw-syd-backup1:/export/nw-syd-backup1/1/backups/, which will be referred to as ~backups/

Overview

The CSE backup system uses rsnapshot(1) as its underlying technology.

  1. A separate rsnapshot archive is kept of each CSE directory belonging to every CSE user.
    These archives are accessible via the directory:
    $BU_USERS/$LASTTWO/$user[.n]/
    where:
    • $user is the name of the CSE user whose directory is stored in the archive.
    • $LASTTWO is the last two characters of the user's CSE username,
    • [.n] is a unique numeric suffix assigned to each of the user's different CSE directories (if the user has more than one).
  2. The copy (or snapshot) of the user's directory that was made N days ago is found in the user's archive directory $user[.n]/, under the directory named 'daily.N/'.
    At most 30 daily snapshots of the user's directory are stored in each archive, one for each day of age.
    These 'daily.N/' directories are renamed (ie: shifted up) each day as they age, until finally the copy made 30 days ago (named 'daily.30/') is deleted.
  3. The path: 'basename/export/hostname/partition/basename/' is inserted by rsnapshot between the directory 'daily.N/' and the actual snapshot of the user's directory, and is used to uniquely identify the archive's origin.
    'basename' will usually be the same as the username.
    The file $user[.n]/.bu_source is also used to identify the archive's origin.
  4. The same access (ie: ownership and permissions) is assigned to each user's backup archive as is assigned to their original CSE directory.
  5. Once a user's CSE account has expired, and/or the user's CSE directory has been removed from its host server, the archive of that directory is moved aside.
    However, the daily.N/ directories will continue to be shifted up daily until they have all been removed.
    If the user's CSE directory is restored to the CSE host server before all 'daily.N/' dirs have been removed, then the remaining daily.[N..M]/ directories will be renamed daily.[0..M-N]/, and made available to the user once more.
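
The naming rules in points 1, 2 and 5 above can be sketched in shell. This is only an illustration, not code taken from the backup script: the function names are hypothetical, and the default value of BU_USERS is an assumption based on the ~backups/users/ location used elsewhere on this page.

```bash
# Assumed default: ~backups/users/ with ~backups/ expanded (an assumption, not read from the script).
BU_USERS="${BU_USERS:-/export/nw-syd-backup1/1/backups/users}"

# $LASTTWO: the last two characters of a username.
# ${user%??} drops the last two characters; removing that prefix leaves them.
lasttwo() {
    local user=$1
    printf '%s\n' "${user#"${user%??}"}"
}

# Access path of a user's archive; the optional second argument is the suffix n.
archive_dir() {
    local user=$1 n=${2:-}
    printf '%s/%s/%s%s/\n' "$BU_USERS" "$(lasttwo "$user")" "$user" "${n:+.$n}"
}

# Path of the snapshot made DAYS days ago inside that archive.
snapshot_dir() {
    local user=$1 days=$2 n=${3:-}
    printf '%sdaily.%s/\n' "$(archive_dir "$user" "$n")" "$days"
}

# Point 5: list how the remaining daily.[N..M]/ dirs would be renamed to daily.[0..M-N]/.
renumber_plan() {
    local first=$1 last=$2
    local i=$first
    while [ "$i" -le "$last" ]; do
        printf 'daily.%d -> daily.%d\n' "$i" "$((i - first))"
        i=$((i + 1))
    done
}

archive_dir zain
snapshot_dir zain 3 1
renumber_plan 4 6
```

With these definitions, snapshot_dir zain 3 1 prints the daily.3/ path inside the zain.1 archive, and renumber_plan 4 6 lists daily.4, daily.5 and daily.6 becoming daily.0, daily.1 and daily.2. Nothing is renamed on disk.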

Locations

CSE's backup server
nw-syd-backup1@cse.unsw.edu.au
~backups/users/$LASTTWO/$user[.n]/
This is the location of $user's rsnapshot archives, such that
  • $user is the name of the CSE user whose directory is stored in the archive.
  • $LASTTWO is the last two characters of the user's CSE username,
  • If the user has more than one CSE directory being backed up, then [.n] is a unique numeric suffix assigned to all but one of the user's CSE directories.
The archive without the numeric suffix should always be the archive that stores the user's main home directory.
~backups/users/$LASTTWO/$user[.n]/.bu_source
identifies the source of the specific user's directory stored in that rsnapshot archive.
~backups/users/$LASTTWO/.deleted/$user[.n]/
If $user's source directory, as identified by the file: ./$user[.n]/.bu_source, no longer exists on the source filesystem, then the user's archive is moved into this .deleted/ directory.
~backups/disks/$mountpoint/$LASTTWO/$user[.n]/
The actual storage location of the backup archives.
There may be many different $mountpoints on which different disks may be mounted.
The directory: ~backups/users/$LASTTWO/$user[.n]/ is a symlink to the directory in ~backups/disks/$mountpoint/$LASTTWO/$user[.n]/ that has been allocated to this user's backup archive.
~backups/var/
The location of various files containing (user/dir) data.
These files group (user/dir) data into different classifications which are used to determine which (user/dir)s are to be snapshotted, and how their archives are to be managed.
~backups/lib/
The location of most of the configuration files used by the NW backup system.
~backups/bin/
The location of most of the scripts used by the backup system, including the main backup script itself (described in the next section).
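
As an illustration of the users/ and disks/ layout described above, the following sketch resolves an archive's access path to its storage directory and reports whether an archive is active or has been moved into .deleted/. The function names are hypothetical, and the sketch assumes a readlink that supports -f (as GNU coreutils does); it is not part of the backup script.

```bash
# Resolve an archive's access path (a symlink under users/) to its
# storage directory under disks/$mountpoint/.
storage_dir() {
    # Strip any trailing slash so that the link itself is resolved.
    readlink -f -- "${1%/}"
}

# Report where a user's archive currently lives under a users/ directory:
# "active" (users/$LASTTWO/$user), "deleted" (users/$LASTTWO/.deleted/$user) or "none".
archive_state() {
    local users=$1 user=$2
    local lasttwo=${user#"${user%??}"}
    if [ -e "$users/$lasttwo/$user" ]; then
        echo active
    elif [ -e "$users/$lasttwo/.deleted/$user" ]; then
        echo deleted
    else
        echo none
    fi
}

# Example (paths assumed from this page):
#   storage_dir /export/nw-syd-backup1/1/backups/users/in/zain/
#   archive_state /export/nw-syd-backup1/1/backups/users zain
```

Both functions only read the filesystem, so they can be tried against a scratch copy of the layout.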

Backup script

The main backup script is called: ~backups/bin/backup

This script contains most of the commands used by the backup system, each of which may be run by passing the command and its arguments to this script.

Usage
backup [-h] [-l log] [-n] [-q] command [command_args]
Function
This script deals with CSE rsnapshot backups made for any CSE user who owns a home, or other directory, that is hosted on a CSE file server.
This script should usually be run on the CSE backup server: nw-syd-backup1.
Options
-h [ command | topic ]
Print help for the specific command or topic passed. If no command or topic is passed, print this general help and exit.
Topics and commands are summarised below.
-l logfile
Log messages to logfile (Default: ~backups/log/)
-n
Do not make any changes - just report what would be done.
-q
Do not reproduce messages on STDERR.
command [command_args]
Run the backup command with its optional command arguments.
Backup commands are summarised below.
Full descriptions of commands and their args may be produced by running: backup -h command
Topics
general
Usage: backup [-h] [-l log] [-n] [-q] command [command_args]
locations
Locations used by the CSE Backups System
overview
Overview of CSE Backup System
Commands
diff
User related diff.
fixprimary
Fix primary home archive names if necessary.
mkspec
Prepare backup archive directories and specfiles for users.
movearch
Copy backup archives to another backup filesystem.
movesource
Change the recorded source directory for a user's backup archive.
purge
Shifts and/or deletes archives of expired users.
rename
Renames archive(s) of a CSE user whose username has changed.
report
Summarise logs to report on the last backups.
resurrect
Resurrect a user's inactive backup archive from deleted archives.
run
Create and/or update CSE New World user backup archives.

Version: 2023.11
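
The usage line above puts the general options before the command and the command's own arguments after it. A minimal sketch of that ordering follows. The helper dry_run_cmdline is hypothetical, and the full path assigned to BACKUP is an assumption obtained by expanding ~backups/ as described at the top of this page. The helper only prints a command line and never runs the script.

```bash
# Assumed full path of the main backup script (~backups/bin/backup).
BACKUP=/export/nw-syd-backup1/1/backups/bin/backup

# Hypothetical helper: print (but do not run) a dry-run command line.
# The general option -n goes before the command; the command's own
# arguments follow the command.
dry_run_cmdline() {
    local command=$1
    shift
    printf '%s -n %s' "$BACKUP" "$command"
    if [ "$#" -gt 0 ]; then
        printf ' %s' "$@"
    fi
    printf '\n'
}

dry_run_cmdline run -u zain
# prints: /export/nw-syd-backup1/1/backups/bin/backup -n run -u zain
dry_run_cmdline purge
```

Printing the command first makes it easy to check the option order before running anything on the backup server.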

backup commands

COMMANDS=(

   [run]="Usage:
   $ME [-genopts] run [-m MAXSPEC] [-N[123]] [-n]
           [-p MAXPROC] [-s SCRIPT] [-u user]

Function:

   Create and/or update CSE New World user backup archives.
   (1)  Create lists of all CSE (user/directories), classify and
        fix primary homes if necessary (See: '-h fixprimary')
        (files in $BU_VAR/)
   (2a) Ensure a backup archive exists for each (user/dir).
        (archives in $BU_USERS/)
   (2b) Create rsnapshot specfiles for each (user/dir).
        (specfiles in $BU_TMP/)
   (3)  Use xargs to run consecutive invocations of SCRIPT,
        passing a different specfile to each process each time.
        (See '-s SCRIPT' below)

Options:

   -D   Run an xargs process for each physical disk storing archives.
        This attempts to evenly distribute simultaneous disk activity
        across all physical disks.
        Default: Only run one xargs process.
   -m MAXSPEC
        Each xargs process will call SCRIPT no more than MAXSPEC times.
        Default: Call SCRIPT until all specfiles are processed.
   -N1  Do NOT (re)create user dir lists.
        (ie: Do not run step 1)
   -N2  Do NOT (re)create rsnapshot dirs or specfiles.
        (ie: Do not run steps 2a, 2b)
   -N3  Do NOT use xargs to call SCRIPT at all.
        (ie: Do not run step 3).
   -n   Equivalent to -N3 or '-m 0'.
   -p MAXPROC
        Each xargs runs MAXPROC simultaneous process sequences of SCRIPT.
        (Default: with '-D': $BU_MAX_INSTANCES_D
                  without '-D': $BU_MAX_INSTANCES)
   -s SCRIPT
        xargs consecutively invokes this SCRIPT (in step 3), passing
        SCRIPT another specfile each time.
        Default SCRIPT - $CALL_RSNAPSHOT:
        1) Passes the specfile to rsnapshot with the required options.
        2) Removes the specfile after rsnapshot has run.
        Note: SCRIPT may be any program. It need not run rsnapshot,
        nor must it do anything with the specfile passed.
   -u user
        Create the specfiles for just this user (unless '-N2'),
        Run SCRIPT passing just user's specfiles (unless '-n')
        (Default: Create and run specfiles of all CSE users).
"

   [mkspec]="Usage:
   $ME [-genopts] mkspec [-D] [-nfs] [file]

Function:

   Prepare backup archive directories and specfiles for users.
   Read the input file of format: 'directory user [host]',
   and for each (user/dir) for which host is defined:
   (1) Find or create an rsnapshot archive for (user/dir):
       ($BU_USERS/\$LASTTWO/\$user[.n]/)
   (2) Create individualised rsnapshot specfiles
       (By default in $BU_TMP/)
   If host is not defined, but an rsnapshot archive has been
   found for this (user/dir), then move the archive dir to
   $BU_USERS/\$LASTTWO/$DELETED_DIR/\$user[.n].
   If host is not defined, but no rsnapshot archive is found,
   then silently ignore the (user/dir).
   This is the function called by the 'run' command to perform
   its steps (2a) and (2b). (See '$ME -h run')

Options:

   -D   Create the specfiles in $BU_TMP/disk.N/ depending
        on where the archive is stored.
   -nfs The rsnapshot specfile specifies that the directory
        should be accessed over NFS, rather than via an SSH
        connection to host (which is the default).
   file This is the file of format: 'directory user [host]'.
        If host is present then the directory exists on host.
        (Default file: $BU_VAR/$D_SSH)
"

   [report]="Usage:
   $ME [-genopts] report [-d date]

Function:

   Summarise logs to report on the last backups.

Options:

   -d date
        Report on all backups run on this date.
        Date is specified as yyyy.mm.dd
        *NOT YET IMPLEMENTED*

"

   [diff]="Usage:
   $ME [-genopts] diff user

Function:

   User related diff.
   *NOT YET IMPLEMENTED*

"

   [movearch]="Usage:
   $ME movearch (archname tofs | -f file)

Function:

   Copy backup archives to another backup filesystem.
   The backup archive archname is copied
   from: $BU_USERS/\$LASTTWO/\$archname
         which is a soft link to its actual location at:
         \$FROMFS/\$LASTTWO/\$archname
   to:   \$tofs/\$LASTTWO/\$archname.
   If the copy is successful, the soft link is adjusted to the new location.

Options:

   archname  The name of the specific archive to be moved
             This is usually of the form: 'username[.[0-9]]'
   tofs      Full pathname of the destination filesystem
             (eg: $BU_BACKUP/disks/3)
   -f file   Read file containing lines: 'archname tofs',
             and copy and move each archive to tofs.
             Note: file of '-' will cause the script to read from STDIN.
   Default:  (no options or args) Read from STDIN. (ie: '-f -' )

Note:

   This command does not change the recorded source of the archive.
   If you have moved the source of the archive (ie: the location of the
   user's home dir) from one home directory server/fs to another,
   then use the movesource command.
   See: '$ME -h movesource'

"

   [movesource]="Usage:
   $ME movesource user olddir newdir

Function:

   Change the recorded source directory for a user's backup archive.

Details:

   CSE Backup archives record their source directory in two places:
   (1) $BU_USERS/\$LASTTWO/\$username[.n]/$BU_SOURCEDIR
       This record is maintained by '$ME'.
   (2) In each retention level archive pathname:
       $BU_USERS/\$LASTTWO/\$username[.n]/daily.N/\$username/\$source_dir/
       This source_dir pathname is maintained by 'rsnapshot'.

Use:

   Run this command when a user's home (or other directory) has
   had its pathname changed in any way, either because:
   (a) Some directories in the pathname were renamed (but the contents were
       not physically relocated), or
   (b) The directory and its contents were relocated/copied from one disk
       server's file system to another, resulting in a pathname change.

Example:

   $ME movesource zain /import/kamen/1/zain /import/glass/A/zain

Note:

   This command only changes the archive source records in (1) and (2) above.
   (a) or (b) are expected to be done separately elsewhere.
   If \$username has also changed, and not just the source directory's
   pathname, then you should also use the 'rename' command.
   (See: '$ME -h rename')

"

   [rename]="Usage:
   $ME rename oldusername newusername

Function:

   Renames archive(s) of a CSE user whose username has changed.

Details:

   In general, backup archives belonging to username have the following
   access pathname:
   $BU_USERS/\$LASTTWO/\$username[.n]/daily.N/\$username/\$source_pathname/
   This command changes all occurrences of \$username in such access pathnames
   from 'oldusername' to 'newusername', also changing \$LASTTWO where
   necessary.

Note:

   This command will not change any further occurrences of \$username
   within \$source_pathname. If \$source_pathname has changed in any
   way, then you should also use the 'movesource' command.
   (See: '$ME -h movesource').

"

   [resurrect]="Usage:
   $ME resurrect user [sourcedir]

Function:

   Resurrect a user's inactive backup archive from deleted archives.
   If sourcedir is passed, only resurrect the user's inactive backup archive
   coming from sourcedir, otherwise resurrect all inactive backup archives
   belonging to user.

Details:

   Active CSE users have backup archives kept in:
   (a)	$BU_USERS/\$LASTTWO/username[.n]/
   Expired CSE users have their backup archives made inactive and moved to:
   (b)	$BU_USERS/\$LASTTWO/$DELETED_DIR/username[.n]/
   If an expired CSE user has their account reactivated, then this command
   reactivates their backup archive by moving it from (b) to (a).

Note:

   This command only resurrects the backup archive (if it exists), so
   that the user's home directory(s) may be backed up once more.
   It does not restore the user's original home directory from the backup
   archive. Run the command 'restore' to do this.
   The source of the resurrected backup archives is assumed to stay unchanged.
   If the user's original directory(s) is restored into different
   file systems/sourcedirs, then in addition to resurrecting the user's
   original archive, you may need to use the 'movesource' and/or 'rename'
   command.

See Also:

   '$ME -h purge'		- for what happens to deleted archives.
   '$ME -h movesource'	- for moving archive sources.
   '$ME -h rename'		- for changing user names.
   '$ME -h restore'	- for restoring user directory from archives.

"

   [purge]="Usage:
   $ME [-genopts] purge [-u user] [-m MAX] [-n]

Function:

   Shifts and/or deletes archives of expired users.
   This command:
   a) Shifts all inactive/deleted archive directories stored in:
      $BU_USERS/\$LASTTWO/$DELETED_DIR/
   b) Removes inactive/deleted archives with no retention directories left.

Options:

   -u user
        Just shift and/or purge rsnapshot archives belonging to this user.
        (Default: All user archives in $DELETED_DIR/ directories)
   -m MAX
        Only shift and/or purge MAX users.
   -n   Do not actually shift or remove any rsnapshot archives.
        Just list deleted archives on STDOUT.
   -v   verbose listing - include disk usage

Background:

   When CSE users are expired, their backup archives are moved
   from $BU_USERS/\$LASTTWO/user[.n]/
   to   $BU_USERS/\$LASTTWO/$DELETED_DIR/user[.n]/
   This command shifts the retention directories in these expired users'
   backup archives and removes them when they eventually become empty.

"

   [fixprimary]="Usage:
   $ME	fixprimary

Function:

   Fix primary home archive names if necessary.
   For every CSE user with a CSE home and a CSE archive, ensure the
   user's primary home directory is stored in their primary user archive
   (ie: not stored in an archive with a '.N' suffix).
   This command uses renamearch to rename archives if necessary.