New World Backup System: Difference between revisions
Revision as of 10:31, 7 February 2024
Overview
- The NW (New World) Backup System is built around rsnapshot(1) as its underlying technology
- rsnapshot is a remote filesystem snapshot utility that uses rsync(1);
- rsync is a fast and versatile, remote (and local) file and directory copying tool that
- Handles local system snapshots directly, and
- Handles remote system snapshots using ssh(1).
- The CSE backup system is run on the New World AWS machine: nw-syd-backup1.
- The scripts, logs, and archives associated with NW backups are generally found in the directory
nw-syd-backup1:/export/nw-syd-backup1/1/backups/
which is referred to below as ~backups/
Operation
- A separate rsnapshot archive is kept of each CSE directory belonging to every CSE user.
- These archives are accessible via the directory:
~backups/users/$LASTTWO/$user[.n]/
- where:
- $LASTTWO is the last two characters of the user's CSE username,
- $user is the name of the CSE user whose directory is stored in the archive.
- [.n] is a unique numeric suffix assigned to each of the user's different CSE directories (if the user has more than one).
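The naming scheme above can be illustrated with a small shell sketch. It is only an illustration: the helper name archive_path and the placeholder default for BU_USERS (standing in for ~backups/users) are assumptions, not code from the backup script.

```shell
#!/bin/sh
# Sketch: derive the archive directory for a user (hypothetical helper).
# BU_USERS stands in for ~backups/users; the default below is a placeholder.
BU_USERS=${BU_USERS:-/path/to/backups/users}

archive_path() {
    user=$1          # CSE username
    suffix=${2:-}    # optional numeric suffix for an additional directory
    # $LASTTWO is the last two characters of the username.
    lasttwo=$(printf '%s' "$user" | tail -c 2)
    printf '%s/%s/%s%s/\n' "$BU_USERS" "$lasttwo" "$user" "${suffix:+.$suffix}"
}

archive_path zain      # prints a path ending in users/in/zain/
archive_path zain 1    # prints a path ending in users/in/zain.1/
```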
- The copy (or snapshot) of the user's directory that was made N days ago is found in the user's archive directory
$user[.n]/
under the directory named
daily.N/
- Each archive stores a maximum of 30 daily snapshots of the user's directory, one per day.
- These daily.N/ directories are renamed (ie: shifted up) each day as they age, until finally the copy made 30 days ago (named daily.30/) is deleted.
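The daily shift can be pictured with the following sketch. rsnapshot performs this rotation itself, so the code below is purely illustrative; the function name shift_daily and its MAX argument are assumptions.

```shell
#!/bin/sh
# Sketch of the daily shift (illustration only; rsnapshot does this itself).
# shift_daily ARCHIVE [MAX]: drop daily.MAX, then rename daily.N to daily.N+1.
shift_daily() {
    archive=$1
    max=${2:-30}
    # The oldest copy falls off the end.
    rm -rf "$archive/daily.$max"
    # Work from the oldest to the newest so each target name is already free.
    n=$((max - 1))
    while [ "$n" -ge 0 ]; do
        if [ -d "$archive/daily.$n" ]; then
            mv "$archive/daily.$n" "$archive/daily.$((n + 1))"
        fi
        n=$((n - 1))
    done
}
```

After a shift there is no daily.0/ until the next snapshot is taken.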
- The path:
basename/export/hostname/partition/basename/
is always inserted by rsnapshot between each daily.N/ directory and the actual snapshot/copy of the user's directory.
- There does not seem to be a way of preventing rsnapshot from inserting this pathname, given the particular organisation of user archives used in the NW backups.
- basename will usually be the same as the username.
- Although this pathname may be used to uniquely identify the archive's origin, the file
$user[.n]/.bu_source
is more conveniently used to do this.
- Each file and directory in the user's backup archive is given the same access (ie: ownership and permissions) as the corresponding file or directory in their original CSE (home) directory.
- Note however:
- The filesystems storing the CSE archives are mounted read-only outside of the backup server.
- This means that no user can change the contents or access permissions of any file or directory in their archive, even if they appear to have permission to do so.
- It also means that users may be unable to access files or directories in their archive if these were originally copied with the wrong access permissions.
- In this case, the users will need the help of the system staff (root) to access these files/directories.
- Once a user's CSE account has expired, and/or the user's CSE directory has been removed from its host server, the user's archive of their directory will also be moved aside
- from
~backups/users/$LASTTWO/$user[.n]/
- to
~backups/users/$LASTTWO/.deleted/$user[.n]/
- However:
- The daily.N/ directories will continue to be shifted up daily until they have all been removed.
- If the user's CSE directory is restored to the CSE host server before all daily.N/ directories have been removed, then the remaining daily.[N..M]/ directories will be renamed daily.[0..M-N]/, and made available to the user once more.
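The renumbering of daily.[N..M]/ to daily.[0..M-N]/ described above could look roughly like this. It is a sketch under assumptions: the helper names daily_indices and renumber_daily are hypothetical and are not taken from the backup script.

```shell
#!/bin/sh
# Sketch: renumber the remaining daily.[N..M] directories of a restored
# archive to daily.[0..M-N]. Hypothetical helpers, for illustration only.
daily_indices() {
    # List the numeric suffixes of the daily.N directories, lowest first.
    ls "$1" | sed -n 's/^daily\.\([0-9][0-9]*\)$/\1/p' | sort -n
}

renumber_daily() {
    archive=$1
    lowest=$(daily_indices "$archive" | head -n 1)
    # Nothing to do if there are no snapshots or they already start at 0.
    [ -n "$lowest" ] && [ "$lowest" -gt 0 ] || return 0
    # Ascending order guarantees each target name is already free.
    for n in $(daily_indices "$archive"); do
        mv "$archive/daily.$n" "$archive/daily.$((n - lowest))"
    done
}
```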
Locations
- CSE's backup server
nw-syd-backup1@cse.unsw.edu.au
- ~backups
nw-syd-backup1:/export/nw-syd-backup1/1/backup/
- ~backups/users/$LASTTWO/$user[.n]/
- This is the location of $user's rsnapshot archives, such that:
- $LASTTWO is the last two characters of the user's CSE username,
- $user is the name of the CSE user whose directory is stored in the archive.
- If the user has been allocated more than one CSE directory, then [.n] is a unique numeric suffix assigned to all but one of the user's CSE directories.
- The archive without the numeric suffix should always be the archive that stores the user's main home directory.
- $user[.n] is actually a soft link to the actual storage location of the backup archive:
~backups/disks/$mountpoint/$LASTTWO/$user[.n]
- ~backups/users/$LASTTWO/$user[.n]/.bu_source
- This file identifies the source of the actual user's directory that is stored in this rsnapshot archive. It will most likely be of the form:
/import/bach/1/username
- ~backups/users/$LASTTWO/.deleted/$user[.n]/
- If $user's source directory, as identified by the file: $user[.n]/.bu_source, no longer exists on the source filesystem, then the user's archive is moved into this .deleted/ directory.
- ~backups/disks/$mountpoint/$LASTTWO/$user[.n]/
- This is the actual storage location of the backup archive for $user[.n].
- There may be many different $mountpoint directories, each of which represents a separate (possibly virtual) disk storage filesystem on which users' archives may be stored. This provides a way of managing disk space allocation.
- The directory: ~backups/users/$LASTTWO/$user[.n]/ is actually a symlink to the actual directory in ~backups/disks/$mountpoint/$LASTTWO/$user[.n] that has been allocated to this user's backup archive.
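To see where a given archive is physically stored, the symlink can be resolved with readlink(1). The helper below is a minimal sketch; the name archive_storage is an assumption.

```shell
#!/bin/sh
# Sketch: print the real storage location behind an archive symlink.
# archive_storage is a hypothetical helper name.
archive_storage() {
    link=${1%/}    # strip a trailing slash so the link itself is tested
    if [ -L "$link" ]; then
        readlink "$link"
    else
        # Not a symlink: the archive is stored where it appears to be.
        printf '%s\n' "$link"
    fi
}
```

For an archive under ~backups/users/, the output would be expected to be a path under ~backups/disks/$mountpoint/.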
- ~backups/var/
- The location of various files containing (user/dir) data.
- These files group (user/dir) data into different classifications which are used to determine which (user/dir)s are to be snapshotted, and how their archives are to be managed.
- ~backups/lib/
- The location of most of the configuration files used by the NW backup system.
- ~backups/bin/
- The location of most of the scripts used by the backup system, including the main backup script itself (described in the next section).
Scripts
runbackups.sh
This is the main script, which calls all the other backup-related scripts.
- Location
~backups/bin/runbackups.sh
- Called by
- /etc/cron.d/rsnapshot
backup
This is the main script that contains most of the bespoke commands used by the backup system.
- Location
~backups/bin/backup
- Called by
- runbackups.sh
Many of the commands within backup can be run individually by passing them as arguments to backup, along with their desired options.
The backup -h option produces help documentation describing backup and its constituent commands. This documentation is also duplicated below, with additional details where this might be helpful.
Man Page
- Usage
backup [-h] [-l log] [-n] [-q] command [commandargs]
- Function
- This script deals with CSE rsnapshot backups made for any CSE user who owns a home, or other directory, that is hosted on a CSE file server.
- This script should usually be run on the CSE backup server: nw-syd-backup1.
- Options
- -h [ command | topic ]
- Print help for the specific command or topic passed. If no command or topic is passed, print this general help and exit.
- Topics and commands are summarised below.
- -l logfile
- Log messages to logfile (Default: ~backups/log/)
- -n
- Do not make any changes - just report what would be done.
- -q
- Do not reproduce messages on STDERR.
- command [command_args]
- Run the backup command with its optional command arguments.
- Backup commands are summarised below.
- Full descriptions of commands and their args may be produced by running:
backup -h command
and are also included in what follows, under their own subsections.
- Topics
- general
- Usage: backup [-h] [-l log] [-n] [-q] command [commandargs]
- locations
- Locations used by the CSE Backups System
- overview
- Overview of CSE Backup System
- Commands
- diff
- User related diff.
- fixprimary
- Fix primary home archive names if necessary.
- mkspec
- Prepare backup archive directories and specfiles for users.
- movearch
- Copy backup archives to another backup filesystem.
- movesource
- Change the recorded source directory for a user's backup archive.
- purge
- Shifts and/or deletes archives of expired users.
- rename
- Renames archive(s) of a CSE user whose username has changed.
- report
- Summarise logs to report on the last backups.
- resurrect
- Resurrect a user's inactive backup archive from deleted archives.
- run
- Create and/or update CSE New World user backup archives.
run
This is the main backup command, which actually runs the rsnapshot backups for each CSE user.
- Usage
backup [-genopts] run [-m MAXSPEC] [-N[123]] [-n] [-p MAXPROC] [-s SCRIPT] [-u user]
- Function
- Create and/or update CSE New World user backup archives.
- 1. Create lists of all CSE (user/directories) in ~backups/var/.
- Classify these directory sources.
- Fix primary homes if necessary (See: fixprimary command below).
- 2. Create specfiles:
- 2.1 Ensure a backup archive exists for each (user/dir).
- (archives in $BU_USERS/)
- 2.2 Create rsnapshot specfiles for each (user/dir).
- (specfiles in /var/tmp/backups/)
- 3. Use xargs to run consecutive invocations of SCRIPT, passing a different specfile to each process each time.
- (See '-s SCRIPT' option below)
- Options
- -D
- Run an xargs process for each physical disk storing archives.
- This attempts to evenly distribute simultaneous disk activity across all physical disks.
- Default: Only run one xargs process, and ignore details of physical storage disks.
- -m MAXSPEC
- Each xargs process will call SCRIPT no more than MAXSPEC times.
- Default: Call SCRIPT until all specfiles are processed.
- -N1
- Do NOT (re)create user dir lists. (ie: Do not run Function step 1)
- -N2
- Do NOT (re)create rsnapshot dirs or specfiles. (ie: Do not run Function steps 2.1, 2.2)
- -N3
- Do NOT use xargs to call SCRIPT at all. (ie: Do not run Function step 3).
- -n
- Equivalent to -N3 or '-m 0'.
- -p MAXPROC
- Each xargs runs MAXPROC simultaneous process sequences of SCRIPT.
- Default:
- With '-D': 1;
- Without '-D': 6
- -s SCRIPT
- xargs consecutively invokes this SCRIPT (in step 3), passing SCRIPT another specfile each time.
- Default SCRIPT:
~backups/bin/run_rsnapshot
- Passes the specfile to rsnapshot with the required options.
- Removes the specfile after rsnapshot has run.
- Note: SCRIPT may be any program. It need not run rsnapshot, nor must it do anything with the specfile passed.
- -u user
- Create the specfiles for just this user (unless '-N2'),
- Run SCRIPT passing just user's specfiles (unless '-n')
- (Default: Create and run specfiles of all CSE users).
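The way xargs hands specfiles to SCRIPT, as described for the -p and -s options, can be sketched as follows. This is not the code used by backup: the function name run_specfiles is hypothetical, and it relies on the -print0, -0 and -P options that GNU and BSD find/xargs provide.

```shell
#!/bin/sh
# Sketch: run SCRIPT once per specfile, MAXPROC at a time (illustration only).
# run_specfiles SPECDIR SCRIPT [MAXPROC]
run_specfiles() {
    specdir=$1
    script=$2
    maxproc=${3:-6}    # 6 is the documented default without '-D'
    # Skip xargs entirely when there are no specfiles to process.
    [ -n "$(find "$specdir" -type f | head -n 1)" ] || return 0
    # -n 1: one specfile per invocation; -P: number of parallel invocations.
    find "$specdir" -type f -print0 | xargs -0 -n 1 -P "$maxproc" "$script"
}
```

Because SCRIPT may be any program, a harmless stand-in (for example a script that only prints its argument) can be passed to check which specfiles would be processed.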
mkspec
- Usage
- backup [-genopts] mkspec [-D] [-nfs] [file]
- Function
- Prepare backup archive directories and specfiles for users.
- Read the input file of format: 'directory user [host]', and for each (user/dir) for which host is defined:
- Find or create an rsnapshot archive for (user/dir): ($BU_USERS/$LASTTWO/$user[.n]/)
- Create individualised rsnapshot specfiles (By default in $BU_TMP/)
- If host is not defined, but an rsnapshot archive has been found for this (user/dir), then move the archive dir to $BU_USERS/$LASTTWO/$DELETED_DIR/$user[.n].
- If host is not defined, but no rsnapshot archive is found, then silently ignore the (user/dir).
This is the function called by the 'run' command to perform its steps (2.1) and (2.2). (See '$ME -h run')
Options:
-D Create the specfiles in $BU_TMP/disk.N/ depending on where the archive is stored.
-nfs The rsnapshot specfile specifies that the directory should be accessed over NFS, rather than via an SSH connection to host (which is the default).
file This is the file of format: 'directory user [host]'. If host is present then the directory exists on host. (Default file: $BU_VAR/$D_SSH)
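The three cases mkspec distinguishes when reading its input file can be sketched as below. This is a simplified illustration, not the real implementation: the function name classify_sources is hypothetical, it only prints the decision for each line, and it ignores the [.n] suffix by looking only at the archive named after the user.

```shell
#!/bin/sh
# Sketch: classify 'directory user [host]' lines read from STDIN.
# classify_sources USERS_DIR   (USERS_DIR plays the role of $BU_USERS)
classify_sources() {
    users_dir=$1
    while read -r dir user host; do
        [ -n "$dir" ] || continue    # skip blank lines
        lasttwo=$(printf '%s' "$user" | tail -c 2)
        archive="$users_dir/$lasttwo/$user"
        if [ -n "$host" ]; then
            # Host defined: the directory exists and should be backed up.
            echo "backup $user $dir $host"
        elif [ -d "$archive" ]; then
            # No host but an archive exists: it would be moved aside.
            echo "delete $user $dir"
        fi
        # No host and no archive: silently ignored.
    done
}
```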
report
Usage:
$ME [-genopts] report [-d date]
Function:
Summarise logs to report on the last backups.
Options:
-d date
Report on all backups run on this date. Date is specified as yyyy.mm.dd
*NOT YET IMPLEMENTED*
diff
Usage:
$ME [-genopts] diff user
Function:
User related diff.
*NOT YET IMPLEMENTED*
movearch
Usage:
$ME movearch (archname tofs | -f file)
Function:
Copy backup archives to another backup filesystem.
The backup archive archname is copied from: $BU_USERS/$LASTTWO/$archname
which is a soft link to its actual location at: $FROMFS/$LASTTWO/$archname
to: $tofs/$LASTTWO/$archname. If the copy is successful, the soft link is adjusted to the new location.
Options:
archname The name of the specific archive to be moved. This is usually of the form: 'username[.[0-9]]'
tofs Full pathname of the destination filesystem (eg: $BU_BACKUP/disks/3)
-f file Read file containing lines: 'archname tofs', and copy and move each archive to tofs. Note: file of '-' will cause the script to read from STDIN.
Default: (no options or args) Read from STDIN. (ie: '-f -' )
Note:
This command does not change the recorded source of the archive. If you have moved the source of the archive (ie: the location of the user's home dir) from one home directory server/fs to another, then use the movesource command. See: '$ME -h movesource'
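The copy-then-relink behaviour described for movearch might be sketched like this. It is an illustration under assumptions: the function name move_archive is hypothetical, cp -a stands in for whatever copy method the real command uses, the symlink target is assumed to be an absolute path, and the original copy is left in place.

```shell
#!/bin/sh
# Sketch: copy an archive to another filesystem, then repoint its symlink.
# move_archive USERS_DIR ARCHNAME TOFS   (hypothetical helper)
move_archive() {
    users_dir=$1
    archname=$2
    tofs=$3
    # $LASTTWO comes from the username, ie: archname without its .N suffix.
    user=$(printf '%s\n' "$archname" | sed 's/\.[0-9][0-9]*$//')
    lasttwo=$(printf '%s' "$user" | tail -c 2)
    link="$users_dir/$lasttwo/$archname"
    from=$(readlink "$link") || return 1     # current storage location
    dest="$tofs/$lasttwo/$archname"
    [ ! -e "$dest" ] || return 1             # never overwrite an existing copy
    mkdir -p "$tofs/$lasttwo" || return 1
    cp -a "$from" "$dest" || return 1        # copy, preserving ownership/modes
    # Only adjust the soft link once the copy has succeeded.
    ln -sfn "$dest" "$link"
}
```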
movesource
Usage:
$ME movesource user olddir newdir
Function:
Change the recorded source directory for a user's backup archive.
Details:
CSE Backup archives record their source directory in two places:
(1) $BU_USERS/$LASTTWO/$username[.n]/$BU_SOURCEDIR
This record is maintained by '$ME'.
(2) In each retention level archive pathname: $BU_USERS/$LASTTWO/$username[.n]/daily.N/$username/$source_dir/
This source_dir pathname is maintained by 'rsnapshot'.
Use:
Run this command when a user's home (or other directory) has had its pathname changed in any way, either because:
(a) Some directories in the pathname were renamed (but the contents were not physically relocated), or
(b) The directory and its contents were relocated/copied from one disk server's file system to another, resulting in a pathname change.
Example:
$ME movesource zain /import/kamen/1/zain /import/glass/A/zain
Note:
This command only changes the archive source records in (1) and (2) above. (a) or (b) are expected to be done separately elsewhere.
If $username has also changed, and not just the source directory's pathname, then you should also use the 'rename' command. (See: '$ME -h rename')
rename
Usage:
$ME rename oldusername newusername
Function:
Renames archive(s) of a CSE user whose username has changed.
Details:
In general, backup archives belonging to username have the following access pathname:
$BU_USERS/$LASTTWO/$username[.n]/daily.N/$username/$source_pathname/
This command changes all occurrences of $username in such access pathnames from 'oldusername' to 'newusername', also changing $LASTTWO where necessary.
Note:
This command will not change any further occurrences of $username within $source_pathname. If $source_pathname has changed in any way, then you should also use the 'movesource' command. (See: '$ME -h movesource').
resurrect
Usage:
$ME resurrect user [sourcedir]
Function:
Resurrect a user's inactive backup archive from deleted archives. If sourcedir is passed, only resurrect the user's inactive backup archive coming from sourcedir, otherwise resurrect all inactive backup archives belonging to user.
Details:
Active CSE users have backup archives kept in:
(a) $BU_USERS/$LASTTWO/username[.n]/
Expired CSE users have their backup archives made inactive and moved to:
(b) $BU_USERS/$LASTTWO/$DELETED_DIR/username[.n]/
If an expired CSE user has their account reactivated, then this command reactivates their backup archive by moving it from (b) to (a).
Note:
This command only resurrects the backup archive (if it exists), so that the user's home directory(s) may be backed up once more. It does not restore the user's original home directory from the backup archive. Run the command 'restore' to do this.
The sources of the resurrected backup archives are assumed to stay unchanged. If the user's original directory(s) are restored into different file systems/sourcedirs, then in addition to resurrecting the user's original archive, you may need to use the 'movesource' and/or 'rename' command.
See Also:
'$ME -h purge' - for what happens to deleted archives.
'$ME -h movesource' - for moving archive sources.
'$ME -h rename' - for changing user names.
'$ME -h restore' - for restoring user directory from archives.
purge
Usage:
$ME [-genopts] purge [-u user] [-m MAX] [-n]
Function:
Shifts and/or deletes archives of expired users. This command:
a) Shifts all inactive/deleted archive directories stored in: $BU_USERS/$LASTTWO/$DELETED_DIR/
b) Removes inactive/deleted archives with no retention directories left.
Options:
-u user
Just shift and/or purge rsnapshot archives belonging to this user. (Default: All user archives in $DELETED_DIR/ directories)
-m MAX
Only shift and/or purge MAX users.
-n Do not actually shift or remove any rsnapshot archives. Just list deleted archives on STDOUT.
-v Verbose listing: include disk usage.
Background:
When CSE users are expired, their backup archives are moved from
$BU_USERS/$LASTTWO/user[.n]/
to $BU_USERS/$LASTTWO/$DELETED_DIR/user[.n]/
This command shifts the retention directories in these expired users' backup archives and removes the archives when they eventually become empty.
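Step (b) of purge, removing archives that no longer hold any retention directories, could be approximated as follows. This is only a sketch: the function name purge_empty and the DRY_RUN variable (mimicking the -n option) are assumptions, and the archives are treated as plain directories.

```shell
#!/bin/sh
# Sketch: remove deleted archives that have no daily.N directories left.
# purge_empty DELETED_DIR   (set DRY_RUN=1 to list without removing)
purge_empty() {
    deleted_dir=$1
    for archive in "$deleted_dir"/*/; do
        [ -d "$archive" ] || continue    # also skips an unexpanded glob
        archive=${archive%/}
        has_daily=
        for d in "$archive"/daily.*; do
            [ -d "$d" ] && has_daily=1
        done
        if [ -z "$has_daily" ]; then
            echo "purging $archive"
            [ -n "${DRY_RUN:-}" ] || rm -rf "$archive"
        fi
    done
}
```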
fixprimary
Usage:
$ME fixprimary
Function:
Fix primary home archive names if necessary. For every CSE user with a CSE home and a CSE archive, ensure the user's primary home directory is stored in their primary user archive (ie: not stored in an archive with a '.N' suffix).
This command uses renamearch to rename archives if necessary.