
Loss of data is a common problem while maintaining a system. The cause can be hardware or software failure, or end user errors. You can prevent data loss by maintaining system backups. A system log records all activities of the system. While the system is running, backup, application, and daemon processes are recorded in files known as system logs. A daemon is a process that runs in the background and performs a specified operation at predefined intervals or in response to certain events. Linux automatically creates logs, and you can explicitly instruct Linux where and how to maintain them.

This ReferencePoint discusses the need for backups, the types of media used to store backups, how to plan a backup strategy, and how to perform backup and restore operations. It also discusses the importance of system logs and how they are maintained in Linux.

Backup and Restore

An ideal backup is reliable, easy to use, and can be performed in a short time. You need to be familiar with the advantages and disadvantages of each backup to plan a backup strategy. For example, a backup stored at a distant location from the office is useful for emergency recovery in the event of a fire or theft. But you might not be able to restore data immediately because of the backup being stored off-site.

You need to select a backup that best suits the requirements of your system. The features of a good backup are reliability, speed, and availability. Reliability means that the data is backed up without any errors on the storage media. You also need to maintain the media in a safe place to avoid tampering. The speed of a backup depends on the processor speed and the amount of time you can spare the system for the backup operation. The backed up data should be readily available. To ensure this, maintain one backup set on-site for daily use and another off-site for emergency and disaster restores.

Types of Backups

Linux supports three types of backups:

  • Full backup: Backup of the entire system. This is also known as Level 0 backup. Full backups are efficient for small systems. If the system contains a large amount of data, a full backup is not preferred because it is time-consuming and occupies a large amount of storage space.

  • Partial backup: Backup of a specified set of files or directories. You can select the important files or directories to be included in the backup.

  • Incremental backup: Backup of files that have been modified since the last full backup. This is commonly known as Level 1 backup. The various levels indicate that each incremental backup builds on the previous incremental backup. This means that a Level 2 backup is a backup of the changes made to the data since the last Level 1 backup. You can determine the number of backup levels and specify a maximum of nine levels. Incremental backups are frequent because they are smaller and easier to manage than full backups.


Note

Files that need to be backed up should not be open or in use. Otherwise, they are not included in the backup.

Instead of making backups, you can archive or compress data. An archive is a single file that contains a set of files. Data compression involves passing the files through various algorithms to make them smaller. Archives are usually compressed to save system storage space. Archives and compressed files can be stored in different formats, as listed in Table 2-5-1:

Table 2-5-1: Archive and Compression Formats

| Extension | Type | Create | Open |
|---|---|---|---|
| .gz | Compressed file | gzip | gunzip |
| .tar | Archive | tar -c | tar -x |
| .tar.gz | Compressed archive | tar and gzip | gunzip and tar |
| .zip | Compressed archive | zip | unzip |
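
The create and open commands for the .tar and .tar.gz formats can be tried on scratch data. The following is a minimal sketch, assuming that tar and gzip are installed; the docs directory and the file names are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch: create and open a .tar and a .tar.gz file in a scratch directory.
set -eu
workdir=$(mktemp -d)                 # hypothetical scratch area
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"
mkdir docs
echo "sample data" > docs/notes.txt

tar -c -f docs.tar docs              # .tar: an archive that is not compressed
gzip -c docs.tar > docs.tar.gz       # .tar.gz: compressed copy; -c keeps docs.tar
mkdir restored
# Open the compressed archive: gunzip first, then tar -x reading from stdin
gunzip -c docs.tar.gz | tar -x -f - -C restored
restored_text=$(cat restored/docs/notes.txt)
echo "$restored_text"
```

The gzip -c form writes the compressed data to the standard output, so the uncompressed archive stays in place for comparison.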


Note

To learn more about compression, see the Understanding Data Compression ReferencePoint.

You can also make a simple backup. A simple backup involves copying files from one location to another. It is useful for small systems with limited data. The command to make a simple backup is:

cp -a /myfiles /root/myfilesbackup

This command copies the contents of the /myfiles directory to the /root/myfilesbackup directory.
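
A simple backup can be checked by comparing the copy with the original. The following sketch assumes scratch directories created with mktemp instead of the real /myfiles and /root paths; the file name report.txt is hypothetical.

```shell
#!/bin/sh
# Sketch: make a simple backup with cp -a and confirm that the copy matches.
set -eu
workdir=$(mktemp -d)                         # stands in for / in this sketch
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/myfiles"
echo "quarterly report" > "$workdir/myfiles/report.txt"

# -a copies recursively and preserves permissions, ownership, and timestamps
cp -a "$workdir/myfiles" "$workdir/myfilesbackup"

# diff -r exits with status 0 only when both directory trees are identical
if diff -r "$workdir/myfiles" "$workdir/myfilesbackup"; then
    copy_status=identical
else
    copy_status=different
fi
echo "backup is $copy_status"
```

If the destination directory already exists, cp places the source directory inside it instead of replacing it, so it is worth checking the result after the first run.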

Backup Devices

In addition to deciding the type of backup, you need to select the appropriate backup medium. A backup medium is the device used to store the backed up data. Several backup devices are compatible with Linux. The advantages and disadvantages of these devices are listed in Table 2-5-2:

Table 2-5-2: Backup Media

| Media | Advantages | Disadvantages | Reliability |
|---|---|---|---|
| SCSI tape | Reliable for regular backups. Can be reused. | Capacity is smaller than that of the available hard drives. Cannot be formatted using Linux. | Good |
| CD-Recordable (CD-R) / CD-Rewritable (CD-RW) | Economical for medium-sized systems. | Capacity is only 650 MB. As a result, it is not a good choice for full backups. | Good |
| Networked storage | Effective solution for regular backups because data is stored on multiple systems on the network. The data is available even if one system crashes. | Data redundancy. | Good |
| Superfloppy, Zip, Jaz | Economical and suitable for small systems. | Capacity is less than that of the available hard drives. | Excellent |
| Floppy disk | Economical and suitable for small systems. | Can only be reused two or three times. | Average |
| Additional onboard hard disk (HD) storage | Economical and reusable. Convenient to use because you do not need to attach any external device to the system. | Unreliable because it is limited to a single storage unit. | Good |
| Removable hard disk | Reliable and reusable. | Not economical. Capacity is less than that of the available hard drives. | Excellent |
| Digital Audio Tape (DAT) | Reliable and reusable. Capacity varies from 2 to 24 GB, which makes it a suitable device for backing up a large amount of data. | Not economical. | Excellent |


Note

In Linux, each device attached to the system is treated as a file. Files representing the devices exist in the /dev directory. The system names each device depending on its type. For example, a floppy drive is named /dev/fd0, and a hard disk drive is named /dev/hda.

Backup and Restore Strategy

To efficiently use the available backups, analyze your system and plan and schedule the backups according to requirements. This backup plan and schedule together make up the backup strategy. A backup strategy depends on the system and the rate of change in data. It should also identify the data to be backed up, the type of backup, and the amount of storage space available.

The files to be backed up include:

  • User files: Files that you or the end users have created

  • Operating system and software files: Files that are required by the system to function properly

The directories that should be backed up are:

  • /home: Stores the directories that you or the end users have created

  • /etc: Stores configuration files created by the system

  • /usr/local: Stores information about the programs installed by the end user on the local system

The directories that should not be backed up are:

  • /tmp: Stores temporary files that programs create while they run, such as during an installation, and that are no longer needed by the system afterward.

  • /proc: Stores data that is automatically generated by the kernel. For example, the file kcore is a dump of the physical memory. You do not need to back up this file because it is a dynamic file.

  • /var: Stores the spool and cache subdirectories. These subdirectories also store dynamic files that do not need to be backed up.


Note

Files that change frequently should be backed up more often than other files.

When scheduling the backup, you need to chart out a plan. One of the most common backup plans is shown in Table 2-5-3:

Table 2-5-3: Backup Schedule-Plan 1

| Day | Backup Level |
|---|---|
| Friday | Level 0 |
| Monday | Level 1 |
| Tuesday | Level 1 |
| Wednesday | Level 1 |
| Thursday | Level 1 |

The plan involves creating a full backup of the system preferably at the end of the week. Creating regular incremental backups helps track any modifications made to the data during the day. The incremental backups made every night from Monday to Thursday along with the full backup made on Friday night ensure an updated backup of the system by the end of the week.

An advantage of using this plan is that restoring the system is easier. You can use the Level 0 backup and the Level 1 backup of the previous evening to restore your system completely. In this plan, you need only two separate media for the backup, one for the Level 0 backup, and the other for the Level 1 backup. You may need additional media if the system undergoes drastic changes over the week.

Another common backup plan used is shown in Table 2-5-4:

Table 2-5-4: Backup Schedule-Plan 2

| Day | Backup Level |
|---|---|
| Friday | Level 0 |
| Monday | Level 1 |
| Tuesday | Level 2 |
| Wednesday | Level 3 |
| Thursday | Level 4 |

This plan is called a multilevel backup. The full backup is made on Friday, and backups are made at different levels from Monday to Thursday. The size of the backed up data decreases with an increase in the level of backup. This is because only the modifications made to the data after the last incremental backup are backed up.

Multilevel backups are small and easy to manage. The disadvantage of this plan is that each backup needs to be made on a separate medium, and each medium must be maintained carefully. If even one medium is damaged, errors occur during the restoration of the system, resulting in data loss.
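
The dependency between levels in a multilevel plan can be sketched as two small shell functions. This is only an illustration of the schedule in Table 2-5-4; the function names backup_level_for_day and levels_needed_for_restore are hypothetical and are not part of any backup tool.

```shell
#!/bin/sh
# Sketch: look up the backup level for a day in the multilevel plan, and
# list the levels whose media are needed to restore after that day's backup.

backup_level_for_day() {
    case "$1" in
        Friday)    echo 0 ;;
        Monday)    echo 1 ;;
        Tuesday)   echo 2 ;;
        Wednesday) echo 3 ;;
        Thursday)  echo 4 ;;
        *)         echo none ;;      # the plan schedules no backup on other days
    esac
}

levels_needed_for_restore() {
    level=$(backup_level_for_day "$1")
    if [ "$level" = none ]; then
        return 1
    fi
    # Each level builds on the one before it, so every level from 0 up to
    # the requested day's level is required.
    i=0
    needed=""
    while [ "$i" -le "$level" ]; do
        needed="$needed$i "
        i=$((i + 1))
    done
    echo "${needed% }"
}

levels_needed_for_restore Wednesday
```

A restore after Wednesday's backup needs four media in this plan, which is why the loss of any single medium affects every later level.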

Backup and Restore Tools

Linux provides a set of standard backup and restore tools. There are three types of tools:

  • Command-Line: tar, cpio, dump, and restore

  • Text-Based: taper and Amanda

  • GUI-Based: KDat and Archiver


Note

Tools, such as tar and cpio, can be used for both backup and restore operations. Some tools complement one another. For example, dump is used to make the backup while restore is used to restore the backup.

While selecting a tool for creating and maintaining backups, adhere to these guidelines:

  • Portability across different backup tools: You should be able to restore the backup using a tool that is different from the one used to create the backup. You should also be able to make a backup on one system and restore it on another, such as from Solaris to Red Hat Linux. To make portable backups, use command-line tools because these are supported by almost all distributions of Linux and Unix.

  • Ease-of-use of the backup tool: The tool used for making backups must provide a user-friendly interface. Linux offers three possible interfaces: Graphical User Interface (GUI), Command-Line Interface (CLI), and the text-based interfaces. GUI is the easiest to use and does not require any knowledge of syntax. CLI should be used if you are comfortable with the required syntax. Text-based interfaces are a combination of GUI and CLI and require basic knowledge of syntax.


Note

You can also use many third party commercial tools for backing up a system. These include Backup and Restore Utility (BRU), PerfectBackup, and Networker. You need to purchase and install these tools on your system.

  • Automated or unattended backups: You should automate backups so that they are performed at regular intervals without any human intervention. For automated or unattended backups, it is imperative to select a backup type and medium that supports such types of backups. To automate backups, write and execute shell scripts using the cron daemon.


Note

To learn more about cron, see the Linux Administration ReferencePoint.

  • Remote backup facility: You should be able to perform backup and restore operations from a remote system. A tool with a CLI or text-based interface enables scheduling of backups on remote systems.

  • Network backup facility: You should be able to perform backup and restore operations to and from networked hosts. A networked host is a computer on the network. The CLI tool tar supports network access to backup devices.

  • Media types supported by the tool: The backup media must be compatible with the backup tool to be used. The cost of the media with respect to reliability, storage capacity, and transfer speed should be minimal.
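
The guideline on automated backups can be illustrated with a short script that cron runs. The following is a sketch under stated assumptions: the script path and schedule in the comment are hypothetical, and the SOURCE_DIR and BACKUP_DIR variables are names introduced here that default to scratch directories so that the sketch is safe to try.

```shell
#!/bin/sh
# Sketch of an unattended backup script. A crontab entry such as
#   30 23 * * 1-5 /usr/local/bin/nightly-backup.sh
# (a hypothetical path and schedule) would run it at 23:30 on weekdays.
set -eu

# In real use SOURCE_DIR might be /home and BACKUP_DIR a mounted backup disk.
SOURCE_DIR=${SOURCE_DIR:-$(mktemp -d)}
BACKUP_DIR=${BACKUP_DIR:-$(mktemp -d)}

stamp=$(date +%Y-%m-%d)                      # one archive per calendar day
archive="$BACKUP_DIR/backup-$stamp.tar"

# -C changes into SOURCE_DIR first, so member names are stored relative to it
tar -c -f "$archive" -C "$SOURCE_DIR" .

# Read the archive back once to confirm that it is readable
tar -t -f "$archive" > /dev/null
echo "backup written to $archive"
```

Because cron runs the script without a terminal, any message that the script prints is typically mailed to the owner of the crontab or discarded, so a real script would usually write to a log file as well.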


Tape Archiver (tar)

Tar is a common CLI tool used in Linux systems. It is a multipurpose tool used for:

  • Archiving: The tar command creates an archive known as tarfile or tarball. You can create a tarfile on a tape or on the local hard disk. This tarfile is a single file that stores a set of related files. When an archive is created, the files existing in the source directory are not destroyed. Restoring an archive does not destroy the structure of the archive. This means the files are restored to their original state without any rearrangement of the directories or the files within them. Archives are also used for long-term purposes.

  • Transporting: Using the tar command, you can create an archive on one system, transfer it, and extract the contents on another system. This means that related files can be transferred over the network as a single unit.

  • Backing up file information: The tar command stores all file attributes, such as file access permissions, information about the end-users and group owners, the size of the files in bytes, and the time when the file was last modified.

The tar command also backs up subdirectories. The GNU’s Not Unix (GNU) version of the tar command is capable of saving multiple files together in a single tape or disk archive. It can also restore individual files from the archive.


Note

GNU is a software system, which is compatible with Unix. It was developed by the Free Software Foundation (FSF). The aim of FSF was to develop non-proprietary software that can be downloaded free of charge, modified, and redistributed.

The tar command was initially used to back up files on a magnetic tape. A magnetic tape stores data in a sequential manner. It cannot store the names of the individual files and merely tracks the position of the file on the tape. When you create a tarfile, the tar command retains the names of the files in addition to storing the files.

The files inside a tarfile are called memberfiles of the archive. You can use the tar command to restore and view memberfiles.

The extension of a tarfile is .tar, but a tarfile with any other extension, such as .abc or .xyz, will also work. The .tar extension is a naming convention used for clarity.

Backing Up and Restoring Using tar

The syntax for backing up and restoring files is the same. The only difference is in the parameters passed to the command.

The syntax for creating a tarfile is:

tar [taskperformed] [tarfilename] [directoryname]

In this syntax, the taskperformed parameter refers to the various options you can select when creating a tarfile. It determines the task performed by the tar command. tarfilename is the name of the archive you are creating, and directoryname is the name of the directory that contains the files you want to archive.


Note

The subdirectories and the files within the subdirectories of the specified directory are also archived.

Table 2-5-5 lists the basic options used by the taskperformed parameter:

Table 2-5-5: Standard Options of the taskperformed Parameter

| Option | Description |
|---|---|
| -A, --catenate, --concatenate | Appends other tarfiles to the end of the archive |
| -c, --create | Creates a new tarfile |
| -d, --diff, --compare | Compares the memberfiles with the files in the system and reports any difference in file size, mode, owner, and modification date |
| -r, --append | Appends files to the existing archive |
| -t, --list | Lists the contents of the archive |
| -u, --update | Appends only the files that are newer than the copies in an existing archive |
| -x, --extract, --get | Extracts or restores files from the archive |
| --delete | Deletes memberfiles from the archive |
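
The compare operation is easiest to understand by changing a file after it has been archived. The following is a minimal sketch, assuming GNU tar; the file names follow the examples used elsewhere in this section, and the compare_result variable is introduced here only for illustration.

```shell
#!/bin/sh
# Sketch: use tar -d to detect that a file changed after it was archived.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "version 1" > file1.txt
tar -c -f testarchive.tar file1.txt
echo "version 2, now longer" > file1.txt     # change the file after archiving

# GNU tar exits with a nonzero status when a memberfile differs from the
# file on disk, and prints what differs (for example, the size).
if tar -d -f testarchive.tar > diff_report.txt 2>&1; then
    compare_result=same
else
    compare_result=different
fi
echo "archive and file system are $compare_result"
cat diff_report.txt
```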

In the taskperformed parameter, you can specify the options to modify tarfiles. These options are listed in Table 2-5-6:

Table 2-5-6: Option Modifiers of the taskperformed Parameter

| Option | Explanation |
|---|---|
| -W, --verify | Checks the archive for errors after writing it and verifies whether the archive was written correctly. |
| --remove-files | Removes the file from the system after appending it to the tarfile. |
| -k, --keep-old-files | Retains the existing files when extracting files from the archive. |
| --overwrite | Overwrites the existing files when extracting files from the archive. |
| --overwrite-dir | Overwrites the directory metadata when extracting files from the archive. |
| -U, --unlink-first | Deletes the file from the file system before extracting it from the archive. |
| --recursive-unlink | Removes an existing directory and its contents from the file system before extracting a directory of the same name from the archive. |
| -S, --sparse | Enables the effective handling of sparse files. A sparse file contains long runs of zero bytes, known as holes. Although no disk space is allocated to the holes, they are still counted when determining the length of the file. |
| -O, --to-stdout | Writes the extracted files to the standard output instead of to the file system. |
| -G, --incremental | Enables the tar command to handle incremental backups using the old GNU-format backup. |
| -g, --listed-incremental=FILE | Enables the tar command to handle incremental backups using the new GNU-format backup. FILE is the snapshot file in which the state of the backed up directories is recorded. |
| --ignore-failed-read | Specifies that the tar command should continue processing and not exit with a nonzero status when unreadable files are encountered. |
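
As an illustration of one of these modifiers, the -O option can be used to read a single memberfile without restoring it to disk. The sketch below assumes scratch data; the names note.txt and notes.tar are hypothetical.

```shell
#!/bin/sh
# Sketch: print a memberfile to the standard output with -O instead of
# writing it back to the file system.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "hello from the archive" > note.txt
tar -c -f notes.tar note.txt
rm note.txt                                   # only the archived copy remains

# -x extracts, -O redirects the extracted data to the standard output
printed=$(tar -x -O -f notes.tar note.txt)
echo "$printed"
```

After the command runs, note.txt still does not exist in the directory, because the data went to the standard output only.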

You can also set the taskperformed parameter to the options for manipulating file attributes, such as the file owner or the group. These options are listed in Table 2-5-7:

Table 2-5-7: Options Used to Handle File Attributes

| Option | Explanation |
|---|---|
| --owner=NAME | Specifies a name for the owner of all newly added files |
| --group=NAME | Specifies a name for the group that owns all newly added files |
| --atime-preserve | Preserves information about the time when the archived files were last accessed |
| -m, --modification-time | Does not restore the time when the file was last modified, so the extracted file carries the time of extraction |
| --same-owner | Tries to extract the files with the same ownership as recorded in the archive |
| --no-same-owner | Extracts the files so that they are owned by you |
| --numeric-owner | Uses numbers instead of names to represent a user or a group |
| -p, --same-permissions, --preserve-permissions | Extracts all information pertaining to file permissions |
| --no-same-permissions | Specifies that information pertaining to permissions of the files should not be extracted |
| -s, --same-order, --preserve-order | Specifies that the list of file names to extract is sorted in the same order as the files appear in the archive |
| --preserve | Has the same effect as specifying both -p and -s |
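
The effect of the ownership options is visible in a verbose listing. The following is a sketch, assuming GNU tar and a system on which the root user and the root group exist with ID 0; the archive name owned.tar is hypothetical.

```shell
#!/bin/sh
# Sketch: record root as the owner and group of archived files, then list
# the archive with names and with numeric IDs.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "data" > file1.txt
# --owner and --group change only what is recorded in the archive,
# so the command does not need to be run as root
tar -c -f owned.tar --owner=root --group=root file1.txt

listing=$(tar -t -v -f owned.tar)                        # shows root/root
numeric_listing=$(tar -t -v --numeric-owner -f owned.tar) # shows 0/0
echo "$listing"
echo "$numeric_listing"
```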

The taskperformed parameter also includes options to select and manipulate media devices. These options are listed in Table 2-5-8:

Table 2-5-8: Options Used to Select and Manipulate Devices

| Option | Explanation |
|---|---|
| -f, --file=ARCHIVE | Specifies the tarfile or device name. |
| --force-local | Indicates that the archive file is a local file even if its name contains a colon, which would otherwise identify a remote host. |
| --rsh-command=COMMAND | Uses the COMMAND command to communicate with remote devices. |
| -[0-7][lmh] | Specifies the drive and its density. |
| -M, --multi-volume | Indicates that the tar command should operate on a multivolume tarfile. A volume is the set of files that are written to the tape in a single backup session. This is useful if you are working with a large amount of data where the tarfile does not fit on a single tape. |
| -L, --tape-length=NUM | Specifies the length of the tape to which the backup is being written. It indicates that the tape should be changed after writing NUM x 1024 bytes. |
| -F, --info-script=FILE, --new-volume-script=FILE | Executes a script file at the end of each tape. Used when performing multitape backups. |
| --volno-file=FILE | Specifies the file in which the tar utility records the current volume number. It is used in multivolume backups. |

The taskperformed parameter has several options for working with the media device. These options are listed in Table 2-5-9:

Table 2-5-9: Options Used when Working with Media Devices

| Option | Explanation |
|---|---|
| -b, --blocking-factor=BLOCKS | Specifies the record size as BLOCKS blocks of 512 bytes each. |
| --record-size=SIZE | Specifies the size of one record as SIZE bytes, which must be a multiple of 512. |
| -i, --ignore-zeros | Ignores zeroed blocks in the archive. Zeroed blocks normally indicate End of File (EOF). |
| -B, --read-full-records | Specifies that the tar command should reblock the input as it reads, so that incomplete records, such as those read from a pipe, are accepted. |

Several options of the taskperformed parameter specify the format for the archive. These options are listed in Table 2-5-10:

Table 2-5-10: Options Used to Select the Format of the Archive

| Option | Explanation |
|---|---|
| -V, --label=NAME | Allows you to associate the archive with a volume name. When listing or extracting, NAME is treated as a pattern to match against the volume name |
| -o, --old-archive, --portability | Creates an archive in a format that is compatible with that of the tar command of Unix Version 7 |
| -j, --bzip2 | Filters the archive through bzip2 to create or read an archive in the bzip2 compression format |
| -z, --gzip, --ungzip | Filters the archive through gzip to create or read an archive in the gzip compression format |
| -Z, --compress, --uncompress | Filters the archive through the compress program |
| --use-compress-program=PROG | Filters the archive through the external compression program PROG. PROG must accept the -d option for decompression |
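
The compression options combine archiving and compression in one step. The following is a minimal sketch, assuming that tar was built with gzip support and that gzip is installed; the logs directory and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: create a gzip-compressed archive in one step with -z, check it,
# and list its contents.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir logs
echo "service started" > logs/app.log

tar -c -z -f logs.tar.gz logs        # archive and compress together
gzip -t logs.tar.gz                  # exits with nonzero status if the data is damaged
listing=$(tar -t -z -f logs.tar.gz)  # -z is also needed when reading
echo "$listing"
```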

Several options of the taskperformed parameter are used when working with local files. These options are listed in Table 2-5-11:

Table 2-5-11: Options Used to Select Local Files

| Option | Explanation |
|---|---|
| -C, --directory=DIR | Changes the current directory to the specified directory. |
| -T, --files-from=NAME | Reads the names of the files to archive or extract from the file specified by NAME. |
| --null | Specifies that the file names read with the -T option are terminated by the null character, \0, and disables the -C option. |
| --ignore-case | Overrides case sensitivity of file names when excluding patterns. |
| --no-ignore-case | Maintains the case sensitivity of file names when excluding patterns. |
| --wildcards | Uses wildcards in the exclude patterns. This is the default action. |
| --no-wildcards | Uses plain strings in the exclude patterns. |
| --wildcards-match-slash | Specifies that wildcards in the exclude patterns match the slash (/) symbol. This is the default action. |
| --no-wildcards-match-slash | Specifies that wildcards in the exclude patterns do not match the slash (/) symbol. |
| -P, --absolute-names | Disables the default removal of the initial slash (/) symbol from the names of the memberfiles. |
| -h, --dereference | Archives the file that a symbolic link points to instead of the link itself. A symbolic link is a softlink similar to a shortcut in Windows Operating System. It points to the file and is not the actual file. |
| -l, --one-file-system | Stays within the local file system when creating the archive. |
| -N, --newer=DATE, --after-date=DATE | Adds only the files that have changed since the date specified by DATE. |
| --newer-mtime=DATE | Compares only the date and time when the data of the files was last modified against DATE. |
| --backup[=CONTROL] | Backs up the original files before removing them. CONTROL selects the version control method. |
| --suffix=SUFFIX | Backs up the original files before removing them and uses SUFFIX instead of the usual backup suffix. |
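
The -C and -T options are often used together to archive a chosen set of files from another directory. The following sketch assumes scratch data; the directory names src and dest and the list file filelist.txt are hypothetical.

```shell
#!/bin/sh
# Sketch: archive only the files named in a list, taken from another
# directory, and extract them into a separate directory.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir src dest
echo a > src/a.txt
echo b > src/b.txt
echo c > src/c.txt
printf 'a.txt\nc.txt\n' > filelist.txt        # one file name per line

# -C src makes the names in the list relative to src; the list file itself
# is given by its full path so that it does not depend on -C
tar -c -f selected.tar -C src -T "$workdir/filelist.txt"
tar -x -f selected.tar -C dest
ls dest
```

Only a.txt and c.txt appear in dest, because b.txt was not named in the list.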

The taskperformed parameter also provides several options that are used to provide information about the tarfile. These options are listed in Table 2-5-12:

Table 2-5-12: Operation Options Used to Provide Information

| Option | Explanation |
|---|---|
| --help | Prints the associated help text and exits the tar command |
| --version | Prints the version number of the tar program being used and exits the tar command |
| -v, --verbose | Shows the progress of the tar command by listing the files that have been backed up |
| --checkpoint | Prints the names of the directories while reading the archive |
| --totals | Prints the total number of bytes written to the archive |
| -R, --block-number | Shows the block number within the archive with each message |
| -w, --interactive, --confirmation | Prompts for a confirmation for every action |


Note

All the options used with any command are case-sensitive.

Using the tar Command

To use the tar command, you need to be familiar with the syntax of the command and its command-line options. You also need to know how to access the various backup devices.

Performing backups with the tar command is relatively easy. For example, a directory, testdir, contains three text files. You can use the command ls -l, as shown in Figure 2-5-1, to list all the files:

Figure 2-5-1: Files in the testdir Directory

To archive these files, the tar command is:

tar -c -v -f testarchive.tar file1.txt file2.txt file3.txt

Figure 2-5-2 shows the output of this command:

Figure 2-5-2: Creating the Tarfile

You can use the -t option to see the contents of the tarfile. The command to view the contents of the tarfile is:

tar -t -f testarchive.tar

Figure 2-5-3 shows the output of this command:

Figure 2-5-3: Contents of testarchive.tar File

To view a list of the files contained in the testdir directory, you can use the command ls -l as shown in Figure 2-5-4:

Figure 2-5-4: List of Files in the testdir Directory After Creating the Archive

You can use the tar command with the -x option to restore data from the backup in a specific backup device. The command used to restore files from the testarchive.tar archive is:

tar -x -v -f testarchive.tar

Figure 2-5-5 shows the output of this command:

Figure 2-5-5: Files Restored From testarchive.tar File

Note

When performing backups, the data being written to the tape should not be compressed unless the data is not important or is backed up frequently. This is because compressed data can be restored only if the data is correctly written to the tape at the time of making the backup and the tape is not corrupt. Otherwise, you will lose the entire backup. Conversely, backups that are made without compression can be recovered even if portions of the backup are corrupt.

Adding Files to an Existing Archive and Concatenating Archives

The -r option is used to add new files to an existing archive. For example, to add the file named file4.txt to an existing archive named testarchive.tar, the command is:

tar -r -f testarchive.tar file4.txt

Concatenating tarfiles enables you to add all the contents of one tarfile at the end of another tarfile. This is done using the -A option with the tar command. For example, to add the contents of the testarchive2.tar tarfile to the testarchive.tar tarfile, the command is:

tar -A -f testarchive.tar testarchive2.tar
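
Appending and concatenating can be checked by listing the archive afterward. The following is a sketch, assuming GNU tar; the file file5.txt and the member_count variable are introduced here for illustration.

```shell
#!/bin/sh
# Sketch: append a file with -r, concatenate a second tarfile with -A,
# and count the memberfiles in the result.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo 1 > file1.txt
echo 4 > file4.txt
echo 5 > file5.txt

tar -c -f testarchive.tar file1.txt
tar -r -f testarchive.tar file4.txt           # add one more file
tar -c -f testarchive2.tar file5.txt
tar -A -f testarchive.tar testarchive2.tar    # add the contents of the second tarfile

tar -t -f testarchive.tar
member_count=$(tar -t -f testarchive.tar | grep -c .)
```

Both operations work only on uncompressed tarfiles, so they are applied before any compression step.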

Incremental Backups Using tar

When using the tar command to perform incremental backups, you first need to identify the files modified since the last backup. To do this, use the find command. The find command can also be used to find files that are more recent than the files specified.

You can use the -u option along with the tar command to create incremental backups. The disadvantage of -u is that the previous versions of the files are not deleted from the archive, resulting in an increase in the size of the tarfile.

You can also use the -r option of the date command to create incremental backups. For example,

tar -c -f Level1.tar --newer="`date -r Level0.tar`" /home

In this command, Level0.tar is an existing full backup archive of the /home directory. The -r or --reference option of the date command returns the date and time when Level0.tar was last modified. The backquotes substitute this return value into the command line, where the --newer option uses it as a reference date and archives only the files that were modified after the full backup was made.
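
The reference-date technique can be tried on scratch data. The following is a sketch, assuming GNU tar and GNU date; the home_demo directory stands in for /home, and an explicit date format is passed to date so that the result does not depend on the locale.

```shell
#!/bin/sh
# Sketch: Level 1 backup of the files modified after the Level 0 archive
# was written, using the archive's own timestamp as the reference date.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir home_demo
echo "old" > home_demo/old.txt
sleep 1                                       # keep timestamps clearly apart
tar -c -f Level0.tar home_demo                # full backup
sleep 2
echo "new" > home_demo/new.txt                # change made after the full backup

# date -r prints the last modification time of Level0.tar
reference=$(date -r Level0.tar '+%Y-%m-%d %H:%M:%S')
tar -c -f Level1.tar --newer="$reference" home_demo 2> /dev/null

level1_members=$(tar -t -f Level1.tar)
echo "$level1_members"
```

The Level 1 archive lists new.txt but not old.txt. The directory entry itself is still recorded so that the files can be restored to the right place.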

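The -g option of the tar command can also be used to create incremental backups. The option takes the name of a snapshot file, in which the tar command records the state of the backed up directories each time it runs. For example,

tar -c -f Level1.tar -g home.snar /home

In this command, home.snar is an example name for the snapshot file that was written when Level0.tar, the full backup of the /home directory, was created with the same -g option. The tar command compares /home with the snapshot and writes only the files changed since then to the incremental backup Level1.tar.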
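
A complete Level 0 and Level 1 cycle with the -g option can be sketched as follows, assuming GNU tar. The -g option expects a snapshot file and not an archive; the snapshot name home.snar and the home_demo directory are hypothetical names chosen for this sketch.

```shell
#!/bin/sh
# Sketch: full backup followed by an incremental backup that shares one
# snapshot file through the -g option.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir home_demo
echo "old" > home_demo/old.txt
sleep 1                                        # keep timestamps clearly apart

# Level 0: home.snar does not exist yet, so everything is archived and
# the snapshot file is created
tar -c -f Level0.tar -g home.snar home_demo
sleep 1
echo "new" > home_demo/new.txt

# Level 1: only what changed since the snapshot was written is archived,
# and the snapshot is updated
tar -c -f Level1.tar -g home.snar home_demo

level1_members=$(tar -t -f Level1.tar)
echo "$level1_members"
```

Because each run updates the snapshot file, keeping a copy of the snapshot taken after the full backup makes it possible to repeat a Level 1 backup later.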

Copy File Archives In and Out (cpio)

Cpio is a CLI tool used to perform backup and restore operations. The GNU version of the cpio tool is compatible with the tar command and is used for creating and extracting archives. It can also be used for copying files from one location to another. Use the cpio tool to copy files in or out of a cpio or tar archive. It can store the archive on most types of backup media in a number of cpio formats and is capable of storing or retrieving large amounts of data.

The advantages of the cpio command over the tar command are:

  • Archives the data efficiently resulting in an optimum usage of storage space.

  • Effectively manages backups that span across several tapes.

  • Omits the corrupted sections of the tape and continues with the backup operation.

  • Facilitates backups of remote systems.

The cpio archive stores files and information, such as the owner of the file, and access permissions for files.

The tool has three modes of operation: copy-out, copy-in, and copy-pass. The copy-out mode copies the files into an archive. This archive can be a file on a hard disk or can span multiple tapes or floppy disks. In copy-out mode, the cpio command reads the list of file names to archive from the standard input and writes the archive to the standard output, which can be redirected to a file or a device. An efficient method to generate the list of files to archive is to use the find or ls commands. The option used in the copy-out mode of the cpio command is -o. The copy-in mode extracts files from an archive. The copy-pass mode copies the files from one directory to another.

Using cpio in Copy-Out Mode

The syntax for the cpio command in copy-out mode is:

cpio {-o|--create} [-0acvABLV] [-C bytes] [-H format] [-M message]
[-O [[user@]host:]archive] [-F [[user@]host:]archive]
[--file=[[user@]host:]archive] [--format=format]
[--message=message] [--null]
[--reset-access-time] [--verbose] [--dot]
[--append] [--block-size=blocks] [--dereference]
[--io-size=bytes] [--rsh-command=command] [--help]
[--version] < name-list [> archive]

The parameters of the syntax are listed in Table 2-5-13:

Table 2-5-13: Parameters of the Syntax
Open table as spreadsheet

Parameter

Description

-o, --create

Creates the archive.

-0, --null

Specifies that the null character, \0, contained in some file names will be ignored and the command will continue processing. This option is used in copy-out and copy-pass modes.

-a, --reset-access-time

Resets the time of access, to the time when the file was previously accessed, after reading the file.

-c

Uses the old portable archive format, which is the American Standards For Character Information Interchange (ASCII) format.

-v, --verbose

Lists the files that have been archived.

-A, --append

Appends the archive to an existing archive.

-B, --block-size=blocks

Sets the input/output (I/O) block size to 5120 bytes. By default, the block size is 512 bytes.

-L, --dereference

Copies the symbolic link.

-V –dot

Prints a period (.) for every processed file.

-C IO-SIZE, --io-size=IO-SIZE

Sets the I/O block size to IO-SIZE bytes.

-H FORMAT, --format=FORMAT

Uses the archive format specified by FORMAT. The cpio command in copy-out mode automatically detects the archive format.

-M MESSAGE, --message=MESSAGE

Prints a message specified by MESSAGE when the end of a backup volume is reached. This message prompts you to insert a new volume. If MESSAGE contains the string %d, the string is replaced by the current volume number starting with 1.

-O archive

Specifies the name for the archive to be used instead of the standard output.

-F, --file=archive

Specifies the name for the archive to be used instead of the standard input or output.

--rsh-command=COMMAND

Uses the COMMAND command to communicate with remote devices.

--help

Prints a summary of the command usage and then exits.

--version

Prints the cpio program version number and then exits the cpio command.

< name-list

Reads the list of files to be included in the archive from the standard input.

[> archive]

Redirects the archive from the standard output to the specified file.


Important

To archive the files and subdirectories within a particular directory, set that directory as the current working directory.

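You can use the cpio command with the -o option to create the archive and redirect it to another file. The command reads the names of the files to be archived from the standard input, so the list is usually supplied through a pipe or input redirection. For example, to create an archive using cpio and redirect the archive to the testarchive.cpio file, the command is:

cpio -o -v > testarchive.cpio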

Figure 2-5-6 shows the output of this command:

You can view the new archive using the ls –l command, as shown in Figure 2-5-7:

Figure 2-5-7: Files in the testdir Directory after Creating the Archive

You can obtain a list of archived files using the command:

cpio -t -v -F testarchive.cpio

Using cpio in Copy-In Mode

The copy-in mode of the cpio command restores files from an archive. This process is also known as extracting files from an archive. Files are extracted relative to the current directory. Leading directories are not created by default, so a file stored under a subdirectory can be extracted only if that subdirectory already exists. To create the required directories during extraction, use the -d option.

By default, the cpio command does not replace an existing file that is as new as, or newer than, the copy in the archive. To overwrite the existing files in the directory unconditionally, use the -u option.

The cpio command reads the archive from the standard input. The option used is -i.

The syntax for the cpio command in copy-in mode is:

cpio {-i|--extract} [-cdfmnrtuvV] [-C bytes]
[-E file] [-H format] [-M message]
[-R [user][:.][group]] [-I [[user@]host:]archive]
[-F [[user@]host:]archive]
[--file=[[user@]host:]archive] [--make-directories] [--nonmatching] [--preserve-modification-time]
[--numeric-uid-gid] [--rename] [--list] [--dot]
[--unconditional] [--verbose] [--block-size=blocks] [--swap-halfwords] [--io-size=bytes]
[--pattern-file=file] [--format=format]
[--owner=[user][:.][group]] [--no-preserve-owner]
[--message=message] [--help] [--version]
[--no-absolute-filenames] [--sparse]
[--only-verify-crc] [--quiet] [--rsh-command=command] [pattern...] [< archive]

The parameters of the syntax are listed in Table 2-5-14:

Table 2-5-14: Parameters of the Syntax

Parameter

Description

-i, --extract

Extracts the files from an archive.

-d, --make-directories

Creates directories.

-f, --nonmatching

Copies only those files that do not match any of the specified patterns.

-m, --preserve-modification-time

Retains the time the file was last modified when creating files.

-n, --numeric-uid-gid

Shows the numeric user ID (UID) and group ID (GID) instead of translating them into names when listing the archive contents with the --verbose option.

-r, --rename

Interactively renames files.

-t, --list

Prints a table of contents of the files and directories in the archive.

-u, --unconditional

Replaces all files without asking whether to replace existing newer files with older files from the archive.

-R [user][:.][group], --owner=[user][:.][group]

Sets the ownership of all files created to the specified user and/or group. Only the superuser can change the ownership of files.

-I [[user@]host:]archive

Specifies a name for the archive, which will be used instead of the standard input.

--no-preserve-owner

Specifies that the ownership of files will not be changed. This is the default option for all users except the root user. This option is used in copy-in and copy-pass mode only.

--sparse

Writes files with large blocks of zeros as sparse files. This option is used in copy-in and copy-pass modes.

--only-verify-crc

When reading an archive in the CRC format, verifies the Cyclic Redundancy Check (CRC) checksum of each file to ensure that the data in the archive has no errors. It only verifies the archive without extracting files from the archive.

--quiet

Specifies that the number of blocks copied should not be printed.

< archive

Reads the archive from the standard input, which is redirected from the specified file.

The remaining parameters are similar to those listed in Table 2-5-13.

You can use the cpio command with the -i option to restore an archive. For example, the file testarchive.cpio is the archive. You can restore this archive by redirecting it to the standard input of the command:

cpio -i -v < testarchive.cpio

Figure 2-5-8 shows the output of this command:

Figure 2-5-8: Restoring the Files from testarchive.cpio

Using cpio in Copy-Pass Mode

The copy-pass mode of the cpio command combines the copy-out and copy-in modes without creating an archive. In copy-pass mode, the cpio command reads the list of files to be copied from the standard input. The directory to which the cpio command copies the files is passed as an argument to the command. The option used in the copy-pass mode is -p.

The syntax for the cpio command in copy-pass mode is:

cpio {-p|--pass-through} [-0adlmuvLV]
[-R [user][:.][group]] [--null]
[--reset-access-time] [--make-directories] [--link]
[--preserve-modification-time] [--unconditional]
[--verbose] [--dot] [--dereference]
[--owner=[user][:.][group]] [--sparse]
[--no-preserve-owner] [--help] [--version] destination-directory < name-list

The parameters of the syntax are listed in Table 2-5-15:

Table 2-5-15: Parameters of the Syntax

Parameter

Description

-p, --pass-through

Copies the files from one directory to another.

-l, --link

Links files instead of copying them, when possible.

destination-directory

Specifies the directory into which the files will be copied.

The remaining parameters are similar to those of the copy-out and copy-in modes as listed in Table 2-5-13 and Table 2-5-14, respectively.

The copy-pass mode is an efficient method of copying a large set of files to a directory. For example, your current working directory is /home/user. It contains the files file1.txt, file2.txt, and file3.txt. You want to copy these files into the existing directory /home/user/textfiles. Because cpio reads the file names from the standard input and not from the command line, the command to perform this action is:

ls file1.txt file2.txt file3.txt | cpio -p /home/user/textfiles

The copy-pass mode is most useful when combined with other commands. For example:

find / -name "*.bin" | cpio -pd /tmp/bin

In this statement, the find and cpio commands are combined to copy all files with the extension .bin into the /tmp/bin directory. The find command searches the path it is given. Here the path is /, so it searches the entire system and prints the name of every file with the extension .bin. The pipe (|) symbol passes the output of the find command to the cpio command as input. The -d option creates the subdirectories that the copied files need under /tmp/bin.

Dump and Restore

Dump is a commonly used CLI tool for performing backups on Linux systems. It is included with the Red Hat Linux distribution and can also be downloaded separately.

The dump package contains several commands to back up and restore files. Commands available in the dump package are listed in Table 2-5-16:

Table 2-5-16: Commands of the Dump Package

Command

Description

dump

Creates backups of entire disk or individual directories.

restore

Restores the entire archive or individual files from the archive to the hard disk.

rmt

Provides access to tape devices on remote systems over the network. This command is used with either dump or restore. It is never used separately.

Backing up Using Dump

Dump is compatible with numerous backup media and supports backups that span multiple tapes. This tool is useful for performing incremental backups because it can handle backups from levels 0 to 9.

Unlike many other backup tools, dump adds a checkpoint at the beginning of each volume when creating the backup. If the backup operation fails, the checkpoints enable dump to restart the backup from the most recent checkpoint instead of from the beginning. The dump utility regularly provides information about its activities during the backup. This information includes estimates of the number of blocks and tapes required for the backup and the time left for the backup to complete.


Note

The Linux dump utility works only with filesystems of the ext2 family, that is, ext2 and its successors ext3 and ext4. It cannot back up other filesystem types.

To create a backup using the dump command, you need to pass certain parameters to the command. These parameters specify the dump level for incremental backups, the size of the backup media, and other information pertaining to the backup.

The syntax for the dump command is:

dump operation arguments filesystem

The operation parameter specifies an operation. The arguments parameter specifies a list of any arguments needed by options. The filesystem parameter refers to the set of files or directories that need to be backed up.

The options of the operation parameter are listed in Table 2-5-17:

Table 2-5-17: Options of the Dump Command

Option

Description

0 – 9

Specifies the dump level for an incremental backup. The default level is 9.

a autosize

Bypasses all tape length calculations and continues to write the backup until the media reports that its end has been reached. This is the default option, works best with most modern tape devices, and is particularly useful when appending backups to an existing tape.

A archive file

Creates a table of contents that lists the files included in the backup. This is useful when restoring files from the backup.

b blocksize

Specifies the number of kilobytes per record. The default blocksize is 10 KB.

B records

Specifies the number of records per volume or the amount of data that can fit on a single tape. This option takes a numeric argument.

d density

Specifies the density of the tape. By default, this value is set to 1600 bits per inch. This option takes a numeric argument.

f file

Specifies the name of the file or device that will store the backup. You can also specify a remote file or device. This option takes a single alphanumeric argument.

F script

Executes a script file at the end of each tape.

I nr-errors

Specifies the number of read errors that dump ignores. By default, dump ignores the first 32 read errors on the file system before prompting for human intervention.

j compression level

Compresses each block that is backed up using the bzlib library. The default compression level is 2.

L label

Labels the backup and stores the label in the header of the backup. This label is a string that you define, with a maximum length of 16 characters including the terminating null character. This label is used by the restore tool when restoring files.

M

Indicates that the dump command should operate on a multivolume backup.

q

Aborts the dump command without prompting when human intervention is required.

Q file

Enables the Quick File Access support feature. This creates a file that stores the position of each file within the backup. The option is useful when restoring individual files from the backup.

s feet

Specifies the length of the tape in feet.

S

Estimates the amount of space, in bytes, required for the backup without actually performing it. This estimate is useful when performing incremental backups because it enables you to determine the number of tapes required.

u

Updates the /etc/dumpdates file after a successful backup. This file records the date and level of each backup, and dump uses it as the reference when creating incremental backups.

T date

Specifies a date and time, which is used as a reference for creating incremental backups. Files that are modified or added after the specified date and time are backed up.

W

Shows, for each file system listed in the /etc/dumpdates and /etc/fstab files, the date and level of the most recent backup, and highlights the file systems that need to be backed up.

w

Similar to W, but lists only the file systems that need to be backed up.

z compression level

Compresses each block that is backed up using the zlib library.
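To show how the level, u, and f options fit together, the following dry-run sketch prints one dump command for each day of a week. It does not run dump. The tape device /dev/st0, the /home file system, and the weekly schedule itself are assumptions made for illustration. The schedule relies on the fact that a backup at a given level includes the files changed since the most recent backup at a lower level.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) a week of dump commands.
TAPE=/dev/st0      # hypothetical tape device
FS=/home           # hypothetical file system to back up

dump_command() {
    # $1 = dump level (0 = full backup, 1-9 = incremental)
    # -u records the run in /etc/dumpdates so later levels know what changed
    # -f names the file or device that receives the backup
    echo "dump -$1 -u -f $TAPE $FS"
}

# Full backup on Sunday, then an increasing level each day so that every
# run picks up only the files changed since the previous day's backup.
plan=$(
    for entry in Sun:0 Mon:1 Tue:2 Wed:3 Thu:4 Fri:5 Sat:6; do
        day=${entry%%:*}
        level=${entry##*:}
        echo "$day: $(dump_command "$level")"
    done
)
echo "$plan"
```

Printing the commands first makes it possible to review a schedule before committing tapes to it. The real commands must be run as the superuser against an actual file system.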

Restoring Files Using Restore

The restore command is used to restore files that have been backed up using the dump command. The restore command works in several modes. These are:

  • Comparison: In this mode, the restore command only compares the original files on the system with the files within the backup created by the dump tool.

  • Interactive: In this mode, you can interactively restore files from the dump using a shell-like interface provided by restore. The interface enables the end user to navigate through a directory and select the files to be extracted from the backup.

  • Quick File Access: In this mode, a Quick File Access file is created from an existing dump file without restoring its contents. This file enables you to locate a particular file without reading the entire backup.

  • Restart Full: In this mode, the restore command restarts a full restore operation on a particular tape of a multivolume set. The restart full restore mode is useful if the restore operation has been interrupted.

  • List: In this mode, the restore command lists the names of the specified files if they are contained in the backup. If no files are specified, the restore command lists the entire contents of the backup.

Using the Restore Command in the Comparison Mode

In the comparison mode, the syntax for the restore command is:

restore -C [-cklMvVy] [-b blocksize] [-D filesystem] [-f file] [-F script] [-L limit] [-s fileno]

[-T directory]

The parameters of the syntax are listed in Table 2-5-18:

Table 2-5-18: Parameters of the Syntax

Option

Description

-c

Disables the dynamic checking of the backup and reads a backup made using the old filesystem format. By default, the restore command checks the format of the files that were backed up.

-k

Uses Kerberos authentication to communicate with the remote tape server.

-l

Assumes, when performing a remote restore, that the remote file is a regular file instead of a tape device. You need to specify this option when restoring a remote compressed file.

-M

Enables the multivolume feature for restoring backups made using the -M option of dump.

-v

Lists the name of the file and its type after restoring each file.

-V

Enables reading multivolume media other than tapes, such as CD-ROMs and floppy disks.

-y

Continues processing the restore command even if errors are encountered.

-b blocksize

Specifies the number of kilobytes per record. If not specified, the restore command automatically determines the blocksize.

-D filesystem

Specifies the name of the file system to be used when checking the file system.

-f file

Reads the backup from a file, such as a tape drive or a disk drive.

-F script

Executes a script at the beginning of each tape. This script is used to determine whether a new tape needs to be inserted into the device for the restoration to continue.

-L limit

Specifies the number of mismatches that can occur when comparing files. When this number is crossed, the restore procedure is aborted and an error message is shown. The default value is 0, which disables the check.

-s fileno

Reads from the specified file number on a tape that contains multiple files.

-T directory

Specifies a directory for temporary files. The default directory is /tmp.

Using the Restore Command in the Interactive Mode

In the interactive mode, the syntax for the restore command is:

restore -i [-achklmMNuvVy] [-A file] [-b blocksize] [-f file] [-F script] [-Q file] [-s fileno]
[-T directory]

The parameters of the syntax are listed in Table 2-5-19:

Table 2-5-19: Parameters of the Syntax

Parameter

Description

-a

Reads all the volumes of the backup starting with the first volume.

-A file

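Reads the table of contents from the specified archive file instead of the backup media.

-h

Extracts the directory itself instead of the files that it references. This prevents the hierarchical restoration of complete subtrees from the backup.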

-m

Extracts the files by inode numbers instead of by file name. Inode numbers are system-assigned identifiers that uniquely identify a file.

-N

Performs a full restore operation without writing any file on the disk.

-u

Removes existing files when new files are created.

-Q file

Uses the Quick File Access file to read the tape. This can be used when restoring from local or remote tapes.

The interactive mode of the restore command also supports interactive commands. These commands are listed in Table 2-5-20:

Table 2-5-20: Interactive Commands

Command

Description

add

Adds a file or a directory to a list of files to be extracted.

cd

Changes the current directory being viewed within the backup.

delete

Deletes a file or a directory from the list of files to be extracted.

extract

Extracts marked files and directories from the backup to the system.

help

Lists a summary of the available commands.

ls

Lists the contents of the working directory.

pwd

Prints the complete path of the current working directory of the backup.

quit

Exits the interactive mode of the restore command.

setmodes

Sets the mode, time of the previous modification, and owner for the directories that have been added to the list of files to be extracted. It does not extract the files.

verbose

Shows information pertaining to each file that is extracted from the backup during the extraction.

Using the Restore Command in the Quick File Access, Restart Full, and List Modes

In the Quick File Access mode, the syntax for the restore command is:

restore -P file [-achklmMNuvVy] [-A file]
[-b blocksize] [-f file] [-F script] [-s fileno]
[-T directory] [-X filelist] [file ...]

In this syntax, the [-X filelist] parameter specifies a text file, filelist, that contains the names of the files to be listed or extracted from the backup. The remaining arguments are as listed in Table 2-5-18 and Table 2-5-19.

In the restart full mode, the syntax for the restore command is:

restore -R [-cklMNuvVy] [-b blocksize] [-f file]
[-F script] [-s fileno] [-T directory]

The parameters of the syntax are listed in Table 2-5-18 and Table 2-5-19.

In the list mode, the syntax for the restore command is:

restore -t [-chklMNuvVy] [-A file] [-b blocksize]
[-f file] [-F script] [-Q file] [-s fileno]
[-T directory] [-X filelist] [file ...]

The parameters of the syntax are listed in Table 2-5-18 and Table 2-5-19.

System Logs

Logs are files that record the activities taking place in the system. Logs that record the activities of programs and applications running in the system are called application logs. Logs that are created and maintained by Linux, and that record the activities of the system services, are called system logs. For example, the Apache Web server creates logs to track the activities of end users using its services, such as Web site hosting. Linux creates a log to track the activities of the Web server as well as the end users.

System logs are automatically created to maintain an updated record of the activities of various programs and daemons. They provide a near real-time indication of how the system is working and, by default, can be modified only by the superuser. Scanning through log files helps quickly identify the source of problems. As a system administrator, you should view the log files as a first step when troubleshooting.

You can customize the size and location of logs on the system. By default, log files are stored in the /var/log directory. The actions or activities of the services are sent to the log files as messages. Instead of storing them in a single, large log file, Linux organizes the messages into smaller files based on the service that sent the message.

The main system logs on a typical Linux installation are:

  • /var/log/boot.log: Messages from the bootup sequence are listed according to date and time. The most recent bootup messages are listed near the end of the file.

  • /var/log/cron: Provides information about tasks that were started and the corresponding time. The cron command is used to schedule tasks.

  • /var/log/dmesg: Messages from or about devices. This is useful for identifying and solving hardware problems.

  • /var/log/maillog: Records the activity of the mail server, such as the e-mail messages it sends and receives.

  • /var/log/messages: Records general system messages, including error messages.

  • /var/log/secure: Monitors logons. You can track the activities of the end users to detect any unauthorized access to the system.

Viewing Log Files Using Commands

Log files are text files that can be opened and viewed using various commands. To view system log files, you must log on as superuser. The /var/log directory contains the system log files. You can use editors, such as Emacs or vi, to view the log files. To scroll through a particular log file, use the less command. For example, to scroll through the messages file, the command is:

less /var/log/messages

The less command enables you to scroll through the log file using the up and down arrow keys and the Page Up and Page Down keys.

To view the most recent entries made in a log file, you need to scroll to the end of the file. At times, the file may be extremely lengthy. You can use the tail command to view the last ten lines of the log file, as:

tail /var/log/messages

To view the first ten lines of a file, use the head command as:

head /var/log/messages

To view a different number of lines, use the -n option with the head or tail command as:

tail -n 25 /var/log/messages


Note

The head and tail commands do not support scrolling up or down the log file. They are used only for a preview of the log files.
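The behavior of head and tail can be checked without superuser access by running them on a sample file. In this sketch a temporary file with 40 numbered entries stands in for a real log. The file and its contents are made up for the example.

```shell
#!/bin/sh
# Sketch: preview the start and end of a sample log file.
set -e

log=$(mktemp)                    # stands in for a real log file
i=1
while [ "$i" -le 40 ]; do
    echo "entry $i" >> "$log"
    i=$((i + 1))
done

first=$(head -n 3 "$log")        # first three lines
last=$(tail -n 5 "$log")         # last five lines
default_count=$(tail "$log" | grep -c 'entry')   # lines shown with no -n

echo "$first"
echo "$last"
echo "tail without -n printed $default_count lines"
```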

Viewing Log Files Using the System Log Viewer

You can view the system logs using the GUI tool, System Log Viewer, which is available with the Mandrake Linux distribution. To start the System Log Viewer application:

  1. In the K Desktop Environment (KDE), click the Start Application button, K->Applications->Monitoring->System Log Viewer. An Input dialog box appears, as shown in Figure 2-5-9:

    Figure 2-5-9: The Input Dialog Box

  2. Specify the root password in the Password for root text box and click OK. The System Log Viewer window appears, as shown in Figure 2-5-10:

    Figure 2-5-10: The System Log Viewer Window

  3. To view a different system log file, click the File->Open Log option. This opens the Open new logfile dialog box, as shown in Figure 2-5-11:

    Figure 2-5-11: The Open New Logfile Dialog Box

  4. Select the log file to be opened and click OK. This opens the selected log file.

To monitor a particular log file:

  1. Click the File->Monitor option in the System Log Viewer window shown in Figure 2-5-10.

  2. Click Monitor. This opens the Monitor options window, as shown in Figure 2-5-12:

    Figure 2-5-12: The Monitor Options Window

  3. Select the log file to be monitored and click OK. This opens the Monitoring logs dialog box, as shown in Figure 2-5-13:

    Figure 2-5-13: The Monitoring logs Dialog Box

To view log statistics, click the View->Log Stats option in the System Log Viewer window shown in Figure 2-5-10. This opens the Log stats dialog box, as shown in Figure 2-5-14:

Figure 2-5-14: The Log Stats Dialog Box

Maintaining logs involves regularly deleting old records from the log files so that they do not grow too large. You can open the log file in the vi editor and scroll to the top of the log file. To delete lines starting at the cursor position, use the command:

Esc number_of_lines dd

The number_of_lines parameter specifies the number of lines to be deleted. You can undo any changes you make to the log file using the Esc u command.
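The same trimming can also be done without opening an editor. The sketch below is an assumed alternative to the vi procedure and not part of it. It uses sed to drop the oldest lines from a sample file, and the file name and the number of lines are hypothetical.

```shell
#!/bin/sh
# Sketch: delete the first N lines of a log file non-interactively.
set -e

log=$(mktemp)                            # stands in for a real log file
printf 'line %s\n' 1 2 3 4 5 6 7 8 > "$log"

lines_to_delete=3                        # plays the role of number_of_lines

# sed prints every line except lines 1..N. The result goes to a temporary
# file and is then copied back over the original, so the log file itself
# is rewritten in place rather than replaced by a new file.
tmp=$(mktemp)
sed "1,${lines_to_delete}d" "$log" > "$tmp"
cat "$tmp" > "$log"
rm -f "$tmp"

remaining=$(wc -l < "$log")
first_line=$(head -n 1 "$log")
echo "remaining lines: $remaining, first line is now: $first_line"
```

On a live system, a daemon may still be writing to the log while it is trimmed, so this approach suits occasional manual cleanup.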