Loss of data is a common problem while maintaining a system. The cause can be hardware or software failure, or end user errors. You can prevent data loss by maintaining system backups. A system log records all activities of the system. While the system is running, backup, application, and daemon processes are recorded in files known as system logs. A daemon is a process that runs in the background and performs a specified operation at predefined intervals or in response to certain events. Linux automatically creates logs, and you can explicitly instruct Linux where and how to maintain them.
This ReferencePoint discusses the need for backups, the types of media used to store backups, how to plan a backup strategy, and perform backup and restore operations. It also discusses the importance of system logs and how they are maintained in Linux.
Backup and Restore
An ideal backup is reliable, easy to use, and can be performed in a short time. You need to be familiar with the advantages and disadvantages of each type of backup to plan a backup strategy. For example, a backup stored at a location distant from the office is useful for emergency recovery in the event of a fire or theft. However, because the backup is stored off-site, you might not be able to restore data immediately.
You need to select a backup that best suits the requirements of your system. The features of a good backup are reliability, speed, and availability. Reliability means that the data is backed up without any errors on the storage media. You also need to maintain the media in a safe place to avoid tampering. The speed of a backup depends on the processor speed and the amount of time you can spare the system for the backup operation. The backed up data should be readily available. To ensure this, maintain one backup set on-site for daily use and another off-site for emergency and disaster restores.
Types of Backups
Linux supports three types of backups:
-
Full backup: Backup of the entire system. This is also known as Level 0 backup. Full backups are efficient for small systems. If the system contains a large amount of data, a full backup is not preferred because it is time-consuming and occupies a large amount of storage space.
-
Partial backup: Backup of a specified set of files or directories. You can select the important files or directories to be included in the backup.
-
Incremental backup: Backup of files that have been modified since the last full backup. This is commonly known as Level 1 backup. The various levels indicate that each incremental backup builds on the previous incremental backup. This means that a Level 2 backup is a backup of the changes made to the data since the last Level 1 backup. You can determine the number of backup levels and specify a maximum of nine levels. Incremental backups are made frequently because they are smaller and easier to manage than full backups.
| Note | Files that need to be backed up should not be open or in use; otherwise, they are not included in the backup. |
Instead of making backups, you can archive or compress data. An archive is a single file that contains a set of files. Data compression involves passing the files through various algorithms to make them smaller. Archives are usually compressed to save system storage space. Archives and compressed files can be stored in different formats, as listed in Table 2-5-1:
Extension | Type | Create | Open |
---|---|---|---|
.gz | Compressed file | gzip | gunzip |
.tar | Archive | tar -c | tar -x |
.tar.gz | Compressed archive | gzip and tar | gunzip and tar |
.zip | Compressed archive | zip | unzip |
| Note | To learn more about compression, see the Understanding Data Compression ReferencePoint. |
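The formats in the table can be tried out with a short script. The following is a minimal sketch, assuming GNU tar and gzip are installed; the directory and file names (docs, report.txt) are placeholders chosen for illustration, and everything happens in a scratch directory.

```shell
# Scratch area; all names below are placeholders for illustration.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir "$work/docs" "$work/out"
echo "quarterly report" > "$work/docs/report.txt"

# .tar: archive the directory without compressing it.
tar -c -f "$work/docs.tar" -C "$work" docs

# .tar.gz: compress the archive; -c writes to stdout and keeps docs.tar.
gzip -c "$work/docs.tar" > "$work/docs.tar.gz"

# Open the compressed archive: gunzip first, then extract with tar.
gunzip -c "$work/docs.tar.gz" | tar -x -f - -C "$work/out"

ls -l "$work/docs.tar" "$work/docs.tar.gz" "$work/out/docs"
```

Keeping the compression step separate from the archiving step mirrors the "gzip and tar" entry in the table.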
You can also make a simple backup. A simple backup involves copying files from one location to another. It is useful for small systems with limited data. The command to make a simple backup is:
cp -a /myfiles /root/myfilesbackup
If /root/myfilesbackup does not already exist, this command creates it as a copy of the /myfiles directory. The -a option copies subdirectories recursively and preserves file attributes such as permissions, ownership, and timestamps.
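A simple backup is only useful if the copy matches the source, so it is worth checking. The sketch below runs the same cp -a technique on scratch directories and then compares the two trees with diff; the paths and file names are hypothetical stand-ins for /myfiles and its backup.

```shell
# Scratch directories stand in for /myfiles and /root/myfilesbackup.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/myfiles/sub"
echo "data" > "$work/myfiles/sub/notes.txt"
chmod 640 "$work/myfiles/sub/notes.txt"

# -a copies recursively and keeps permissions and timestamps.
cp -a "$work/myfiles" "$work/myfilesbackup"

# diff -r prints nothing and succeeds when both trees are identical.
if diff -r "$work/myfiles" "$work/myfilesbackup"; then
    echo "backup matches source"
fi
```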
Backup Devices
In addition to deciding the type of backup, you need to select the appropriate backup medium. A backup medium is the device used to store the backed up data. Several backup devices are compatible with Linux. The advantages and disadvantages of these devices are listed in Table 2-5-2:
Media | Advantages and Disadvantages | Reliability |
---|---|---|
SCSI tape | Advantages: Reliable for regular backups. Can be reused. Disadvantages: Capacity is smaller than that of the available hard drives. Cannot be formatted using Linux. | Good |
CD-Recordable (CD-R) / CD-Rewritable (CD-RW) | Advantages: Economical for medium-sized systems. Disadvantages: Capacity is 650 MB only. As a result, it is not a good choice for full backups. | Good |
Networked storage | Advantages: Effective solution for regular backups because data is stored on multiple systems on the network. As data is stored on multiple systems, the data is available even if one system crashes. Disadvantages: Data redundancy. | Good |
Superfloppy, Zip, Jaz | Advantages: Economical and suitable for small systems. Disadvantages: Capacity is less than that of the hard drives available. | Excellent |
Floppy Disk | Advantages: Economical and suitable for small systems. Disadvantages: Can only be reused two or three times. | Average |
Additional onboard Hard Disk (HD) storage | Advantages: Economical and reusable. Convenient to use because you do not need to attach any external device to the system. Disadvantages: Unreliable because it is limited to a single storage unit. | Good |
Removable Hard Disk | Advantages: Reliable and reusable. Disadvantages: Not economical. Capacity is less than that of the hard drives available. | Excellent |
Digital Audio Tape (DAT) | Advantages: Reliable and reusable. Capacity varies between 2 to 24 GB, which makes it a suitable device for backing up a large amount of data. Disadvantages: Not economical. | Excellent |
| Note | In Linux, each device attached to the system is treated as a file. Files representing the devices exist in the /dev directory. The system names each device depending on its type. For example, a floppy drive is named /dev/fd0, and a hard disk drive is named /dev/hda. |
Backup and Restore Strategy
To efficiently use the available backups, analyze your system and plan and schedule the backups according to requirements. This backup plan and schedule together make up the backup strategy. A backup strategy depends on the system and the rate of change in data. It should also identify the data to be backed up, the type of backup, and the amount of storage space available.
The files to be backed up include:
-
User files: Files that you or the end users have created
-
Operating system and software files: Files that are required by the system to function properly
The directories that should be backed up are:
-
/home: Stores the directories that you or the end users have created
-
/etc: Stores configuration files created by the system
-
/usr/local: Stores information about the programs installed by the end user on the local system
The directories that should not be backed up are:
-
/tmp: Stores temporary files, such as those used during installation, that are no longer needed by the system.
-
/proc: Stores data that is automatically generated by the kernel. For example, the file kcore is a dump of the physical memory. You do not need to back up this file because it is a dynamic file.
-
/var: Stores the spool and cache subdirectories. These subdirectories also store dynamic files that do not need to be backed up.
| Note | Files that change frequently should be backed up more often than other files. |
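The directory selection above can be sketched as a single tar command that names only the directories worth keeping. This is an illustration under stated assumptions: a scratch directory stands in for the root directory, and all file names in it are invented for the example.

```shell
root=$(mktemp -d)   # stands in for / in this sketch
dest=$(mktemp -d)   # where the archive is written
trap 'rm -rf "$root" "$dest"' EXIT

# Build a tiny imitation of the directory layout discussed above.
mkdir -p "$root/home/alice" "$root/etc" "$root/usr/local/bin" \
         "$root/tmp" "$root/var/cache"
echo "notes"     > "$root/home/alice/notes.txt"
echo "setting=1" > "$root/etc/app.conf"
echo "scratch"   > "$root/tmp/scratch.tmp"
echo "cached"    > "$root/var/cache/page.cache"

# Archive only home, etc, and usr/local; tmp and var are simply not named.
tar -c -f "$dest/system.tar" -C "$root" home etc usr/local

# List what was stored.
tar -t -f "$dest/system.tar"
```

Naming the directories to include, rather than excluding the others, keeps the command easy to read and avoids pattern-matching surprises.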
When scheduling the backup, you need to chart out a plan. One of the most common backup plans is shown in Table 2-5-3:
Day | Backup Level |
---|---|
Friday | Level 0 |
Monday | Level 1 |
Tuesday | Level 1 |
Wednesday | Level 1 |
Thursday | Level 1 |
The plan involves creating a full backup of the system preferably at the end of the week. Creating regular incremental backups helps track any modifications made to the data during the day. The incremental backups made every night from Monday to Thursday along with the full backup made on Friday night ensure an updated backup of the system by the end of the week.
An advantage of using this plan is that restoring the system is easier. You can use the Level 0 backup and the Level 1 backup of the previous evening to restore your system completely. In this plan, you need only two separate media for the backup, one for the Level 0 backup, and the other for the Level 1 backup. You may need additional media if the system undergoes drastic changes over the week.
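A backup script has to decide which level to run on a given night. The sketch below encodes the Friday full backup and Monday-to-Thursday Level 1 schedule as a small function; the function name backup_level_for_day and the "none" value for weekends are assumptions made for this example.

```shell
# Map an abbreviated English day name to the backup level of the plan:
# a full (Level 0) backup on Friday, Level 1 on Monday to Thursday.
backup_level_for_day() {
    case "$1" in
        Fri)             echo 0 ;;
        Mon|Tue|Wed|Thu) echo 1 ;;
        *)               echo none ;;   # no backup scheduled
    esac
}

# LC_ALL=C forces English day names regardless of the system locale.
today=$(LC_ALL=C date +%a)
echo "Today is $today: backup level $(backup_level_for_day "$today")"
```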
Another common backup plan used is shown in Table 2-5-4:
Day | Backup Level |
---|---|
Friday | Level 0 |
Monday | Level 1 |
Tuesday | Level 2 |
Wednesday | Level 3 |
Thursday | Level 4 |
This plan is called a multilevel backup. The full backup is made on Friday, and backups are made at different levels from Monday to Thursday. The size of the backed up data decreases with an increase in the level of backup. This is because only the modifications made to the data after the last incremental backup are backed up.
Multilevel backups are small and easy to manage. The disadvantage of this plan is that each backup needs to be made on a separate medium, and each medium must be maintained carefully. If even one medium is damaged, the restoration of the system fails partway, which results in data loss.
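The difference between the two plans shows up at restore time. The following sketch lists the media needed to restore the system after a given day's backup; the function name restore_chain and the plan labels "simple" and "multilevel" are made up for this illustration.

```shell
# restore_chain PLAN LAST_DAY
# Print the backups (as day:level) needed for a restore.
# PLAN is "simple" (Level 1 every weekday) or "multilevel" (Levels 1 to 4).
restore_chain() {
    plan=$1
    last_day=$2
    chain="Fri:0"                       # every restore starts from Level 0
    if [ "$last_day" = Fri ]; then
        echo "$chain"
        return
    fi
    level=0
    for day in Mon Tue Wed Thu; do
        level=$((level + 1))
        if [ "$plan" = multilevel ]; then
            chain="$chain $day:$level"  # every level in between is needed
        elif [ "$day" = "$last_day" ]; then
            chain="$chain $day:1"       # only the latest Level 1 is needed
        fi
        [ "$day" = "$last_day" ] && break
    done
    echo "$chain"
}

restore_chain simple Wed
restore_chain multilevel Wed
```

With the simple plan a Wednesday restore needs two media, while the multilevel plan needs four, which is why a single damaged medium is more costly there.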
Backup and Restore Tools
Linux provides a set of standard backup and restore tools. There are three types of tools:
-
Command-line: tar and cpio
-
Text-Based: taper and Amanda
-
GUI-Based: KDat and Archiver
While selecting a tool for creating and maintaining backups, adhere to these guidelines:
-
Portability across different backup tools: You should be able to restore the backup using a tool that is different from the one used to create the backup. You should also be able to make a backup on one Unix or Linux system and restore it on another, such as from Solaris to Red Hat Linux. To make portable backups, use command-line tools because these are supported by almost all distributions of Linux and Unix.
-
Ease-of-use of the backup tool: The tool used for making backups must provide a user-friendly interface. Linux offers three possible interfaces: Graphical User Interface (GUI), Command-Line Interface (CLI), and the text-based interfaces. GUI is the easiest to use and does not require any knowledge of syntax. CLI should be used if you are comfortable with the required syntax. Text-based interfaces are a combination of GUI and CLI and require basic knowledge of syntax.
| Note | You can also use many third party commercial tools for backing up a system. These include Backup and Restore Utility (BRU), PerfectBackup, and Networker. You need to purchase and install these tools on your system. |
-
Automated or unattended backups: You should automate backups so that they are performed at regular intervals without any human intervention. For automated or unattended backups, it is imperative to select a backup type and medium that support them. To automate backups, write shell scripts and schedule them for execution using the cron daemon.
| Note | To learn more about cron, see the Linux Administration ReferencePoint. |
-
Remote backup facility: You should be able to perform backup and restore operations from a remote system. A tool with a CLI or text-based interface enables scheduling of backups on remote systems.
-
Network backup facility: You should be able to perform backup and restore operations to and from networked hosts. A networked host is a computer on the network. The CLI tool tar supports network access to backup devices.
-
Media types supported by the tool: The backup media must be compatible with the backup tool to be used. The cost of the media with respect to reliability, storage capacity, and transfer speed should be minimal.
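The guideline on automated backups can be illustrated with a small script of the kind cron would run. This is a sketch, not a production script: the function name run_backup, the script path in the crontab comment, and the archive naming scheme are all assumptions, and the demonstration uses scratch directories instead of real data.

```shell
# Hypothetical crontab entry (added with "crontab -e") that would run a
# script like this one at 23:00 from Monday to Friday:
#   0 23 * * 1-5 /usr/local/sbin/nightly-backup.sh

# run_backup SRC DEST
# Archive the contents of SRC into DEST/backup-YYYYMMDD.tar and print
# the path of the archive on success.
run_backup() {
    b_src=$1
    b_dest=$2
    b_archive="$b_dest/backup-$(date +%Y%m%d).tar"
    tar -c -f "$b_archive" -C "$b_src" . && echo "$b_archive"
}

# Demonstration on scratch directories.
src=$(mktemp -d)
dest=$(mktemp -d)
trap 'rm -rf "$src" "$dest"' EXIT
echo "payroll" > "$src/payroll.txt"

created=$(run_backup "$src" "$dest")
echo "created $created"
```

Writing the archive to a directory other than the one being backed up prevents the archive from including itself.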
Tape Archiver (tar)
Tar is a common CLI tool used in Linux systems. It is a multipurpose tool used for:
-
Archiving: The tar command creates an archive known as tarfile or tarball. You can create a tarfile on a tape or on the local hard disk. This tarfile is a single file that stores a set of related files. When an archive is created, the files existing in the source directory are not destroyed. Restoring an archive does not destroy the structure of the archive. This means the files are restored to their original state without any rearrangement of the directories or the files within them. Archives are also used for long-term purposes.
-
Transporting: Using the tar command, you can create an archive on one system, transfer it, and extract the contents on another system. This means that related files can be transferred over the network as a single unit.
-
Backing up file information: The tar command stores all file attributes, such as file access permissions, information about the end-users and group owners, the size of the files in bytes, and the time when the file was last modified.
The tar command also backs up subdirectories. The GNU’s Not Unix (GNU) version of the tar command is capable of saving multiple files together in a single tape or disk archive. It can also restore individual files from the archive.
The tar command was initially used to back up files on a magnetic tape. A magnetic tape stores data in a sequential manner. It cannot store the names of the individual files and merely tracks the position of the file on the tape. When you create a tarfile, the tar command retains the names of the files in addition to storing the files.
The files inside a tarfile are called memberfiles of the archive. You can use the tar command to restore and view memberfiles.
The extension of a tarfile is .tar, but a tarfile with any other extension, such as .abc or .xyz, will also work. The .tar extension is a naming convention used for clarity.
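The point about extensions is easy to confirm. In this sketch, which assumes nothing beyond tar itself, an archive is given the arbitrary extension .abc and then listed as usual; the file names are placeholders.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
echo "hello" > "$work/file1.txt"

# The archive name ends in .abc instead of .tar.
tar -c -f "$work/testarchive.abc" -C "$work" file1.txt

# tar identifies the archive by its contents, not by its name.
tar -t -f "$work/testarchive.abc"
```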
Backing Up and Restoring Using tar
The syntax for backing up and restoring files is the same. The only difference is in the parameters passed to the command.
The syntax for creating a tarfile is:
tar [taskperformed] [tarfilename] [directoryname]
In this syntax, the taskperformed parameter refers to the various options you can select when creating a tarfile. It determines the task performed by the tar command. tarfilename is the name of the archive you are creating, and directoryname is the name of the directory that contains the files you want to archive.
| Note | The subdirectories and the files within the subdirectories of the specified directory are also archived. |
Table 2-5-5 lists the basic options used by the taskperformed parameter:
Option | Description |
---|---|
-A, --concatenate, --catenate | Appends other tarfiles to the end of the archive |
-c, --create | Creates a new tarfile |
-d, --compare, --diff | Compares the memberfiles with the files in the system and reports any difference in file size, mode, owner, and modification date |
-r, --append | Appends files to the existing archive |
-t, --list | Lists the contents of the archive |
-u, --update | Appends only the files that are newer than the copies already in the archive |
-x, --extract, --get | Extracts or restores files from the archive |
--delete | Deletes memberfiles from the archive |
In the taskperformed parameter, you can specify the options to modify tarfiles. These options are listed in Table 2-5-6:
Option | Explanation |
---|---|
-W, --verify | Checks the archive for errors after writing it and verifies whether the archive was written correctly. |
--remove-files | Removes the file from the system after appending it to the tarfile. |
-k, --keep-old-files | Retains the existing files when extracting files from the archive. |
--overwrite | Overwrites the existing files when extracting files from the archive. |
--overwrite-dir | Overwrites the directory metadata when extracting files from the archive. |
-U, --unlink-first | Deletes the file from the file system before extracting it from the archive. |
--recursive-unlink | Removes an existing directory hierarchy from the file system before extracting a directory of the same name from the archive. |
-S, --sparse | Enables the efficient handling of sparse files. A sparse file contains long runs of zero bytes for which no disk space is allocated, although these bytes are still counted when determining the length of the file. |
-O, --to-stdout | Writes the extracted files to the standard output instead of to the file system. |
-G, --incremental | Enables the tar command to handle incremental backups using the old GNU-format backup. |
-g, --listed-incremental=FILE | Enables the tar command to handle incremental backups using the new GNU-format backup. |
--ignore-failed-read | Specifies that the tar command should continue processing and not exit with a nonzero status when unreadable files are encountered. |
You can also set the taskperformed parameter to the options for manipulating file attributes, such as the file owner or the group. These options are listed in Table 2-5-7:
Option | Explanation |
---|---|
--owner=NAME | Specifies a name for the owner of all newly added files |
--group=NAME | Specifies a name for the group that owns all newly added files |
--atime-preserve | Preserves information about the time when the archived files were last accessed |
-m, --modification-time | Does not restore the modification time recorded in the archive, so extracted files carry the time of extraction |
--same-owner | Attempts to extract files with the same ownership as recorded in the archive |
--no-same-owner | Extracts files as yourself, so that you own the extracted files |
--numeric-owner | Uses numbers instead of names to represent a user or a group |
-p, --same-permissions, --preserve-permissions, --preserve | Restores the file permissions recorded in the archive |
--no-same-permissions | Specifies that the permissions recorded in the archive should not be restored |
-s, --same-order, --preserve-order | Indicates that the list of file names to extract is sorted in the same order as the files appear in the archive |
The taskperformed parameter also includes options to select and manipulate media devices. These options are listed in Table 2-5-8:
Option | Explanation |
---|---|
-f, --file=ARCHIVE | Specifies the tarfile or device name. |
--force-local | Indicates that the archive file is a local file even if its name contains a colon, which would otherwise identify a remote host. |
--rsh-command=COMMAND | Uses the COMMAND command to communicate with remote devices. |
-[0-7][lmh] | Specifies the drive and its density. |
-M, --multi-volume | Indicates that the tar command should operate on a multivolume tarfile. A volume is the set of files that are written to the tape in a single backup session. This is useful if you are working with a large amount of data where the tarfile does not fit on a single tape. |
-L, --tape-length=NUM | Specifies the length of the tape to which the backup is being written. It indicates that the tape should be changed after writing the number of bytes specified by NUM. Leave a blank area of 1024 bytes on the tape after each backup. |
-F, --info-script=FILE, --new-volume-script=FILE | Executes a script file at the end of each tape. Used when performing multitape backups. |
--volno-file=FILE | Specifies the file in which the tar utility records the current volume number. It is used in multivolume backups. |
The taskperformed parameter has several options for working with the media device. These options are listed in Table 2-5-9:
Option | Explanation |
---|---|
-b, --blocking-factor=BLOCKS | Specifies the blocking factor. Each record consists of BLOCKS blocks of 512 bytes. |
--record-size=SIZE | Specifies the size of one record as SIZE bytes, which must be a multiple of 512. |
-i, --ignore-zeros | Ignores zeroed blocks in the archive. Zeroed blocks normally indicate End of File (EOF). |
-B, --read-full-records | Specifies that the tar command should accept end-of-record markers in the middle of a record or read incomplete records. |
Several options of the taskperformed parameter specify the format for the archive. These options are listed in Table 2-5-10:
Option | Explanation |
---|---|
-V, --label=NAME PATTERN | Allows you to associate the archive with a name |
-o, --old-archive, --portability | Creates an archive in a format that is compatible with that of the tar command of Unix Ver 7 |
-j, --bzip2 | Creates a compressed archive with the bzip2 compression format |
-z, --gzip, --ungzip | Creates a compressed archive with the gzip compression format |
-Z, --compress, --uncompress | Creates a simple compressed archive |
--use-compress-program=PROG | Creates a compressed archive by filtering it through the external compression program PROG. PROG must accept the -d option for decompression |
Several options of the taskperformed parameter are used when working with local files. These options are listed in Table 2-5-11:
Option | Explanation |
---|---|
-C, --directory=DIR | Changes the current directory to the specified directory. |
-T, --files-from=NAME | Reads the names of the files to archive or extract from the file specified by NAME. |
--null | Specifies that the file names read with the -T option are terminated by the null character, \0, instead of a newline. |
--ignore-case | Overrides case sensitivity of file names when excluding patterns. |
--no-ignore-case | Maintains the case sensitivity of file names when excluding patterns. |
--wildcards | Uses wildcards in the exclude patterns. This is the default action. |
--no-wildcards | Uses plain strings in the exclude patterns. |
--wildcards-match-slash | Allows wildcards in the exclude patterns to match the slash (/) symbol. This is the default action. |
--no-wildcards-match-slash | Prevents wildcards in the exclude patterns from matching the slash (/) symbol. |
-P, --absolute-names | Disables the default removal of the initial slash (/) symbol from the names of the memberfiles. |
-h, --dereference | Archives the file that a symbolic link points to instead of the link itself. A symbolic link is a softlink similar to a shortcut in the Windows operating system. It points to the file and is not the actual file. |
-l, --one-file-system | Stays in the local file system when creating the archive and does not descend into other mounted file systems. |
-N, --newer=DATE, --after-date=DATE | Adds only the files that have changed since the date specified by DATE. |
--newer-mtime=DATE | Adds only the files whose contents have been modified since the date specified by DATE. Changes to file status alone, such as a change of permissions, are ignored. |
--backup[=CONTROL] | Backs up the original files before removing them. |
--suffix=SUFFIX | Uses SUFFIX instead of the default suffix when naming the backup copies made before files are removed. |
The taskperformed parameter also provides several options that are used to provide information about the tarfile. These options are listed in Table 2-5-12:
Option | Explanation |
---|---|
--help | Prints the associated help file and exits the tar command. |
--version | Prints the version number of the tar program being used and exits the tar command |
-v, --verbose | Shows the progress of the tar command by listing the files that have been backed up |
--checkpoint | Prints the name of the directory in which the file exists while reading the archive |
--totals | Prints the total number of bytes written to the archive |
-R, --block-number | Shows the record number of each file in the archive |
-w, --interactive, --confirmation | Prompts for a confirmation for every action |
| Note | All the options used with any command are case-sensitive. |
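A few of the options from the tables can be combined in one short session. The sketch below, which assumes GNU tar with gzip support, creates a compressed archive with -c, -z, -f, and -C, lists it with -t, and checks it against the files on disk with -d; the directory and file names are placeholders.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir "$work/src"
echo "one" > "$work/src/a.txt"
echo "two" > "$work/src/b.txt"

# -c create, -z compress with gzip, -f archive name,
# -C change to the given directory before reading the files.
tar -c -z -f "$work/src.tar.gz" -C "$work/src" a.txt b.txt

# -t lists the memberfiles of the archive.
tar -t -z -f "$work/src.tar.gz"

# -d compares the memberfiles with the files on disk and
# succeeds only when no differences are found.
if tar -d -z -f "$work/src.tar.gz" -C "$work/src"; then
    echo "archive matches the files on disk"
fi
```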
Using the tar Command
To use the tar command, you need to be familiar with the syntax of the command and its command-line options. You also need to know how to access the various backup devices.
Performing backups with the tar command is relatively easy. For example, a directory, testdir, contains three text files. You can use the command ls -l, as shown in Figure 2-5-1, to list all the files:
To archive these files, the tar command is:
tar -c -v -f testarchive.tar file1.txt file2.txt file3.txt
Figure 2-5-2 shows the output of this command:
You can use the -t option to see the contents of the tarfile. The command to view the contents of the tarfile is:
tar -t -f testarchive.tar
Figure 2-5-3 shows the output of this command:
To view a list of the files contained in the testdir directory, you can use the command ls -l as shown in Figure 2-5-4:
You can use the tar command with the -x option to restore data from the backup in a specific backup device. The command used to restore files from the testarchive.tar archive is:
tar -x -v -f testarchive.tar
Figure 2-5-5 shows the output of this command:
| Note | When performing backups, the data being written to the tape should not be compressed unless the data is not important or is backed up frequently. This is because compressed data can be restored only if the data is correctly written to the tape at the time of making the backup and the tape is not corrupt. Otherwise, you will lose the entire backup. Conversely, backups that are made without compression can be recovered even if portions of the backup are corrupt. |
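The create, list, and restore commands above can be run end to end on a throwaway copy of testdir. This sketch reuses the file names from the example, restores into a separate directory (a hypothetical "restored" directory) so the originals are never touched, and then compares the result with the source.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Recreate the testdir example with three text files.
mkdir "$work/testdir" "$work/restored"
for n in 1 2 3; do
    echo "contents of file $n" > "$work/testdir/file$n.txt"
done

# Create the archive (-c), reporting each file as it is added (-v).
tar -c -v -f "$work/testarchive.tar" -C "$work/testdir" \
    file1.txt file2.txt file3.txt

# List the memberfiles (-t).
tar -t -f "$work/testarchive.tar"

# Restore (-x) into a separate directory.
tar -x -v -f "$work/testarchive.tar" -C "$work/restored"

# Confirm that the restored files are identical to the originals.
if diff -r "$work/testdir" "$work/restored"; then
    echo "restore verified"
fi
```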
Adding Files to an Existing Archive and Concatenating Archives
The -r option is used to add new files to an existing archive. For example, to add the file named file4.txt to an existing archive named testarchive.tar, the command is:
tar -r -f testarchive.tar file4.txt
Concatenating tarfiles enables you to add all the contents of one tarfile at the end of another tarfile. This is done using the -A option with the tar command. For example, to add the contents of the testarchive2.tar tarfile to the testarchive.tar tarfile, the command is:
tar -A -f testarchive.tar testarchive2.tar
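Appending and concatenating can be checked by listing the archive afterward. The sketch below assumes GNU tar, since the -A option is not available in every tar implementation; file5.txt is a hypothetical file added only to give the second archive some content.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

for n in 1 4 5; do
    echo "file $n" > "$work/file$n.txt"
done

# Start with an archive that holds one file.
tar -c -f "$work/testarchive.tar" -C "$work" file1.txt

# -r appends another file to the existing archive.
tar -r -f "$work/testarchive.tar" -C "$work" file4.txt

# Build a second archive, then -A appends its members to the first.
tar -c -f "$work/testarchive2.tar" -C "$work" file5.txt
tar -A -f "$work/testarchive.tar" "$work/testarchive2.tar"

# The first archive should now list all three files.
tar -t -f "$work/testarchive.tar"
```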
Incremental Backups Using tar
When using the tar command to perform incremental backups, you first need to identify the files modified since the last backup. To do this, use the find command. The find command can also be used to find files that are more recent than the files specified.
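One way to identify the modified files is the -newer test of the find command, which selects files modified more recently than a reference file. The sketch below assumes GNU tar and uses a hypothetical timestamp file, last-backup.stamp, to mark when the previous backup ran; the dates are set artificially with touch so that the result does not depend on timing.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/data"

# old.txt was last modified before the previous backup ...
echo "old" > "$work/data/old.txt"
touch -t 202001010000 "$work/data/old.txt"

# ... which this stamp file records as 1 January 2021 ...
touch -t 202101010000 "$work/last-backup.stamp"

# ... and new.txt is modified now, after the previous backup.
echo "new" > "$work/data/new.txt"

# List the regular files that are newer than the stamp file.
changed=$(cd "$work/data" && find . -type f -newer "$work/last-backup.stamp")
echo "$changed"

# Feed the same list to tar; -T - reads the file names from standard input.
(cd "$work/data" &&
 find . -type f -newer "$work/last-backup.stamp" |
 tar -c -f "$work/changes.tar" -T -)
tar -t -f "$work/changes.tar"
```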
You can use the -u option along with the tar command to create incremental backups. The disadvantage of -u is that the previous files are not deleted, resulting in an increase in the size of the tarfile.
You can also use the -r option of the date command together with the --newer option of the tar command to create incremental backups. For example,
tar -c -f Level1.tar --newer="$(date -r Level0.tar)" /home
In this command, Level0.tar is an existing full backup archive of the /home directory. The -r or --reference option of the date command prints the date and time at which the Level0.tar file was last modified, that is, when the full backup was made. The command substitution $(...) passes this value to the --newer option, which uses it as the reference date and archives only the files that have changed since then.
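The reference-date technique can be tried safely on scratch data. The sketch below assumes GNU tar and GNU date, and it deliberately swaps in the --newer-mtime option: the demonstration backdates files with touch, which changes only the modification time, and --newer-mtime is the variant that looks at the modification time alone. The directory name home and the file names are placeholders.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/home"

# old.txt predates the full backup.
echo "old" > "$work/home/old.txt"
touch -t 202001010000 "$work/home/old.txt"

# Make the full backup and pretend it was written on 1 January 2021.
tar -c -f "$work/Level0.tar" -C "$work/home" .
touch -t 202101010000 "$work/Level0.tar"

# new.txt is created after the full backup.
echo "new" > "$work/home/new.txt"

# date -r prints the modification time of Level0.tar in a format
# that tar can parse as a reference date.
ref=$(date -r "$work/Level0.tar" '+%Y-%m-%d %H:%M:%S')

# Archive only the files modified after the reference date.
tar -c -f "$work/Level1.tar" --newer-mtime="$ref" -C "$work/home" .
tar -t -f "$work/Level1.tar"
```

Directories are still recorded in the Level 1 archive, but unchanged files inside them are skipped.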
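The -g option of the tar command can also be used to create incremental backups. For example,
tar -c -f Level1.tar -g home.snar /home
In this command, home.snar (an example name) is a snapshot file, not an archive. The tar command records the state of the /home directory in this file each time it runs with the -g option. If the full backup Level0.tar was created with the same -g home.snar option, this command creates the incremental backup Level1.tar, which contains only the files that have changed since the full backup.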
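The -g option can be seen at work on scratch data. This sketch assumes GNU tar, where the argument of -g names a snapshot file that tar creates and updates, rather than an archive; the snapshot name home.snar and the file names are hypothetical.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/home"

echo "old" > "$work/home/old.txt"
sleep 1   # keep timestamps clearly apart on coarse-grained file systems

# Level 0: the snapshot file does not exist yet, so tar archives
# everything and records the directory state in home.snar.
tar -c -f "$work/Level0.tar" -g "$work/home.snar" -C "$work/home" .

sleep 1
echo "new" > "$work/home/new.txt"

# Level 1: the same snapshot file tells tar what has changed since.
tar -c -f "$work/Level1.tar" -g "$work/home.snar" -C "$work/home" .

echo "Level 0 contains:"; tar -t -f "$work/Level0.tar"
echo "Level 1 contains:"; tar -t -f "$work/Level1.tar"
```

To restore, extract Level0.tar first and then Level1.tar on top of it.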
Copy File Archives In and Out (cpio)
Cpio is a CLI tool used to perform backup and restore operations. The GNU version of the cpio tool is compatible with the tar command and is used for creating and extracting archives. It can also be used for copying files from one location to another. Use the cpio tool to copy files in or out of a cpio or tar archive. It can store the archive on most types of backup media in a number of cpio formats and is capable of storing or retrieving large amounts of data.
The advantages of the cpio command over the tar command are:
-
Archives the data efficiently resulting in an optimum usage of storage space.
-
Effectively manages backups that span across several tapes.
-
Omits the corrupted sections of the tape and continues with the backup operation.
-
Facilitates backups of remote systems.
The cpio archive stores files and information, such as the owner of the file, and access permissions for files.
The tool has three modes of operation: copy-out, copy-in, and copy-pass. The copy-out mode copies the files into an archive. This archive can be a file on a hard disk, multiple tapes, or floppy disks. In copy-out mode, the cpio command reads the list of files to archive from the standard input, one file name per line. It writes the archive to the standard output, which you can redirect to a file or a device. An efficient method to generate the list of files to archive is to use the find or ls commands. The option used in the copy-out mode of the cpio command is -o. The copy-in mode extracts files from an archive. The copy-pass mode copies the files from one directory to another without creating an archive.
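Copy-pass mode is the quickest of the three to try because no archive is involved. The sketch below assumes GNU cpio is installed (it skips itself otherwise) and uses the -p option, which selects copy-pass mode, together with -d to create directories as needed; the directory and file names are placeholders.

```shell
# Skip the demonstration when cpio is not available.
command -v cpio >/dev/null 2>&1 || {
    echo "cpio is not installed; skipping" >&2
    exit 0
}

src=$(mktemp -d)
dest=$(mktemp -d)
trap 'rm -rf "$src" "$dest"' EXIT

mkdir "$src/sub"
echo "alpha" > "$src/a.txt"
echo "beta"  > "$src/sub/b.txt"

# find supplies the file names on standard input;
# -p copies them into the destination, -d creates directories as needed.
(cd "$src" && find . -depth -print | cpio -p -d "$dest")

# Compare the two directory trees.
if diff -r "$src" "$dest"; then
    echo "directory tree copied"
fi
```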
Using cpio in Copy-Out Mode
The syntax for the cpio command in copy-out mode is:
cpio {-o|--create} [-0acvABLV] [-C bytes] [-H format] [-M message] [-O [[user@]host:]archive]
[-F [[user@]host:]archive]
[--file=[[user@]host:]archive] [--format=format]
[--message=message] [--null]
[--reset-access-time] [--verbose] [--dot]
[--append] [--block-size=blocks] [--dereference]
[--io-size=bytes] [--rsh-command=command] [--help]
[--version] < name-list [> archive]
The parameters of the syntax are listed in Table 2-5-13:
Parameter | Description |
---|---|
-o, --create | Creates the archive. |
-0, --null | Specifies that the file names read from the standard input are terminated by the null character, \0, instead of a newline, so that file names containing newlines can be archived. This option is used in copy-out and copy-pass modes. |
-a, --reset-access-time | Resets the access time of each file after reading it, so that the file does not appear to have just been read. |
-c | Uses the old portable archive format, which is the American Standard Code for Information Interchange (ASCII) format. |
-v, --verbose | Lists the files that have been archived. |
-A, --append | Appends the files to an existing archive. |
-B | Sets the input/output (I/O) block size to 5120 bytes. By default, the block size is 512 bytes. |
--block-size=BLOCKS | Sets the I/O block size to BLOCKS x 512 bytes. |
-L, --dereference | Copies the file that a symbolic link points to instead of the link itself. |
-V, --dot | Prints a period (.) for every processed file. |
-C IO-SIZE, --io-size=IO-SIZE | Sets the I/O block size to IO-SIZE bytes. |
-H FORMAT, --format=FORMAT | Uses the archive format specified by FORMAT. In copy-in mode, the cpio command detects the archive format automatically. |
-M MESSAGE, --message=MESSAGE | Prints a message specified by MESSAGE when the end of a backup volume is reached. This message prompts you to insert a new volume. If MESSAGE contains the string %d, the string is replaced by the current volume number starting with 1. |
-O archive | Specifies the name for the archive to be used instead of the standard output. |
-F, --file=archive | Specifies the name for the archive to be used instead of the standard input or output. |
--rsh-command=COMMAND | Uses the COMMAND command to communicate with remote devices. |
--help | Opens the help files for the specified command. |
--version | Prints the cpio program version number and then exits the cpio command. |
< name-list | Reads the names of the files to be included in the archive from name-list. |
[> archive] | Redirects the archive from the standard output to the specified file or device. |
| Important | To archive the files and subdirectories within a particular directory, set it as the current working directory. |
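You can use the cpio command with the -o option to create the archive and redirect it to another file. The command reads the names of the files to archive from the standard input, so you either type them or pipe them in from a command such as find or ls. For example, to create an archive using cpio and redirect the archive to the testarchive.cpio file, the command is:
cpio -o -v > testarchive.cpio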
Figure 2-5-6 shows the output of this command:
Figure 2-5-6: Output of the cpio Command in the Copy-Out Mode
You can view the new archive using the ls -l command, as shown in Figure 2-5-7:
You can obtain a list of archived files using the command:
cpio -t -v -F testarchive.cpio
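Copy-out mode is usually driven by a pipeline, because cpio takes the file names from standard input. The following sketch assumes GNU cpio is installed (it skips itself otherwise); it archives three text files found by the find command and then lists the archive. The archive is written outside the directory being archived so that it cannot include itself.

```shell
# Skip the demonstration when cpio is not available.
command -v cpio >/dev/null 2>&1 || {
    echo "cpio is not installed; skipping" >&2
    exit 0
}

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir "$work/testdir"
for n in 1 2 3; do
    echo "contents of file $n" > "$work/testdir/file$n.txt"
done

# find writes one file name per line; cpio -o reads the names and
# writes the archive to standard output, redirected here to a file.
(cd "$work/testdir" &&
 find . -type f -name '*.txt' | cpio -o -v > "$work/testarchive.cpio")

# List the table of contents (-t) of the archive read in copy-in mode (-i).
cpio -i -t < "$work/testarchive.cpio"
```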
Using cpio in Copy-In Mode
The copy-in mode of the cpio command restores files from an archive. This process is also known as extracting files from an archive. Files are extracted relative to the current directory, and the directories named in the archive are not created by default. To create these directories as needed, use the -d option.
The cpio command does not overwrite existing files that are as new as or newer than the archived copies when extracting files from an archive. To overwrite the existing files in the directory unconditionally, use the -u option.
The cpio command reads the archive from the standard input. The option used is -i.
The syntax for the cpio command in copy-in mode is:
cpio {-i|--extract} [-cdfmnrtuvV] [-C bytes]
[-E file] [-H format] [-M message]
[-R [user][:. ][group]] [-I [[user@]host:]archive]
[-F [[user@]host:]archive]
[--file=[[user@]host:]archive] [--make-directories] [--nonmatching] [--preserve-modification-time]
[--numeric-uid-gid] [--rename] [--list] [--dot]
[--unconditional] [--verbose] [--block-size=blocks] [--swap-halfwords] [--io-size=bytes]
[--pattern-file=file] [--format=format]
[--owner=[user][:. ][group]] [--no-preserve-owner]
[--message=message] [--help] [--version]
[--no-absolute-filenames] [--sparse]
[--only-verify-crc] [--quiet] [--rsh-command=command] [pattern...] [< archive]
The parameters of the syntax are listed in Table 2-5-14:
Parameter | Description |
---|---|
-i, --extract | Extracts the files from an archive. |
-d, --make-directories | Creates directories. |
-f, --nonmatching | Copies only those files that do not match any of the specified patterns. |
-m, --preserve-modification-time | Retains the time the file was last modified when creating files. |
-n, --numeric-uid-gid | Shows the numeric user ID (UID) and group ID (GID) instead of translating them into names when using the --verbose option. |
-r, --rename | Interactively renames files. |
-t, --list | Prints a table of contents of the files and directories in the archive. |
-u, --unconditional | Replaces all files without asking whether existing newer files should be replaced with older files from the archive. |
-R [user][:. ][group], --owner [user][:. ][group] | Sets the ownership of all files created to the specified user and/or group. Only the superuser can change the ownership of files. |
-I [[user@]host:]archive | Specifies a name for the archive, which will be used instead of the standard input. |
--no-preserve-owner | Specifies that the ownership of files will not be changed. This is the default option for all users except the root user. This option is used in copy-in and copy-pass mode only. |
--sparse | Writes files with large blocks of zeros as sparse files. This option is used in copy-in and copy-pass modes. |
--only-verify-crc | Verifies the Cyclic Redundancy Check (CRC) checksum of each file in a CRC-format archive to ensure that the data has no errors. It only verifies the archive without extracting files from it. |
--quiet | Specifies that the number of blocks copied should not be printed. |
< archive | Extracts the files from the specified archive. |
The remaining parameters are similar to those listed in Table 2-5-13.
You can use the cpio command with the -i option to restore an archive. For example, if the file testarchive.cpio is the archive, you can restore it by redirecting the file to the standard input of the command:
cpio -i -v < testarchive.cpio
Figure 2-5-8 shows the output of this command:
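The effect of the -d option can be sketched with a small round trip. The script below archives one file that lives in a subdirectory and then extracts it into an empty directory. The directory reports and the file q1.txt are hypothetical names, and the scratch directories come from mktemp.

```shell
#!/bin/sh
# Sketch only: all paths are hypothetical scratch locations.
command -v cpio >/dev/null 2>&1 || { echo "cpio not found; skipping"; exit 0; }

src=$(mktemp -d)              # source tree
dest=$(mktemp -d)             # empty directory to restore into

mkdir "$src/reports"
echo "quarterly figures" > "$src/reports/q1.txt"

# Archive only the file, so the archive has no entry for reports/ itself.
(cd "$src" && find reports -type f | cpio -o > "$src/testarchive.cpio") 2>/dev/null

# Copy-in mode: the archive arrives on standard input.
# -d creates reports/ under the destination because it does not exist yet.
(cd "$dest" && cpio -i -d < "$src/testarchive.cpio") 2>/dev/null

restored=$(cat "$dest/reports/q1.txt")
echo "$restored"
```

Adding -u to the extraction command would also replace files that already exist in the destination.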
Using cpio in Copy-Pass Mode
The copy-pass mode of the cpio command combines the copy-out and copy-in modes without working with any archives. In copy-pass mode, the cpio command reads the list of files to be copied from the standard input. The directory to which the cpio command copies the files is passed as an argument to the command. The option used in the copy-pass mode is -p.
The syntax for the cpio command in copy-pass mode is:
cpio {-p|--pass-through} [-0adlmuvLV]
[-R [user][:. ][group]] [--null]
[--reset-access-time] [--make-directories] [--link]
[--preserve-modification-time] [--unconditional]
[--verbose] [--dot] [--dereference]
[--owner=[user][:. ][group]] [--sparse]
[--no-preserve-owner] [--help] [--version] destination-directory < name-list
The parameters of the syntax are listed in Table 2-5-15:
Parameter | Description |
---|---|
-p, --pass-through | Copies the files from one directory to another |
-l, --link | Links files instead of copying them |
destination-directory | Specifies the directory into which the files will be copied |
The remaining parameters are similar to those of the copy-out and copy-in modes as listed in Table 2-5-13 and Table 2-5-14, respectively.
The copy-pass mode is an efficient method of copying a large set of files to a directory. For example, your current working directory is /home/user. It contains a number of files: file1.txt, file2.txt, and file3.txt. You want to copy these files into the existing directory /home/user/textfiles. Because cpio reads the names of the files to copy from the standard input rather than from its arguments, pipe the list of names to the command:
ls file1.txt file2.txt file3.txt | cpio -p /home/user/textfiles
The copy-pass mode is more efficient when used in conjunction with other commands. For example:
find / -name "*.bin" | cpio -p /tmp/bin
In this statement, a combination of the cpio and find commands is used to copy all files with the extension .bin into the directory /tmp/bin. The find command searches for files under the path it is given. Because the path here is /, it searches the entire system. It prints the list of all the files with the extension .bin, and the pipe (|) symbol passes that list to the cpio command as input. Add the -d option if cpio needs to create the matching subdirectories under /tmp/bin.
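The same find and cpio combination can be tried safely on a scratch tree instead of the whole system. In this sketch, the directory tools and the files a.bin and readme.txt are invented for illustration. The -d and -m options are added so that subdirectories are created and modification times are kept.

```shell
#!/bin/sh
# Sketch only: works on throwaway directories created by mktemp.
command -v cpio >/dev/null 2>&1 || { echo "cpio not found; skipping"; exit 0; }

src=$(mktemp -d)
dest=$(mktemp -d)

mkdir "$src/tools"
echo "binary one" > "$src/tools/a.bin"
echo "plain text" > "$src/tools/readme.txt"

# find prints relative paths; cpio re-creates them under the destination.
# -d makes the intermediate directories, -m keeps modification times.
(cd "$src" && find . -name "*.bin" | cpio -p -d -m "$dest") 2>/dev/null

copied=$(cd "$dest" && find . -type f | sort)
echo "$copied"
```

Running find from inside the source directory keeps the paths relative, so the copies land under the destination rather than under a replica of the full absolute path.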
Dump and Restore
Dump is a commonly used CLI tool for performing backups on Linux systems. It is available with the Red Hat Linux distribution or can be downloaded separately.
The dump package contains several commands to back up and restore files. Commands available in the dump package are listed in Table 2-5-16:
Command | Description |
---|---|
Dump | Creates backups of an entire disk or individual directories. |
Restore | Restores the entire archive or individual files from the archive to the hard disk. |
Rmt | Copies files over the network. This command is used with either dump or restore. It is never used separately. |
Backing up Using Dump
Dump is compatible with numerous backup media and supports backups that span multiple tapes. This tool is useful for performing incremental backups because it can handle backups from levels 0 to 9.
Unlike other tools, dump adds a checkpoint at the beginning of each volume when creating the backup. If the backup operation fails, the checkpoints enable dump to restart the backup from the point at which the backup failed. The dump utility regularly provides information about its activities during the backup. This information includes estimates of the number of blocks and tapes required for the backup and the time left for the backup to complete.
| Note | Dump supports backups of the ext2 family of filesystems only (ext2 and, in later versions of dump, ext3 and ext4). ext2 is a native Linux filesystem. |
To create a backup using the dump command, you need to pass certain parameters to the command. These parameters specify the dump level for incremental backups, the size of the backup media, and other information pertaining to the backup.
The syntax for the dump command is:
dump operation arguments filesystem
The operation parameter specifies an operation. The arguments parameter specifies a list of any arguments needed by options. The filesystem parameter refers to the set of files or directories that need to be backed up.
The options of the operation parameter are listed in Table 2-5-17:
Option | Description |
---|---|
0 – 9 | Specifies the dump level for an incremental backup. The default level is 9. |
a autosize | Continues to write the backup on the specified media until the end of the media is reached. This option is the default for most tape devices and is particularly useful when appending backups to the same tape. |
A archive file | Creates a table of contents that lists the files included in the backup. This is useful when restoring files from the backup. |
b blocksize | Specifies the number of kilobytes per record. The default blocksize is 10 KB. |
B records | Specifies the number of records per volume or the amount of data that can fit on a single tape. This option takes a numeric argument. |
d density | Specifies the density of the tape. By default, this value is set to 1600 bits per inch. This option takes a numeric argument. |
f file | Specifies the name of the file or device that will store the backup. You can also specify a remote file or device. This option takes a single alphanumeric argument. |
F script | Executes a script file at the end of each tape. |
I nr-errors | Used to ignore read errors. By default, dump ignores the first 32 read errors on the file system before prompting for human intervention. |
j compression level | Compresses each block that is backed up using the bzlib library. The default compression level is 2. |
L label | Labels the backup and stores the label in the header of the backup. This label is a string that you define, with a maximum length of 16 characters. This label is used by the restore tool when restoring files. |
M | Indicates that the dump command should operate on a multivolume backup. |
q | Aborts the dump command without prompting when human intervention is required. |
Q file | Enables the Quick File Access support feature. This creates a file that stores the position of each file within the backup. The option is useful when restoring individual files from the backup. |
s feet | Specifies the length of the tape in feet. |
S | Estimates in advance the amount of space, in bytes, required to perform the backup, without performing it. This estimate is useful when performing incremental backups because it enables you to determine the number of tapes required. |
u | Updates the /etc/dumpdates file after a successful backup. This file records the date and level of each backup, which dump uses to determine what an incremental backup must include. |
T date | Specifies a date and time, which is used as a reference for creating incremental backups. Files that are modified or added after the specified date and time are backed up. |
W | Lists the file systems that need to be backed up. This information is gathered from the /etc/dumpdates and /etc/fstab files. |
w | Like W, but lists only those file systems that currently need to be backed up. |
z compression level | Compresses each block that is backed up using the zlib library. |
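To show how the level, f, and u options combine, here is a sketch of a weekly schedule: a level 0 backup on Sunday followed by a higher level on each later day. The script only prints the dump command it would run. The tape device /dev/st0 and the /home file system are placeholder assumptions, and the helper name dump_command is hypothetical.

```shell
#!/bin/sh
# Sketch only: prints the dump command instead of running it.
# /dev/st0 and /home are placeholder values, not taken from the text.
TAPE=/dev/st0
FILESYSTEM=/home

# Build the dump command for a weekday number (0 = Sunday ... 6 = Saturday):
# a full (level 0) backup on Sunday, then an increasing level each day.
dump_command() {
    level=$1
    # u records the backup in /etc/dumpdates, f names the output device.
    echo "dump -${level}u -f $TAPE $FILESYSTEM"
}

today=$(date +%w)             # weekday as a number, Sunday is 0
dump_command "$today"
```

Because each day uses a higher level than the day before, every run backs up only the files changed since the previous day's backup.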
Restoring Files Using Restore
The restore command is used to restore files that have been backed up using the dump command. The restore command works in several modes. These are:
-
Comparison: In this mode, the restore command only compares the original files on the system with the files within the backup created by the dump tool.
-
Interactive: In this mode, you can interactively restore files from the dump using a shell-like interface provided by restore. The interface enables the end user to navigate through a directory and select the files to be extracted from the backup.
-
Quick File Access: In this mode, a Quick File Access file is created from an existing dump file without restoring its contents. This file enables you to locate a particular file without reading the entire backup.
-
Restart Full: In this mode, the restore command restarts a full restore operation on a particular tape of a multivolume set. The restart full restore mode is useful if the restore operation has been interrupted.
-
List: In this mode, the restore command lists the names of the specified files if they are contained in the backup. If no files are specified, the restore command lists the entire contents of the backup.
Using the Restore Command in the Comparison Mode
In the comparison mode, the syntax for the restore command is:
restore -C [-cklMvVy] [-b blocksize] [-D filesystem] [-f file] [-F script] [-L limit] [-s fileno]
[-T directory]
The parameters of the syntax are listed in Table 2-5-18:
Option | Description |
---|---|
-c | Disables the dynamic checking of the backup and reads a backup made using the old filesystem format. By default, the restore command checks the format of the files that were backed up. |
-k | Uses Kerberos authentication to communicate with the remote tape server. |
-l | Specifies that, in a remote restoration procedure, the remote file is a regular file instead of a tape device. This option is required when restoring a remote compressed file. |
-M | Enables the multivolume feature for restoring backups made using the -M option of dump. |
-v | Lists the name of the file and its type after restoring each file. |
-V | Enables reading multivolume media other than tapes, such as CD-ROMs and floppy disks. |
-y | Continues processing the restore command even if errors are encountered. |
-b blocksize | Specifies the number of kilobytes per record. If not specified, the restore command automatically determines the blocksize. |
-D filesystem | Specifies the name of the file system to be used when checking the file system. |
-f file | Reads the backup from a file, such as a tape drive or a disk drive. |
-F script | Executes a script at the beginning of each tape. This script is used to determine whether a new tape needs to be inserted into the device for the restoration to continue. |
-L limit | Specifies the number of mismatches that can occur when comparing files. When this number is exceeded, the restore procedure is aborted and an error message is shown. The default value is 0, which disables the check. |
-s fileno | Reads from the specified file number on a tape that contains multiple files. |
-T directory | Specifies a directory for temporary files. The default directory is /tmp. |
Using the Restore Command in the Interactive Mode
In the interactive mode, the syntax for the restore command is:
restore -i [-achklmMNuvVy] [-A file] [-b blocksize] [-f file] [-F script][-Q file] [-s fileno]
[-T directory]
The parameters of the syntax are listed in Table 2-5-19:
Parameter | Description |
---|---|
-a | Reads all the volumes of the backup starting with the first volume. |
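-A file | Reads the table of contents from the specified archive file instead of from the backup media. |
-h | Extracts the directory itself rather than the files it references. This prevents the hierarchical restoration of complete subtrees. |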
-m | Extracts the files by inode numbers instead of by file name. Inode numbers are system-assigned identifiers that uniquely identify a file. |
-N | Performs a full restore operation without writing any file on the disk. |
-u | Removes (unlinks) existing files before new files are created. |
-Q file | Uses the Quick File Access file to read the tape. This can be used when restoring from local or remote tapes. |
The interactive mode of the restore command also supports interactive commands. These commands are listed in Table 2-5-20:
Command | Description |
---|---|
add | Adds a file or a directory to a list of files to be extracted. |
cd | Changes the current directory being viewed within the backup. |
delete | Deletes a file or a directory from the list of files to be extracted. |
extract | Extracts marked files and directories from the backup to the system. |
help | Lists a summary of the available commands. |
ls | Lists the contents of the working directory. |
pwd | Prints the complete path of the current working directory of the backup. |
quit | Exits the interactive mode of the restore command. |
setmodes | Sets the mode, time of the previous modification, and owner for the directories that have been added to the list of files to be extracted. It does not extract the files. |
verbose | Shows information pertaining to each file that is extracted from the backup during the extraction. |
Using the Restore Command in the Quick File Access, Restart Full, and List Modes
In the Quick File Access mode, the syntax for the restore command is:
restore -P file [-achklmMNuvVy] [-A file]
[-b blocksize] [-f file][-F script] [-s fileno]
[-T directory] [-X filelist] [file ...]
In this syntax, the [-X filelist] parameter reads the names of the files to be listed or extracted from the text file filelist, in addition to any files named on the command line. The remaining arguments are as listed in Table 2-5-18 and Table 2-5-19.
In the restart full mode, the syntax for the restore command is:
restore -R [-cklMNuvVy] [-b blocksize] [-f file]
[-F script] [-s fileno][-T directory]
The parameters of the syntax are listed in Table 2-5-18 and Table 2-5-19.
In the list mode, the syntax for the restore command is:
restore -t [-chklMNuvVy] [-A file] [-b blocksize]
[-f file] [-F script][-Q file] [-s fileno]
[-T directory] [-X filelist] [file ...]
The parameters of the syntax are listed in Table 2-5-18 and Table 2-5-19.
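The relationship between the restore modes and their options can be summarized in a small sketch. The helper restore_command below is hypothetical: it only prints the command line for a chosen mode and does not run restore. The device /dev/st0 in the usage lines is a placeholder. The Quick File Access mode is left out because its -P option takes a file name of its own.

```shell
#!/bin/sh
# Sketch only: prints the restore command for a mode instead of running it.
# The archive or device path is whatever placeholder the caller passes in.
restore_command() {
    mode=$1
    archive=$2
    case "$mode" in
        compare)     flag=-C ;;   # compare the backup with files on disk
        interactive) flag=-i ;;   # browse the backup and pick files
        list)        flag=-t ;;   # list the contents of the backup
        restart)     flag=-R ;;   # resume an interrupted full restore
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "restore $flag -f $archive"
}

restore_command list /dev/st0
restore_command compare /dev/st0
```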
System Logs
Logs are files that record the activities taking place in the system. Logs that record the activities of programs and applications running in the system are called application logs. Logs that are created and maintained by Linux, and that record the activities of the system services, are called system logs. For example, the Apache Web server creates logs to track the activities of end users using its services, such as Web site hosting. Linux creates a log to track the activities of the Web server as well as the end users.
System logs are automatically created to maintain an updated record of the activities of various programs and daemons. They provide a near real-time indication of how the system is working and, by default, cannot be altered by ordinary users. Scanning through log files helps you quickly identify the source of problems. As a system administrator, you should view the relevant log files whenever you troubleshoot.
You can customize the size and location of logs on the system. By default, log files are stored in the /var/log directory. The actions or activities of the services are sent to the log files as messages. Instead of storing them in a single, large log file, Linux organizes the messages into smaller files based on the service that sent the message.
The main system logs on a typical Linux installation are:
-
/var/log/boot.log: Messages from the bootup sequence are listed according to date and time. The most recent bootup messages are listed near the end of the file.
-
/var/log/cron: Provides information about tasks that were started and the corresponding time. The cron command is used to schedule tasks.
-
/var/log/dmesg: Messages from or about devices. This is useful for identifying and solving hardware problems.
-
/var/log/maillog: Records the activity of the mail system.
-
/var/log/messages: Records general system messages, including error messages.
-
/var/log/secure: Records logons. You can track the activities of the end users to detect any unauthorized access to the system.
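Not every distribution creates all of these files, so it can help to check which ones are present. The following sketch defines a hypothetical helper, check_logs, that reports on each name from the list. It assumes the boot log is spelled boot.log, as is usual on Red Hat-style systems, and it defaults to the /var/log directory.

```shell
#!/bin/sh
# Sketch only: reports which of the listed log files exist in a directory.
# The directory defaults to /var/log; pass another path to look elsewhere.
check_logs() {
    logdir=${1:-/var/log}
    for name in boot.log cron dmesg maillog messages secure; do
        if [ -f "$logdir/$name" ]; then
            echo "present: $name"
        else
            echo "missing: $name"
        fi
    done
}

check_logs
```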
Viewing Log Files Using Commands
Log files are text files that can be opened and viewed using various commands. To view system log files, you must log on as superuser. The system log files are stored in the /var/log directory. You can use numerous editors, such as EMACS or VI editor, to view the log files. To scroll through a particular log file, use the less command. For example, to scroll through the messages file, the command is:
less /var/log/messages
The less command enables you to scroll through the log file using the up and down arrow keys and the Page Up and Page Down keys.
To view the most recent entries made in a log file, you need to scroll to the end of the file. At times, the file may be extremely lengthy. You can use the tail command to view the last ten lines of the log file, as:
tail /var/log/messages
To view the first ten lines of a file, use the head command as:
head /var/log/messages
To view a different number of lines, use the -n option with the head or tail command as:
tail -n 25 /var/log/messages
| Note | The head and tail commands do not support scrolling up or down the log file. They are used only for a preview of the log files. |
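The behavior of head and tail is easy to confirm on a file whose contents are known. This sketch generates a 40-line sample file rather than reading a real system log. The entry text is made up for illustration.

```shell
#!/bin/sh
# Sketch only: builds a 40-line sample log so the commands have known input.
sample=$(mktemp)
i=1
while [ "$i" -le 40 ]; do
    echo "entry $i" >> "$sample"
    i=$((i + 1))
done

first=$(head "$sample")            # first ten lines by default
last=$(tail "$sample")             # last ten lines by default
last25=$(tail -n 25 "$sample")     # last 25 lines

echo "$last" | head -n 1           # prints "entry 31"
echo "$last25" | head -n 1         # prints "entry 16"
```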
Viewing Log Files Using the System Log Viewer
You can view the system logs using the GUI tool, System Log Viewer, which is available with the Mandrake Linux distribution. To start the System Log Viewer application:
-
In the K Desktop Environment (KDE), click the Start Application button, K->Applications->Monitoring->System Log Viewer. An Input dialog box appears, as shown in Figure 2-5-9:
-
Specify the root password in the Password for root text box and click OK. The System Log Viewer window appears, as shown in Figure 2-5-10:
-
To view a different system log file, click File->Open Log option. This opens the Open new logfile dialog box, as shown in Figure 2-5-11:
-
Select the log file to be opened and click OK. This opens the selected log file.
To monitor a particular log file:
-
Click File->Monitor option in the System Log Viewer window shown in Figure 2-5-10.
-
Click Monitor. This opens the Monitor options window, as shown in Figure 2-5-12:
Figure 2-5-12: The Monitor Options Window
-
Select the log file to be monitored and click OK. This opens the Monitoring logs dialog box, as shown in Figure 2-5-13:
To view log statistics, click the View->Log Stats option in the System Log Viewer window shown in Figure 2-5-10. This opens the Log stats dialog box, as shown in Figure 2-5-14:
Maintaining logs involves regularly deleting some records from the log files so that they do not get too large. You can open the log file in the VI editor and scroll to the top of the log file. To delete the oldest rows, use the command:
Esc number_of_lines dd
The number_of_lines parameter specifies the number of lines to be deleted. You can undo any changes you make to the log file using the Esc u command.
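The same trimming can be done without opening an editor. The sketch below uses tail rather than the VI editor, so it is an alternative technique and not the procedure described above. The helper name trim_oldest is hypothetical, and the script works on a temporary sample file instead of a real log.

```shell
#!/bin/sh
# Sketch only: trims a sample file, not a real system log.
# Removes the oldest $2 lines from the file named in $1.
trim_oldest() {
    file=$1
    count=$2
    tmp=$(mktemp)
    # tail -n +K prints from line K onward, so start just past the lines to drop.
    tail -n +"$((count + 1))" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

log=$(mktemp)
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > "$log"
trim_oldest "$log" 3
cat "$log"
```

Writing the kept lines back with cat, instead of replacing the file, preserves the ownership and permissions of the original log file.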