Backing up files with the XCOPY command

Introduction

You already know why you need to have a backup strategy: you've already lost data when your disk failed. Now you need to know how to implement a backup strategy that will work for you.


What are the issues with backups?

First, how much new data do you create, and how often do you create it? Second, how long will it take to recreate the lost data when your disk drive dies again: (1) if you don't have a backup, and (2) if you do have a backup?

You also want to decide on the backup media: floppy diskettes, CDs, DVDs, a network drive, a removable hard drive, a USB flash drive, tape media, and so on. Important factors to consider are speed, price, portability, ease of use, and flexibility.

Consider also the kind of backup: full backup, incremental backup, or differential backup. These distinctions are not obvious without an explanation. A full backup backs up all of the selected files unconditionally. An incremental backup only backs up selected files if they have been modified since either the last full backup or the last incremental backup. A differential backup backs up all the selected files that have been modified since the last full backup or incremental backup, but the differential backup doesn't reset the backup flag, so they are backed up again in the next differential backup. For a more detailed explanation, see "3. The kind of backup" below.

The backup program that is to be used is the next factor to consider. The choice of program will affect all the other factors.

The storage location of backup media is also very important.

All of the above factors come into play, and will be briefly explained below. But first, a recommendation:

For Microsoft (R) Windows (R), see "Using the XCOPY command" below.


1. How much new data do you create

Let's consider how much data you generate and might possibly want to restore. Suppose that you work a 40-hour work week but you don't use your computer to create new data very often in your work. If the amount of new data you create is small and is also easily recoverable by data entry from paper records you may have, a simple monthly backup may suffice.

If you routinely create new, important documents or update existing documents, a weekly and daily backup in addition to a monthly backup is a good idea. For those who fall between the extreme cases of very few data updates versus frequent new, important data, maybe a monthly and weekly backup will work.

The key on frequency of backup is the amount of loss you can endure without making a big impact on your work when your disk drive dies ("when", not "if"). Figure out how much loss you can live with. Decide accordingly.


2. Decide on the backup media

Now, how hard is it to hook up the backup media to the computer? How long do you need to keep the backup media? How fast do the backup and, equally important, the restore program run? The only method that will be proposed in this document is using an external USB hard drive. That is not the best solution for every need, but it's the best for many situations. It's fast, portable, easy to connect to a computer, and fairly stable in the long term. But there are other methods that are better in some situations than external USB hard disks. You can get assistance figuring out other methods by emailing or calling Gene Wiggins Consulting.


3. The kind of backup

With a backup program you decide what files should be considered for being backed up. Then you use some criteria to make the final decision of whether or not to actually back up each of the candidate files. The first part, deciding what files should be considered, is straightforward. There's no reason to back up operating system files. They change between versions of the operating system and between different computers that have ostensibly the same operating system. But data that you create yourself, such as documents containing letters to your clients, spreadsheets that analyze financial information, and other high-value data you create yourself, should be candidates for backups.

The second part of the selection is to establish some criteria to distinguish between files that need to be backed up and those that don't need to be backed up. Huh? Isn't that what was just discussed? No. This second criterion will almost always be whether or not the file has already been backed up since the last time it was modified. You won't need to back up a file again if you haven't changed it since the last time it was backed up.

Fortunately, the operating system keeps track of when each file was last modified, and whether or not the file has been modified since it was last backed up. The backup programs can look at each of the files that have been identified as candidates for backup and decide if they have been modified since the last time they were backed up, then elect to back up only the individual files that have changed.
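
As a rough illustration of how a program can read that flag: on Windows, Python exposes a file's attribute bits through os.stat, and the backup flag (called the "archive" flag, explained under Full backup below) is one of those bits. The two helper names here are made up for this sketch. On systems other than Windows the attribute field doesn't exist, so the second function returns None there.

```python
import os
import stat


def archive_bit_set(attributes: int) -> bool:
    """Return True if the archive bit is on in a Windows attribute value."""
    return bool(attributes & stat.FILE_ATTRIBUTE_ARCHIVE)


def modified_since_backup(path: str):
    """Return True/False from the file's archive flag, or None if unavailable.

    st_file_attributes only exists on Windows, so other systems get None.
    """
    attributes = getattr(os.stat(path), "st_file_attributes", None)
    if attributes is None:
        return None
    return archive_bit_set(attributes)
```

A backup script could call modified_since_backup() on each candidate file and skip the ones that return False.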


  3a. Full backup

A full backup unconditionally backs up each candidate file (without respect to the last time it was modified, and without respect to whether it was modified since the last backup). Then the backup program turns off the backup flag for the file. In Windows (R) the flag is called the "archive" flag. Turning off the archive flag ensures that the file doesn't get backed up again by either an incremental or differential backup procedure unless it changes. Note that the full backup doesn't depend on the state of the archive flag, but does turn it off, whether it was on or off before the backup.

Full backups take more storage space than incremental or differential backups because in a full backup, every candidate file is selected then backed up even if it has been backed up since it was last modified. The advantage of a full backup is that it gives a baseline for future incremental or differential backups.

If a disk drive fails immediately after a successful full backup, the data can be loaded onto a new disk drive and put to work immediately. Only one restore operation (of the data) is required. No additional restorals (other than operating system files, if the data files are stored on the operating system disk) are needed. One operation restores the data files. When there are incremental and/or differential backups, the full backup has to be restored, then any required incremental or differential backups have to be restored.

Why not do a full backup each time? On a computer with lots of files, it can take an hour or longer to do a full backup. Also, the storage space needed for a full backup can be very large.


  3b. Incremental backup

An incremental backup looks at each candidate file to see if the archive flag (see Full Backup above for an explanation of the archive flag) is set. If it is set, the file gets backed up. Additionally, the incremental backup program sets the archive flag off for the files that it backs up. Setting the archive flag off prevents the file from being backed up again by an incremental backup (or by a differential backup) program. The archive flag will get set on only if the file changes again, or if a special program turns it on.

The incremental backup is typically used as an adjunct to a full backup. A common backup scenario for low volume computer usage is to perform a full backup monthly then do an incremental backup weekly. If disaster strikes, no more than the most recent week of work will be lost. So if it's the 27th of the month, and you have the most recent full backup, plus three incremental backups, you can recover your data by restoring from the full backup, then restoring each of the three incremental backups.

The reason that all three incremental backups are needed is that files in the first weekly incremental backup are not backed up again in the second incremental backup unless they changed. They were backed up the first week, then flagged as having been backed up, so they don't get backed up in the second incremental backup. Likewise, the third incremental backup doesn't include any unchanged files that were backed up in the first or second incremental backup.

Clearly, restoring a full backup and a set of several additional incremental backups is time-consuming. That's why differential backups are sometimes used. But incremental backups have a definite use.

A second benefit of incremental backups is that a set of files can be restored to the state they were in at a particular point in time. If a project changes significantly, the work done since a specific point in time may be irrelevant. Data files can be restored to their condition at the time of the full backup. Then incremental backups can be restored in sequence up to the date when the project changed.

One disadvantage of an incremental backup is that several sets of backup media must be maintained. The full backup must be kept, regardless of the backup strategy. Then all the incremental backups since the last full backup must be kept available. In many cases this is an onerous burden; in others, it is not.
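
The restore sequence described above (full backup first, then each incremental in order, optionally stopping at a chosen point in time) can be sketched as follows. This is only a toy model: each backup set is a Python dictionary of hypothetical file names and version labels, and deleted files are not modeled.

```python
def restore(full_backup, incrementals, up_to=None):
    """Replay a full backup, then the first `up_to` incremental backups in order.

    With up_to=None every incremental backup is applied.
    """
    restored = dict(full_backup)          # start from the baseline
    for backup in incrementals[:up_to]:   # later sets overwrite earlier versions
        restored.update(backup)
    return restored


# Hypothetical monthly full backup and three weekly incremental backups.
full = {"plan.doc": "v1", "costs.xls": "v1"}
weekly = [{"plan.doc": "v2"}, {"costs.xls": "v2"}, {"plan.doc": "v3"}]

print(restore(full, weekly))           # latest state: all three incrementals applied
print(restore(full, weekly, up_to=1))  # state as of the first weekly backup
```

Leaving out any one of the weekly sets would silently restore an older version of whatever files it held, which is why every set has to be kept.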


  3c. Differential backup

The differential backup is also a useful method. It, of course, depends on a full backup having been done. Furthermore, it is used with the assumption that only differential backups will be used in addition to the full backups.

With a differential backup, only those files with the archive flag (see Full Backup above for an explanation of the archive flag) set are backed up. So far, this is the same as with an incremental backup. But the difference with the differential backup is that after the files are backed up, the archive flag is not set off. In other words, the file will again be backed up by the next backup procedure.

If a differential backup strategy is used, the backup media will use two backup sets: the full backup, and the most recent differential backup. If things have been done properly, that is to say, no incremental backups have been performed intermixed with full and differential backups, only two sets of backup media are required to restore the files. Specifically, with the differential backup approach, to restore to the most recent files, it is necessary to restore the most recent full backup and the most recent differential backup.

The advantage of the differential backup is that fewer backup sets are needed: the full backup and the most recent differential backup. The disadvantage of differential backups is that every file modified since the last full backup gets backed up every time a differential backup is performed. That's because the differential backup program doesn't ever set the archive flag off. It usually takes more time than the incremental backup because the number of files with the archive flag set increases without getting reset in the differential procedure. On a day-to-day basis (assuming that backups are done daily) the backup takes more time for a differential backup. Every file changed since the last full backup is backed up every day, plus any new files created since the previous day's differential backup also get backed up. With an incremental backup, only those changed since the last full backup OR last incremental backup get backed up.
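
To see the three kinds of backup side by side, here is a small model of the archive flag. It is a sketch only: the "disk" is a dictionary of hypothetical file names mapped to True (flag set) or False (flag off), and nothing is actually copied.

```python
def full_backup(flags):
    """Copy every candidate file, then turn every archive flag off."""
    copied = sorted(flags)
    for name in flags:
        flags[name] = False
    return copied


def incremental_backup(flags):
    """Copy only flagged files, and turn their flags off afterwards."""
    copied = sorted(name for name, flagged in flags.items() if flagged)
    for name in copied:
        flags[name] = False
    return copied


def differential_backup(flags):
    """Copy only flagged files, and leave their flags on."""
    return sorted(name for name, flagged in flags.items() if flagged)


# Hypothetical file names; True means the archive flag is set.
flags = {"letter.doc": True, "budget.xls": True, "notes.txt": True}

print(full_backup(flags))          # all three files
flags["letter.doc"] = True         # the file is edited, so the flag comes back on
print(differential_backup(flags))  # letter.doc
print(differential_backup(flags))  # letter.doc again, because the flag stayed on
print(incremental_backup(flags))   # letter.doc, and the flag is now off
print(incremental_backup(flags))   # nothing left to copy
```

The last two lines also show why incremental backups shouldn't be mixed into a differential scheme: once an incremental backup clears the flag, the next differential backup no longer sees that file.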


  3d. Deciding between incremental and differential backups

The decision between incremental and differential backups is often based on the length of time to do backups on a daily basis versus the length of time it will take to recover files at the time of reinstalling the files on a new disk drive (or reinstalling on an existing drive). If you create 10 new files each day and also modify 10 existing files daily (not the same as a recently created "new" file), and you do a backup daily, the amount of time required to perform the backups will vary by backup type.

With an incremental backup, each day, you'll back up 20 files: the ten files that already existed and the ten new files created that day. With a differential backup, you'd back up the ten files that already existed, plus the ten new files from that day, the ten files from the previous day, the ten files from two days previous, the ten files from three days previous, and so on. With an incremental backup, it takes the same amount of time to back up the files from day to day. With a differential backup, it takes longer each day to do the day's backup.

When it comes time to restore files, the opposite condition exists. The full backup has to be restored, regardless of whether incremental or differential backups are done. Then the additional restorals are applied. For incremental backup sets, EVERY backup set must be applied because the most recent incremental backup only contains 20 files. On the other hand, with a differential backup, the full backup and ONLY the MOST RECENT differential backup set must be restored.
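
The trade-off can be put into numbers. The sketch below uses the example figures from above and adds two assumptions of its own: the full backup is done on day 0, and the ten existing files modified each day are the same ten files.

```python
NEW_PER_DAY = 10       # new files created each day (from the example above)
MODIFIED_PER_DAY = 10  # existing files changed each day (assumed: the same ten)


def incremental_files(day):
    """Files copied by the incremental backup on a given day."""
    return NEW_PER_DAY + MODIFIED_PER_DAY


def differential_files(day):
    """Files copied by the differential backup on a given day."""
    return MODIFIED_PER_DAY + NEW_PER_DAY * day


def incremental_sets_to_restore(day):
    """The full backup plus one incremental set per day."""
    return 1 + day


def differential_sets_to_restore(day):
    """The full backup plus the most recent differential set."""
    return 2


for day in range(1, 6):
    print(day,
          incremental_files(day), differential_files(day),
          incremental_sets_to_restore(day), differential_sets_to_restore(day))
```

By day 5 the incremental backup still copies 20 files while the differential backup copies 60, but a restore needs 6 backup sets with the incremental approach and only 2 with the differential approach.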

Both the incremental and differential backups require time to perform. The key is this: do you want to spend more time each day backing up your files, but less time if you have to do a restore, or do you want to spend less time daily, but spend more time when a restore operation is needed? That will determine which method you will use for daily (presumably) backups. Both are good. Both have advantages and disadvantages. If you need help figuring out which is better for YOUR situation, contact Gene Wiggins Consulting.


4. Backup program that is to be used

You can use any of a large selection of backup programs. They vary in complexity and usefulness. For Microsoft (R) Windows (R) operating systems, be careful with the backup program that is provided free with the operating system. Every time Microsoft updates their operating system, they ship a different, incompatible version of the backup program. Their program integrates nicely with their operating system, but is almost always unusable in the next version of their operating system.

Fortunately, Microsoft includes the XCOPY program in every version of their operating system. And fortunately, the XCOPY program works with files backed up with XCOPY from any version of their operating system, starting with DOS 3.2, continuing through DOS 3.3, DOS 4.0, DOS 5.0, DOS 6.0, Windows 1.0, 2.0, 2.11, 3.0, 3.1, 3.11, NT 3.51, NT 4.0, Windows 95, Windows 98, Windows ME, Windows 2000, Windows XP, and Windows Vista. The file sets created with every version of XCOPY also work with Unix, Xenix, Linux, Ultrix, Mac OS, CP/M, and virtually every other operating system that can figure out how to read a disk with files on it.

One disadvantage of XCOPY is that it doesn't catalogue backup sets. That is to say, the XCOPY utility doesn't automatically keep track of the location of each file backed up like many of the other backup programs do. It can use multiple backup sets, even diverse kinds of media for a backup ... if you want to put the effort into creating them. But you'll have to figure out where a specific file is if you only want to restore one file on a backup set.
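
Because an XCOPY backup is just ordinary files in ordinary folders, a simple catalogue can be produced after the fact. The sketch below is one possible approach, not a feature of XCOPY: it walks a backup folder and writes one line per file. The function names and the tab-separated layout are made up for this example.

```python
import os


def build_catalogue(backup_root):
    """Return a sorted list of (relative_path, size_in_bytes) for a backup folder."""
    entries = []
    for folder, _subfolders, files in os.walk(backup_root):
        for name in files:
            full_path = os.path.join(folder, name)
            relative = os.path.relpath(full_path, backup_root)
            entries.append((relative, os.path.getsize(full_path)))
    return sorted(entries)


def write_catalogue(backup_root, catalogue_path):
    """Write one 'size<TAB>path' line per backed-up file."""
    with open(catalogue_path, "w", encoding="utf-8") as out:
        for relative, size in build_catalogue(backup_root):
            out.write(f"{size}\t{relative}\n")
```

Keeping one catalogue file per backup set, stored somewhere other than the backup folder itself, makes it possible to search for a file name without connecting each piece of media in turn.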

There is another potential disadvantage of XCOPY: XCOPY doesn't break large files across individual media. This means that if you have a file that's bigger than the disk onto which you're writing it, XCOPY won't automatically break the large file into smaller chunks so that it can be spread across several disks (CDs, for example). In a few instances this is an important consideration. In others it isn't. If you have files larger than the capacity of the individual media units (650 Megabytes for a CD), you'll need to use a stand-alone program to break large files into smaller pieces, or use a different backup program.
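
For illustration, here is roughly what such a stand-alone splitting program does. This is a minimal sketch with invented function names and an invented ".000, .001, ..." naming scheme. It reads each piece into memory in one go, so a real tool would copy in smaller blocks.

```python
def split_file(path, chunk_size):
    """Split path into path.000, path.001, ... of at most chunk_size bytes each."""
    parts = []
    with open(path, "rb") as source:
        index = 0
        while True:
            chunk = source.read(chunk_size)
            if not chunk:          # end of file reached
                break
            part_path = f"{path}.{index:03d}"
            with open(part_path, "wb") as part:
                part.write(chunk)
            parts.append(part_path)
            index += 1
    return parts


def join_files(parts, destination):
    """Rebuild the original file by concatenating the parts in order."""
    with open(destination, "wb") as out:
        for part_path in parts:
            with open(part_path, "rb") as part:
                out.write(part.read())
```

Each piece can then be copied to its own disk, and join_files() reassembles them at restore time, provided the pieces are supplied in their original order.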

Third party backup programs are available too. They tend to be more expensive than the Microsoft-provided programs (free). Some are wonderful, others are hard to use. If you decide on a third party backup and restore program, research it well. Find out how it will fit into your situation. Figure out how long it takes to back up files, and what sort of machinations are needed to restore files from backup sets. If you need help, contact Gene Wiggins Consulting.


5. Storage location of backup media

Ever hear of Hurricane Katrina? That was a natural disaster. Companies that faithfully backed up their data and stored their backup media in New Orleans found that they couldn't use their "submersible" backup data because it was destroyed. Companies that kept backup media in another geographic region that wasn't flooded were able to set up computers in other locations and restore their data from backup file sets that were kept away from their New Orleans offices. Those were the smart guys.

Natural and unnatural disasters do happen. Fires destroy buildings. Sometimes backup media gets destroyed in fires too. Backup media stored in Minneapolis, London, Calcutta, Beijing, or Bangkok is just as vulnerable to fire or flooding as media stored in New Orleans. If your backup data is truly important, you ought to store more than one copy of the backup media, in more than one geographic region. If your Moscow office burns down along with its backup data, it would be nice to be able to get a copy of your data from your Tegucigalpa office.

Obviously, it's important to have off-site backups. Your office may get destroyed. On the other hand, you might have a simple hard disk failure without your whole office being wiped out ... physically. For maximum protection, it makes sense to keep a backup set in your main office and in two geographically separate offices. The local backup set ensures that you can make a fairly quick restore if your disk drive simply fails. But if your office burns down, your data can be retrieved from your Johannesburg office or your Havana office so you can get your business back up and running in a relatively short time.


6. Extra considerations

There are some issues from classic backup procedures that ought to be discussed here. Instead, they'll be mentioned in passing. If you want to discuss these topics, contact Gene Wiggins Consulting.

Verification of backup and restore. If tape media or other fragile media is used for backup storage, it's wise to verify the data backed up and restored. This is done by comparing the backed up data against the original files. For a restore, it's done by comparing the restored data against the backup media.
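
One way to do that comparison is to compute a checksum of each original file and of its copy. The sketch below is only an illustration with invented function names. It assumes the backup folder mirrors the folder layout of the source, as an XCOPY backup does, and it uses SHA-256 from Python's standard library.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 checksum of a file, read one megabyte at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_backup(source_root, backup_root):
    """Return relative paths that are missing from the backup or differ from the original."""
    problems = []
    for folder, _subfolders, files in os.walk(source_root):
        for name in files:
            original = os.path.join(folder, name)
            relative = os.path.relpath(original, source_root)
            copy = os.path.join(backup_root, relative)
            if not os.path.isfile(copy) or file_digest(original) != file_digest(copy):
                problems.append(relative)
    return sorted(problems)
```

An empty list means every original file has an identical copy in the backup. To check a restore instead, swap the roles: pass the backup folder as the first argument and the restored folder as the second.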

Periodic testing of media. Backup media is itself not indestructible, so it's wise to periodically test the media by performing a restore and verification. If there are many errors, it's time to recover the data to a different location and back it up again to prevent data loss.

Files open during backup. Files that are open at the same time the backup is being done may not get properly backed up. Even if they do get backed up properly, when it comes time to verify the backed up data against the original files, the verification fails. That's because the files were still open during the backup, and changed after they were backed up.

ACLs and ownership. Files in modern file systems (NTFS in Windows NT, 2000, XP and Vista) belong to some user and group. In addition to ownership, some files may be given an Access Control List (ACL). If the data is backed up onto a file system other than NTFS that does not support these features, the ACLs and ownership information may be lost. It might be necessary to adjust file properties after restoral.

Encrypted File System (EFS). The NTFS file system supports encryption via EFS. EFS is dependent on the password of the file's encrypter for decryption. If an encrypted file is restored to a computer without the proper credentials, the encrypted files will be unusable.

Encryption of backups. Some people have sensitive data that needs to be encrypted. Backup sets can be encrypted, even separate from the EFS. It's the responsibility of the data archiver to implement encryption if it's needed.

Compression. It's possible to use compression to save space on backup media. Some backup procedures allow compression. If the compression is not compatible with the restoral program, for example, if the data is being restored to an operating system that can't read the particular compression, then the data in the backup will be unusable.

Marking the media. This may seem obvious, but bears pointing out. Be sure to mark the backup media clearly so that it can be identified easily.

Folders to back up. Each version of Windows has a slightly different way of organizing folders (directories). You need to make sure that you designate all the folders to back up that are important to you. For example, Outlook Express saves email files in a folder other than "My Documents" for the user. Be sure to back up all the folders you need to back up. The same goes for bookmarks, and other application data.


7. Using the XCOPY command

For Microsoft operating systems, the XCOPY utility is wonderful. It implements every important feature of backups except cataloguing and saving files that won't fit on a single backup medium. Here are the command lines you need to use. You can invoke them in a variety of ways explained later in this document.

Command lines (2) for FULL backup:

cmd.exe /k xcopy /c /e /f /h /i /s /v /y /z "source" "destination"
cmd.exe /c attrib -a "source\*.*" /s

Command line for INCREMENTAL backup:

cmd.exe /k xcopy /c /e /f /h /i /m /s /v /y /z "source" "destination"

Command line for DIFFERENTIAL backup:

cmd.exe /k xcopy /c /e /f /h /i /a /s /v /y /z "source" "destination"

Note that the "/k" can be changed to "/c", which will cause the backup window to close automatically instead of staying open after the backup is done, as happens with the "/k" command line switch.

The "source" and "destination" are folders on the disk drives: the files to be copied are in the source folder, and they are backed up to the destination folder. It is useful to enclose the folder names in quotation marks as shown, because in Windows, spaces are often used in folder names, and not using quotation marks confuses the XCOPY utility.

Here's a more realistic command line for an incremental backup:

cmd.exe /k xcopy /c /e /f /h /i /m /s /v /y /z "C:\Documents and Settings\Mortimer Snurd\My Documents\*.*" "F:\Backups\MSnurd"


Explanation of command lines

The first command lines, for the FULL backup, are used as follows. First, the "cmd.exe" creates a command prompt window. The "/k" switch (the parts of a command that begin with a slash are called "switches") for the "cmd.exe" command causes the command prompt window to stay open after the commands are executed. This may be useful if you want to see the messages written to the command prompt window afterwards. It's not necessary to do this, though. Instead of the "/k" switch, you can use "/c" for the "cmd.exe" command. This will cause the command prompt window to close automatically after the command line is finished.

The rest of the line gets fed into the command prompt interpreter. The first line is the actual XCOPY command. This is what does the work. Notice that the "/m" and "/a" switches are not used for the full backup as they are in the incremental and differential backups.

The second line of the full backup is used to clear the archive flags of all the files. Because the XCOPY command without the "/m" switch doesn't reset archive flags, and that's how the first line was invoked, we want to be sure that we explicitly reset all the archive flags. That's what the "attrib" command does in the second line. Note that "attrib" works on the current folder unless a path is given, so either run it from the source folder or put the source folder in front of "*.*".

The command line for the Incremental Backup uses only one line. It isn't necessary to reset archive flags with the "attrib" command. Note that there is a switch "/m" on the command line. The "/m" switch for XCOPY does two things. First it specifies that only files with the archive flag set be copied. Second, it resets the archive flag after the file is backed up.

The command line for the Differential Backup also uses one line. The command line switch for the differential backup is "/a" instead of "/m". The "/a" switch for XCOPY does one thing only. It specifies that only files with the archive flag set be copied. It does not reset the archive flag, though.
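
The only difference among the three XCOPY command lines is the "/m" or "/a" switch, which a small helper can make explicit. This sketch only builds the list of arguments and doesn't run anything. The function name and the "full" / "incremental" / "differential" labels are invented for the example, and the switches are the ones shown in the command lines above.

```python
# Switches shared by all three command lines in this document.
COMMON_SWITCHES = ["/c", "/e", "/f", "/h", "/i", "/s", "/v", "/y", "/z"]

# The one switch that distinguishes the kinds of backup.
KIND_SWITCH = {"full": None, "incremental": "/m", "differential": "/a"}


def xcopy_arguments(kind, source, destination):
    """Return the XCOPY command as a list of arguments for the given backup kind."""
    if kind not in KIND_SWITCH:
        raise ValueError(f"unknown backup kind: {kind}")
    arguments = ["xcopy"] + COMMON_SWITCHES
    if KIND_SWITCH[kind] is not None:
        arguments.append(KIND_SWITCH[kind])
    return arguments + [source, destination]


print(xcopy_arguments("incremental", r"C:\Data\*.*", r"F:\Backups\Data"))
```

On a Windows machine the resulting list could be handed to Python's subprocess.run(). Remember that a full backup done this way still needs the separate "attrib" step to clear the archive flags.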


Invoking the XCOPY command

The XCOPY commands above can be activated/invoked several ways. One is to start a command line window, then type the command lines. Another is to create an icon and assign the command to the icon. Choose whichever way works best for your situation.


Restoring files from the backup media

Files backed up with some backup programs are stored in a special archival format. To restore files, the special backup program has to be used. Files backed up with XCOPY, though, do not require any special program for restoral. That's because XCOPY simply copies the files in their existing format to an alternate location. Restoral from an XCOPY backup can be done using command line COPY or XCOPY commands, or it can be done with "My Computer" or "Windows Explorer" (not the same as Internet Explorer.)

Simply locate the backed up file and copy it from the backup location to the location you want it to be. That's it!
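
The same copy can be scripted. Here is a minimal sketch, with an invented function name, that restores one file from a backup folder and recreates the subfolder it lived in. It assumes the backup folder mirrors the original folder layout, and it overwrites any file already at the target location.

```python
import os
import shutil


def restore_file(backup_root, relative_path, restore_root):
    """Copy one file from the backup folder to the same relative place under restore_root."""
    source = os.path.join(backup_root, relative_path)
    target = os.path.join(restore_root, relative_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)  # recreate missing subfolders
    shutil.copy2(source, target)  # copy2 also carries over the file's timestamps
    return target
```

For example, restore_file(r"F:\Backups\MSnurd", r"Letters\client.doc", r"C:\Restored") would copy that one letter into C:\Restored\Letters. The file name here is hypothetical.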


For additional information, contact Gene Wiggins Consulting