ukopp user guide   v. 3.8

concepts
first tryout              (1-page primer for those with RTFM problems)
toolbar buttons
file menu
backup menu
verify menu
report menu
restore menu
format menu
editing backup jobs
technical notes


License and Warranty
Ukopp is a free program licensed under the GNU General Public License V3 (Free Software Foundation). Ukopp is not warranted for any purpose, but if you find a bug, I will try to fix it.


Origin and Contact

Ukopp originates from the author's web site at: http://kornelix.squarespace.com/ukopp
Other web sites may offer it for download. Modifications could have been made.

If you have questions, suggestions or a bug to report:
kornelix@yahoo.de



Introduction


Ukopp is a Linux program for copying or backing up disk files to a separate storage device, e.g. a USB drive or SD memory card. Any disk directory may also be used as a backup location. You can select files to be copied using a GUI. You can navigate through the file system and select files or directories to include or exclude at any level in the directory hierarchy. These choices can be saved in a job file to automate recurring backups. If new files appear in an included or excluded directory, they are automatically taken into account. You need to revise the job file only if you change the directories or make new exceptions within those directories.

Ukopp copies only new and modified files: files that have not changed since the last backup are bypassed in microseconds. A typical daily backup of personal files can be done in less than a minute. Ukopp can optionally retain previous versions of backup files instead of overwriting them with newer versions. You can optionally specify the retention time and / or the number of versions to retain for each group of included files. You can see these versions in the backup directories and recover them if needed.

Ukopp has a synchronize function, which is a simple method to keep files in two computers synchronized using a USB stick or other portable memory. Ukopp copies the newest version of a file from one device to the other.

Backups can be verified three ways: full, incremental, and compare. A full verify reads all the backup files and reports any files having read errors. An incremental verify reads only those files that have been newly written by a preceding backup job. This is very fast and provides a high level of security. A compare verify reads all backup files and compares them with their corresponding disk files. This is normally not necessary, but provides an effective check that all hardware and software is working correctly.

You can report all files in a backup job, or all files in a backup directory. You can search for file names using wildcards. You can report the differences between backup files and their corresponding disk files: files that have been created, deleted, or modified since the backup was made. These reports are available in three levels of detail: a list of all changed files, total file and byte counts per directory, and overall totals.


For disaster recovery or file transfer, ukopp has a file restore capability. You can select and restore backup files to their original directories or anywhere else. Owner and permissions are also restored, even if the backup device uses a Microsoft FAT file system (initial state for USB sticks).



Concepts

The files in a backup job are specified with include and exclude records. These have filespecs with optional wildcards placed almost anywhere.

Examples:

    include /home/*                 # add all user files
    include /root/*                 # add all root files
    include /shared/*/documents/*   # add shared document files
    exclude */mp3/*                 # remove files in mp3 directories
    exclude */.Trash/*              # remove trash files

The first include adds all files owned by users in their home directories and sub-directories. The second include adds all files owned by root. The third include adds all files under the /shared top directory that also have an intermediate directory named /documents. The two exclude records remove all files within all /.Trash and /mp3 directories.

GUI interface:
The above records are normally generated using a file selection dialog.
This is documented in a following section: editing backup jobs.


File Selection Logic

    loop:
        get next include/exclude record

        if EOF, done

        if include: add all matching files to backup file set

        if exclude: remove all matching files from backup file set
    loop-end


Note that excludes are effective only against prior includes. They have no effect on following includes, which are processed afterwards. See the section on editing backup jobs.
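
The loop above can be sketched in Python as follows. This is only an illustration of the ordering rule, not ukopp's actual code: the function name select_files, the (action, pattern) record format, and the use of fnmatchcase (where "*" also matches "/") are assumptions made for the example.

```python
from fnmatch import fnmatchcase

def select_files(records, disk_files):
    """Apply include/exclude records in order and return the backup file set.

    records is a list of (action, pattern) pairs (hypothetical format).
    """
    selected = set()
    for action, pattern in records:
        matches = {f for f in disk_files if fnmatchcase(f, pattern)}
        if action == "include":
            selected |= matches      # add all matching files
        elif action == "exclude":
            selected -= matches      # remove matching files added so far
        else:
            raise ValueError("unknown record type: " + action)
    return selected

disk = [
    "/home/rosi/letter.txt",
    "/home/rosi/mp3/song.mp3",
    "/home/rosi/.Trash/old.txt",
]
records = [
    ("include", "/home/*"),
    ("exclude", "*/mp3/*"),
    ("exclude", "*/.Trash/*"),
]
print(sorted(select_files(records, disk)))
```

If the exclude records were placed before the include, they would remove nothing, and the include would then add all three files.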

Restriction: include records must include at least the first directory name (top-level) without wildcards (the GUI file-chooser does this automatically).


Retaining multiple file versions: if this option is selected, existing backup files that need updating are renamed with a version number instead of being overwritten. If the backup file "foo.bar" is updated, it is renamed to "foo.bar (1)", and "foo.bar" becomes the newest backup. If it is updated again, "foo.bar" is renamed to "foo.bar (2)", and so forth. Newer versions have higher numbers, and the unversioned file is always the current or latest version. The section on editing job files explains how to specify old version retention policies.
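
The naming scheme can be illustrated with a small Python sketch. The function next_version_name is hypothetical and only mirrors the rule described above (highest existing number plus one); it is not taken from ukopp.

```python
import re

def next_version_name(name, existing):
    """Return the name the current backup file gets before it is replaced.

    existing is the list of file names already in the backup directory.
    """
    pattern = re.compile(re.escape(name) + r" \((\d+)\)")
    numbers = []
    for entry in existing:
        match = pattern.fullmatch(entry)
        if match:
            numbers.append(int(match.group(1)))
    # newer versions get higher numbers; the first retained version is (1)
    return f"{name} ({max(numbers, default=0) + 1})"

print(next_version_name("foo.bar", ["foo.bar"]))
print(next_version_name("foo.bar", ["foo.bar", "foo.bar (1)"]))
```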

Ukopp limitations

    max. 200,000 files in a backup job (compile time constant)
    max. file retention is 9999 days and 9999 versions
    must run as root user or use sudo to copy protected files
    may need to run as root user to mount backup devices

    not useful for disk imaging (operating system backup)



Ukopp first tryout


After installing ukopp, please perform the following short exercise. This may be all you need at first. You can enhance your file security and ultimately save time if you read this whole document.  
The exercise will check that ukopp functions correctly on your system and help you become familiar with ukopp usage.
  1. Choose a backup device or directory. If using a pluggable device (e.g. USB drive), plug it in.
  2. Start ukopp: click the desktop launcher or input a terminal command:
    - if no privileges needed: $ ukopp
    - if privileges are needed: $ sudo ukopp
  3. Select button [ target ]. The drop-down list shows disk devices and their mount points if mounted. You can choose one of these, or input your chosen backup directory.
  4. Select button [ mount ] if you need to mount the backup device. Check that the selected target device/directory mounts OK.
  5. Select button [ edit job ]
  6. Erase the default backup job shown (select and delete, or use the [ clear ] button)
  7. Select the button [ browse ] at the bottom
  8. Navigate through the directories and select the directories and files to be copied
  9. Select the [ done ] button when finished selecting files
  10. Inspect the generated include and exclude records
  11. Set a verify method: choose one of none, incremental, full, or compare. Use compare until you are confident that everything is working, then speed things up later by changing to incremental.
  12. Select button [ done ] when finished editing the job
  13. If there are errors shown, select [ edit job ] and fix them (remember that exclude records must follow relevant include records - excludes are exceptions to prior includes)
  14. Select menu: Report > get disk files. Inspect the counts. Be sure the total byte count is within capacity. Look for zero counts, indicating possible errors. Re-edit if needed.
  15. Select button: [ run job ]. Backup and verify should run automatically.
    Check that the error count is zero.
  16. Save the job file if desired: menu: File > save job
  17. Select button: [ quit ]
  18. Next steps: play with the report and restore functions.

Detailed Usage Instructions


Toolbar buttons


root
This button restarts ukopp with root privileges if the password (sudo) is OK.

target
The drop-down list displays all drives that are visible to ukopp, with their mount points (if mounted) and descriptions. Choose one of these to set the target device and location for a subsequent backup. You may also type in a directory directly. This must be a valid directory for which you have write permission, and of course there should be enough space for the backup files. If an unmounted device is chosen, it will be mounted at a new directory within /media and this will be the target backup directory unless you change it. If the job file has a target device and directory specified, ukopp will attempt to mount this device and directory when the [mount] button is pressed.

edit job
Shortcut to the backup job editor (same as menu File > edit job)


run job
The current job is executed.


pause / resume
The currently running job or menu function may be paused and resumed. Use this to inspect output on the fly.


kill job
The currently running function is killed.

clear
The main window, where messages and reports are written, is cleared.


quit
Exit ukopp.




File Menu


open job

Open a previously saved backup job file for re-use (edit, run). Default location is the hidden directory /home/user/.ukopp (or /root/.ukopp).


edit job
Opens an edit dialog for the current backup job (the last job file opened, or from a prior edit). If no file has been opened, internal default data will be used as a starting point.


show job

List the current backup job data and diagnose any errors.


save job

Save current backup specs in a job file. Default is the same file that was last opened, but you may select any file. The data includes any edits that were made to the job.


run job

The current backup job is executed. Backup and verify modes are taken from the job.




Backup menu


backup
The backup job is run without verify. You can then run whatever verify you want.


synchronize

This is a bi-directional copy. Files present on one side only (disk or backup location) are copied to the other side. Files that are present on both sides will get the newest version copied to the other side. "Newest" is based on the time of the last file update.


Assume you normally use computer A, but you need to use B while traveling. You can use a portable memory device (SD card, USB stick, etc.) to keep the computer files synchronized.
  1. A and B must have identical backup job files, naming the same set of backup files.
  2. Initial synchronization: backup A, move the memory device to B, restore to B.
  3. Work with B: create and modify some files.
  4. Run synchronize on B, move the memory device to A, run synchronize on A.
  5. The modifications done on B are now carried over to A.
  6. You can update files on both A and B in parallel, as long as you work on different files between synchronizations. Synchronize A, then B, then A. Now both will have the same set of files, and these will be the newest ones present on either A or B.
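
The per-file decision made by synchronize can be sketched as below. This is a simplified illustration under assumptions: sync_action is a hypothetical name, modification times are plain numbers, and None stands for a file that is missing on that side.

```python
def sync_action(disk_mtime, backup_mtime):
    """Decide the copy direction for one file.

    Each argument is the file's last modification time, or None if the
    file does not exist on that side.
    """
    if disk_mtime is None and backup_mtime is None:
        return "nothing"
    if backup_mtime is None or (disk_mtime is not None and disk_mtime > backup_mtime):
        return "disk -> backup"      # only on disk, or disk copy is newer
    if disk_mtime is None or backup_mtime > disk_mtime:
        return "backup -> disk"      # only in backup, or backup copy is newer
    return "nothing"                 # same modification time on both sides

print(sync_action(100, None))
print(sync_action(100, 200))
print(sync_action(200, 200))
```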


Verify menu


full

All backup files are read and checked for errors.


incremental
New backup files are read and checked for errors. "New" means any files written by an immediately prior backup. Files not modified are not checked.


compare

All backup files having the same modification time and size as their corresponding files on disk are read and compared with the disk. There should be no differences. This verifies that ukopp is working correctly. Other files are read and checked, but not compared to disk.




Report menu


get disk files

The backup job include and exclude records are listed, along with the file and byte counts that are added or removed. Look for zero counts, indicating a possible error. The disk directories are read at the time this command is executed, and the list of files included in the backup job is retained in memory. This data is used to determine which backup files are now out of date and must be copied again from disk. The file list is static and is not updated by disk activity. The list of "new" files for a subsequent incremental verify is also reset.


diffs summary
Report the total number of files in each category:

    new         disk files with no corresponding backup file
    modified    both files exist, but are not identical
    deleted     backup files with no corresponding disk file
    unchanged   both files exist and are identical


Differences between the disk and the backup files may be caused by disk updates (file additions, deletions, updates, or moves), or by changes to the backup job file itself.
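
As a rough illustration, the four categories can be counted like this. The sketch assumes that each side is described by a dictionary mapping file name to (size, modification time), and that "identical" means these two values match; diff_summary is a hypothetical name.

```python
def diff_summary(disk, backup):
    """Count files per category; disk and backup map name -> (size, mtime)."""
    counts = {"new": 0, "modified": 0, "deleted": 0, "unchanged": 0}
    for name in disk.keys() | backup.keys():
        if name not in backup:
            counts["new"] += 1          # on disk only
        elif name not in disk:
            counts["deleted"] += 1      # in backup only
        elif disk[name] == backup[name]:
            counts["unchanged"] += 1    # same size and time on both sides
        else:
            counts["modified"] += 1     # both exist but differ
    return counts

disk_files = {"/home/rosi/a.txt": (10, 1000), "/home/rosi/b.txt": (20, 2000),
              "/home/rosi/c.txt": (30, 3000)}
backup_files = {"/home/rosi/a.txt": (10, 1000), "/home/rosi/b.txt": (25, 1500),
                "/home/rosi/d.txt": (40, 4000)}
print(diff_summary(disk_files, backup_files))
```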


diffs by directory
The above counts are reported for each directory having any differences between the disk and backup files.


diffs by file

List all different files, grouped in the first three categories above. If a file is present on both the disk and the backup location, and the backup file is newer than the disk file, then the file is flagged in a way that is easy to see. This can be normal if you use the synchronize function.


version summary

List backup files having old versions retained, with the range of versions and file ages (days) available. File age is days since the file was modified.


expired versions

List backup file versions that are expired and will be purged from the backup medium or location with the next backup run.


list disk files
All files in the backup file set are listed in alphabetic sequence. Use this to check that the correct files are being backed-up.


list backup files
All backup files are listed in alphabetic sequence. A summary of the space used for prior file versions is also provided.


find files
Enter a search pattern with optional wildcards (e.g. /home/dir*name/file*name).
All matching disk files and backup files are listed.


save screen

The main window, where messages and reports are written, is saved as an ordinary text file.
 


Restore menu


setup restore job

Specify the copy-from location (in the backup files), the copy-to location (disk), and the files to be restored.

The copy-from location is the topmost directory of a tree of files to be restored.

    example: /home/joeblow/documents   # backup device mount point is omitted

The copy-to location is an existing disk directory where the tree of files will be copied.

    example 1: /home/joeblow/documents
    example 2: /home/joeblow/documents/restored

In example 1, the restored files will go back to the same place they were when backed-up.
In example 2, they will go to a new place.
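
The mapping from a backup file to its restore destination can be sketched as follows. This is an illustration only: restore_destination is a hypothetical helper, the file name used is made up, and backup paths are written without the mount point, as in the examples above.

```python
from pathlib import PurePosixPath

def restore_destination(backup_file, copy_from, copy_to):
    """Return where backup_file lands on disk for a given restore setup.

    Raises ValueError if backup_file is not inside the copy-from tree.
    """
    relative = PurePosixPath(backup_file).relative_to(copy_from)
    return str(PurePosixPath(copy_to) / relative)

source = "/home/joeblow/documents/tax/2009.ods"   # made-up file name
print(restore_destination(source, "/home/joeblow/documents",
                          "/home/joeblow/documents"))
print(restore_destination(source, "/home/joeblow/documents",
                          "/home/joeblow/documents/restored"))
```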

Files to be restored are specified the same way as in a backup job (see the section below on using the file selection dialog).

If you need to restore multiple trees of files, you can do this in multiple runs, or you can simply begin the tree at a higher level and use the file selection dialog to specify multiple sub-trees, with included and excluded branches.

list restore files
After performing the file restore setup above, use this function to list all matching files that will be restored, at the locations where they will be restored. You should check this list carefully to be sure you are restoring the correct files to the intended locations.


restore files
When you are satisfied with the restore job specification, use this menu to start the restore. You will see a running log of the activity. The file owners and permissions are automatically restored, even if the backup files are on a FAT file system.



Format menu


format device

This is a convenient way to initialize a portable memory device such as a USB stick or SD card for use with ukopp. You may select the vfat (Microsoft) or ext2 (Linux) file system. You may choose from all known devices, mounted or unmounted. You may also choose a device label which will show under the device desktop icon if automatic mounting is enabled. Before format begins, you are shown which device will be formatted and given an opportunity to stop. Be sure you format the correct device, since all data on this device will be lost!


Microsoft vfat works somewhat faster than ext2 for USB devices, for reasons not clear to me. The disadvantage is that some of the strange file names typically found in Linux hidden directories are not vfat compatible and will not copy (error messages are produced and the backup job continues). Use ext2 if you must copy these files. Use vfat if you must exchange files with a Windows computer.



Editing backup jobs


The [edit job] button starts the job edit dialog. See the screenshot below.


include and exclude records

You may edit the backup job (the include and exclude records) directly in the text window. You may also use the browse button to start a file selection dialog. The dialog has the buttons [include] and [exclude]. The "show hidden" checkbox turns the display of hidden files on or off. Select one or more directories or files, using left-mouse or Ctrl+left-mouse, then press the [include] or [exclude] button. The selected directories or files will be written into the text window as include or exclude records. If you select a directory, the entry is modified to add a wildcard at the next level, e.g. selecting /aaa/bbb/ccc and then pressing [include] generates include /aaa/bbb/ccc/*.


The include and exclude records allow precise control of the backup file set, allowing you to quickly converge on the desired results:

    include /aaa/bbb/*             # include file tree under /aaa/bbb/

    exclude /aaa/bbb/ccc/*         # exception: exclude the /ccc/ subtree

    include /aaa/bbb/ccc/xxx.yyy   # exception: include file /ccc/xxx.yyy


Because of wildcards, newly added files within the scope of existing include or exclude records are automatically covered. In the above example, if a new file is added in /aaa/bbb/* then it will be automatically included in the next backup job.


old file retention policy
You may optionally enter a retention policy for old backup files. If there is no retention, a modified disk file replaces the corresponding backup file, and a deleted disk file causes the corresponding backup file to be deleted. If you wish to retain previous file versions, you must specify a retention time in days, and a retention version count. The values in the GUI dialog (days and versions) apply to each file that is selected when the [include] button is pressed. Old file versions are deleted when they are older than BOTH retain rules: if retention is D days and V versions, old file versions will be deleted only when older than D days, and not within the latest V versions. The latest version is never deleted. You can disable either of the limits by using zero (retain zero versions or zero days).

Here are some examples that will hopefully make this clear:

    retain 10 days and 3 versions: delete versions older than 10 days but not the 3 newest
    retain 10 days and 0 versions: delete all versions older than 10 days
    retain 0 days and 8 versions: delete all versions older than the 8 newest versions

If a retention policy is given, the include record in the text box has "(ddd,vvv)" appended to it, where ddd is the retention days and vvv the version count.
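
The retention rule can be expressed as a short Python sketch. It rests on assumptions that are not confirmed by ukopp itself: the function name expired_versions is hypothetical, only the numbered old versions are passed in (the current unversioned file is left out and never deleted), and the version count applies to those numbered versions.

```python
def expired_versions(versions, days, count):
    """Return the version numbers that may be deleted.

    versions is a list of (version_number, age_in_days) for one file.
    A version expires only if it is outside the newest `count` versions
    AND older than `days` days. A zero disables that limit.
    """
    newest_first = sorted(versions, key=lambda v: v[0], reverse=True)
    candidates = newest_first[count:]            # skip the newest `count`
    return sorted(num for num, age in candidates if age > days)

old = [(1, 30), (2, 20), (3, 12), (4, 5), (5, 1)]
print(expired_versions(old, days=10, count=3))
print(expired_versions(old, days=10, count=0))
print(expired_versions(old, days=0, count=2))
```

With these sample ages, the three calls correspond to the three examples given above.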

You may alternate between editing the text window and using the file-chooser dialog. When you are done, press [done] to accept. The include / exclude records will be validated to the extent possible. Re-edit to fix any problems. To change the sequence, cut and paste in the text window. When you are done, use the report functions "get disk files" and "list disk files" to verify that you have the correct files!


choose target button
This works like the [target] button on the toolbar, described above.

verify method
Choose one of the radio buttons to determine how ukopp verifies that the backup copies are free of errors.

    none          no automatic verify after backup (use the verify menu instead)
    incremental   verify all files copied by the backup job (i.e. new and modified files)
    full          read all backup files to check data integrity
    compare       full + compare all backup files to corresponding disk files (if present)

ukopp job edit dialog

Summary: The [ edit job ] toolbar button pops up the middle window. This can be edited directly: click anywhere in the text area and start writing. The right window is the choose files dialog, which is started with the browse button in the middle window. Choose files using the right window, and the middle window records your choices. You can navigate around the directory hierarchy and select any number of files or directories. The hidden button toggles the display of hidden files. Click one of the include or exclude buttons to get the selected files added to or removed from the backup list. Selecting a directory is an implied selection of all its contained files, thus the selection appears as /directory/* in the list of selected files. To make an exception, go down one level, select some files, and select the opposite include or exclude button. You can refine the file selections manually if desired. It is sometimes handy to use wildcards in the directories to make more general and compact selection criteria.
     example:   exclude /home/*thunderbird*/Trash
This would exclude trashed e-mail even if the overlying directories change (they do) and even for multiple users.


You can add comments (or disable a record) by putting # in column 1.


Annotated example of a backup job file

This is an example of what one might do to backup all personal files. In this example, we avoid backing up stuff that is not important (browser cache) or stuff that can be automatically regenerated (gnome thumbnails). Two old file versions should be retained up to 10 days. All files copied during this run should be read and verified. Files not copied (because they have not changed since the last backup) are not verified. The backup target or location is a USB disk that, when plugged-in, mounts at /media/disk (which can be changed at run time if desired).

    include /home/rosi/* (10,2)   # include Rosi's personal files
    exclude */.thumbnails/*       # omit gnome thumbnail files
    exclude */firefox/*Cache/*    # omit the browser cache files
    verify incremental            # verify files copied by each run
    target /dev/sdf1              # use removable USB disk sdf1

The above backup job can be created using the following steps:
("xxxxx" means the random directory name that firefox generates for a user)



Technical Notes


Symlink files:
starting with version 3.0, symlink files are no longer discarded, but treated like regular files. They are copied if included in the backup job. The target file of an included symlink is NOT automatically included. A target file is included only if its own file name is included in the backup job. Symlinks are verified by checking they are readable using function readlink(). If the target file system is vfat, symlinks will not copy and will be reported as errors.

Running ukopp as root: ukopp will only copy files for which the user has read access. If files belonging to root or other users are to be copied, you must run ukopp as root. Use "su" or "sudo", or log in as root (see the note below about making a launcher to handle this).


Command line arguments:

    $ ukopp -job jobfile           # load job file
    $ ukopp jobfile                # load job file
    $ ukopp -run jobfile           # load job file and run it

    $ ukopp -nogui -run jobfile    # run as batch job without window

If the jobfile name contains blanks, quotes are required, e.g. $ ukopp -job "my ukopp job"


The  -nogui  option can be used for a pure command line job that has no window and will not ask for any user inputs. This can be used for deferred execution (cron job). The backup location must be available at the time the job runs.
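
If a batch run is started from a script rather than typed by hand, the quoting issue can be avoided by passing the arguments as a list. The sketch below only builds and prints the command; batch_command is a hypothetical helper, and the options used are the ones listed above.

```python
import shlex

def batch_command(jobfile):
    """Build the argument list for a windowless ukopp run of jobfile."""
    return ["ukopp", "-nogui", "-run", jobfile]

argv = batch_command("my ukopp job")
# shlex.join shows the equivalent shell command with the needed quotes
print(shlex.join(argv))   # ukopp -nogui -run 'my ukopp job'
```

The list can be handed to subprocess.run(argv) as is, since each list element reaches the program as one argument, blanks included.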

File type association:
I suggest using the extension .ukopp for job files and specifying ukopp as the "start with" program. Then you can click on a job file and launch ukopp.


Desktop launcher:
a desktop icon / launcher may contain a command like this:

     gksu /usr/local/bin/ukopp -job myjob.job
"gksu" will ask for the root or administrator password and run the job as root.


Incremental backups:
a backup file is considered identical to its corresponding disk file if their lengths and modification times are the same. Incremental backups exclude such files. If the modification times differ by less than 1 second they are considered equal. 1 second is the time resolution for a Microsoft vfat (FAT32) file system, usually present by default on USB drives.
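
The comparison described above can be sketched with os.stat. This is a minimal illustration, not ukopp's code; the name same_as_backup and the tolerance parameter are assumptions.

```python
import os

def same_as_backup(disk_path, backup_path, tolerance=1.0):
    """True if the two files count as identical for an incremental backup:
    equal length and modification times less than `tolerance` seconds apart."""
    disk = os.stat(disk_path)
    backup = os.stat(backup_path)
    return (disk.st_size == backup.st_size
            and abs(disk.st_mtime - backup.st_mtime) < tolerance)
```

A file for which this returns True would be skipped by the next backup run. A file that is missing on the backup side raises FileNotFoundError here and would have to be treated as new.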


Restoring file owner and permissions:
A detachable drive file system may not support Linux file owner and permissions (e.g. Microsoft FAT). The ukopp backup function copies a special file to the backup location, with the data needed to restore file owner and permissions. The ukopp restore and synchronize functions use this file.


Special ukopp files:
A directory named ukopp-data is written to the backup location.
It contains the following three files:

    datetime            backup date-time
    poopfile            owner and permissions data for all files
    jobfile             a copy of the backup job file used

These are ordinary text files which you can view with an editor.

Special file types:
pipes, devices, and sockets are not copied. Symlinks are handled as described under "Symlink files" above.


Duplicate files:
If job file "include" records overlap, resulting in duplicate files in the backup set, this is reported and the backup does not proceed.


Finding disk drives: the Linux utility udevinfo is used to find block devices with the characteristics "disk". The file /etc/mtab is used to find mount points.

Removing detachable drives:
To remove a detachable drive, right click on its icon, select "unmount" or "eject" or "safely remove", and wait for the "OK to remove" message, or the LED on the drive to stop blinking. Pulling the drive out without doing this can result in data corruption or total loss.

File system cache: If the backup device was mounted by ukopp, it will be remounted after a backup and before the verify begins. This causes all the cached file data in memory to be written to the backup medium before the program can proceed. This is done to ensure that the verify function is reading from the medium and not from cache memory, which would be pointless for verifying the medium. If the backup device was already mounted, the "sync" command is used to ensure the file cache is written to disk. In either case, the verify function uses direct I/O to read files directly from the medium instead of the memory cache. This is not slower for large block I/O, and provides additional assurance that the data on the medium is valid.

NTFS:
this Windows file system can be used as a backup source, but not as a backup target. This is because the Linux driver for NTFS fails for a file open() function call with the attribute O_DIRECT, meaning direct I/O that bypasses memory caching.

Linux error codes: Linux error codes can be misleading. If an attempt is made to open a file that is already open and is therefore locked, the error text is "no such file or directory". I have noticed several such screwups in Linux. This will hopefully improve over time.

Funny file names: Disk drives formatted with the vfat file system (Microsoft FAT) will not accept some Linux file names. Notably, file names containing " : " or " ? " or ending with a blank will fail to copy, and this will be reported in the backup job. Unless you need Microsoft compatibility, format the drive with ext2, or avoid copying the weird file names you can find among the hidden files in your home directory.
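
To find such names before running a backup to a vfat device, a check along these lines could be used. It is a sketch covering only the three cases named above (a colon, a question mark, a trailing blank); vfat_problem is a hypothetical helper, and vfat rejects other characters as well.

```python
import os

def vfat_problem(path):
    """True if the file name part of path contains ':' or '?'
    or ends with a blank."""
    name = os.path.basename(path)
    return ":" in name or "?" in name or name.endswith(" ")

names = ["/home/rosi/notes.txt", "/home/rosi/.cache/a:b", "/home/rosi/what?",
         "/home/rosi/draft "]
print([n for n in names if vfat_problem(n)])
```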

Retention and version limits:
The retention upper limits are 9999 days and 9999 versions. As an example, if the version limit were set to 100, retained versions for a file could reach 9899 to 9999 before ukopp stopped working. These limits are easy to increase, but performance would start to deteriorate long before this. If you reach 1000 retained versions it is time to start over (erase the backup medium).