Added another tool used in dak (and placed nowhere else), dsync

[dak.git] / tools / dsync-0.0 / doc / filelist.sgml
diff --git a/tools/dsync-0.0/doc/filelist.sgml b/tools/dsync-0.0/doc/filelist.sgml

new file mode 100644 (file)

index 0000000..35483d1
--- /dev/null
+++ b/tools/dsync-0.0/doc/filelist.sgml
@@ -0,0 +1,709 @@
+<!doctype debiandoc system>
+<!-- -*- mode: sgml; mode: fold -*- -->
+<book>
+<title>DSync File List Format</title>
+
+<author>Jason Gunthorpe <email>jgg@debian.org</email></author>
+<version>$Id: filelist.sgml,v 1.4 1999/11/15 07:59:49 jgg Exp $</version>
+
+<abstract>
+</abstract>
+
+<copyright>
+Copyright &copy; Jason Gunthorpe, 1998-1999.
+<p>
+DSync and this document are free software; you can redistribute them and/or
+modify them under the terms of the GNU General Public License as published
+by the Free Software Foundation; either version 2 of the License, or (at your
+option) any later version.
+
+<p>
+For more details, on Debian GNU/Linux systems, see the file
+/usr/doc/copyright/GPL for the full license.
+</copyright>
+
+<toc sect>
+
+<chapt>Introduction
+<!-- Purpose                                                          {{{ -->
+<!-- ===================================================================== -->
+<sect>Purpose
+<p>
+The DSync file list is a crucial part of the DSync system, it provides the 
+client with access to a list of files and file attributes for all the files
+in a directory tree. Much information is compacted into the per-file structure
+that may be used by the client in reconstructing the directory tree. In spirit
+it is like the common ls-lR files that mirrors have, but in practice it is 
+radically different, most striking is that it is stored in a compacted binary
+format and may optionally contain MD5 hashes.
+
+<p>
+The file list for a directory tree may be either dynamically generated by the
+server or generated only once like the ls-lR files. In fact with a static
+file list it is possible to use the <em>rsync method</> to transfer only the
+differences in the list which is a huge boon for sites with over 50000 files 
+in their directory trees
+
+<p>
+Internally the file list is stored as a series of directory blocks in no set 
+order. Each block has a relative path from the base to the directory itself
+and a list of all files in that directory. Things are not stored recursively
+so that the client can have fixed memory usage when managing the list.
+Depending on how the generator is configured the order of the directories
+may be breadth first or depth first, or perhaps completely random. The client
+should make no assumptions about the ordering of anything in the file.
+
+<p>
+Since the list may be generated on the fly by the server it is necessary for 
+it to be streamable. To this effect there will be no counts or sizes that
+refer to anything outside of the current record. This assures that the 
+generator will be able to build a file list without negligable server side 
+overhead. Furthermore a focus is placed on making things as small as possible,
+to this end usefull items like record length indicators are omitted. This 
+does necessarily limit the ability to handle format changes.
+                                                                  <!-- }}} -->
+
+<chapt>Structure
+<!-- Data Stream                                                      {{{ -->
+<!-- ===================================================================== -->
+<sect>Data Stream
+<p>
+The data stream is encoded as a series of variable length numbers, fixed 
+length numbers and strings. The use of variable length number encoding
+was chosen to accomidate sites with over 100000 files, mostly below 16k,
+using variable length encoding will save approximately 400k of data and still
+allow some items that are very large.
+
+<p> 
+Numbers are coded as a series of bytes of non-fixed length, the highest bit
+of each byte is 1 if the next byte is part of this number. Bytes are ordered
+backwards from the least significant to the most significant inorder to 
+simplify decoding, any omitted bits can be assumed to be 0. Clients should
+decode into their largest type and fatally error if a number expands to
+larger than that. All numbers are positive.
+
+<p>
+Strings are coded in pascal form, with a length number preceeding a series
+of 8 bit characters making up the string. The strings are coded in UTF.
+
+<p>
+The first records in the file should be a header record followed by any
+include/exclude records to indicate how the list was generated. Following
+that is the actual file list data.
+
+<p>
+The records all have the same form, they start with an 8 bit tag value and
+then have the raw record data after. The main header has a set of flags for
+all of the records types, these flags are used to designate optional portions
+of the record. For instance a 
+file record may not have a md5 hash or uid/gid values, those would be marked 
+off in the flags. Generally every non-critical value is optional. The records 
+and their tags are as follows:
+
+<list>
+<item> 0 - Header
+<item> 1 - Directory Marker
+<item> 2 - Directory Start
+<item> 3 - Directory End
+<item> 4 - Normal File
+<item> 5 - Symlink
+<item> 6 - Device Special
+<item> 7 - Directory
+<item> 8 - Include/Exclude
+<item> 9 - User Map
+<item> 10 - Group Map
+<item> 11 - Hard Link
+<item> 12 - End Marker
+<item> 13 - RSync Checksums
+<item> 14 - Aggregate File
+<item> 15 - RSync End
+</list>
+
+<p>
+The header record is placed first in the file followed by Directory records
+and then by a number of file type records. The Directory Start/End are used 
+to indicate which directory the file records are in. The approach is to 
+create a bundle of file type records for each directory that are stored
+non-recursively. The directory marker records are used with depth-first
+traversal to create unseen directories with the proper permissions.
+                                                                 <!-- }}} -->
+<!-- Header                                                           {{{ -->
+<!-- ===================================================================== -->
+<sect>Header
+<p>
+The header is the first record in the file and contains some information about
+what will follow.
+<example>
+   struct Header
+   {
+      uint8 Tag;             // 0 for the header
+      
+      uint32 Signature;
+      uint16 MajorVersion;
+      uint16 MinorVersion;
+      number Epoch;
+
+      uint8 FlagCount;
+      uint32 Flags[12];
+   };
+</example>
+<taglist>
+<tag>Signature<item>
+This field should contain the hex value 0x97E78AB which designates the file
+as a DSync file list. Like all numbers it should be stored in network byte
+order.
+
+<tag>MajorVersion
+<tag>MinorVersion<item>
+These two fields designate the revision of the format. The major version
+should be increased if an incompatible change is made to the structure of
+the file, otherwise the minor version should reflect any changes. The current
+major/minor is 0 and 0. Compatibility issues are discussed later on.
+
+<tag>Epoch<item>
+Inorder to encode time in a single 32 bit signed integer the format uses a 
+shifting epoch. Epoch is set to a time in seconds from the unix
+epoch. All other times are relative to this time.
+In this way we can specify any date 68 years in either direction from any 
+possible time. Doing so allows us to encode time using only 32 bits. The
+generator should either error or truncate if a time value exceeds this 
+representation. This does impose the limitation that the difference between
+the lowest stored date and highest stored date must be no more than 136 years.
+
+<tag>FlagCount<item>
+This designates the number of items in the flag array.
+
+<tag>Flags<item>
+Each possible record type has a flag value that is used to indicate what
+items the generator emitted. There is no per-record flag in order to save
+space. The flag array is indexed by the record ID.
+
+</taglist>
+                                                                 <!-- }}} -->
+<!-- Directory Marker                                                 {{{ -->
+<!-- ===================================================================== -->
+<sect>Directory Marker, Directory Start and Directory
+<p>
+The purpose of the directory marker record is to specify directories that
+must be created before a directory start record can be processed. It is needed
+to ensure the correct permissions and ownership are generated while the
+contents are in transfer.
+
+<p>
+A Directory Start record serves to indicate a change of directory. All further
+file type records will refer to the named directory until a Directory End
+record is processed marking the final modification for this directory. It is
+not possible to nest directory start directives, in fact a Directory Start
+record implies a Directory End record for the previosly Started Directory
+
+<p>
+The plain directory record is a file type record that refers to a directory
+file type. All of these record types describe the same thing used in different
+contexts so share the same structure.
+
+<example>
+   struct DirMarker
+   {
+      uint8 Tag;             // 1, 2 or 7 for the header
+      
+      uint32 ModTime;
+      uint16 Permissions;
+      number User;
+      number Group;
+      string Path;
+   };
+</example>
+<taglist>
+<tag>Flags [from the header]<item>
+Optional portions of the structure are Permissions (1&lt;&lt;0) and user/group
+(1&lt;&lt;1). The bit is set to 1 if they are present.
+
+<tag>ModTime<item>
+This is the number of seconds since the file list epoch, it is the modification
+date of the directory.
+
+<tag>Permissions<item>
+This is the standard unix permissions in the usual format.
+
+<tag>User
+<tag>Group<item>
+These are the standard unix user/group for the directory. They are indirected
+through the user/group maps described later on.
+
+<tag>Path<item>
+The path from the base of the file list to the directory this record describes.
+However ordinary directory types have a single name relative to the last
+Directory Start record.
+</taglist>
+                                                                 <!-- }}} -->
+<!-- Directory End                                                    {{{ -->
+<!-- ===================================================================== -->
+<sect>Directory End
+<p>
+The purpose of the directory end marker is to signafy that their will be no 
+more file type records from this directory. Directory Start and Directory
+End records must be paired. The intent of this record is to allow future 
+expansion, NOT to allow recursive directory blocks. A Directory Start
+record will imply a Directory End record if the previous was not terminated.
+
+<p>
+There are no data members, it is the basic 1 item record. If the data stream
+terminates with an open directory block it is assumed to be truncated and
+an error issued.
+
+                                                                 <!-- }}} -->
+<!-- Normal File                                                      {{{ -->
+<!-- ===================================================================== -->
+<sect>Normal File
+<p>
+A normal file is a simple, regular file. It has the standard set of unix 
+attributes and an optional MD5 hash for integrity checking.
+<example>
+   struct NormalFile
+   {
+      uint8 Tag;             // 4
+      
+      uint32 ModTime;
+      uint16 Permissions;
+      number User;
+      number Group;
+      string Name;
+      number Size;
+      uint128 MD5;
+   };
+</example>
+<taglist>
+<tag>Flags [from the header]<item>
+Optional portions of the structure are Permissions (1&lt;&lt;0), user/group
+(1&lt;&lt;1), and MD5 (1&lt;&lt;2). The bit is set to 1 if they are present.
+
+<tag>ModTime<item>
+This is the number of seconds since the file list epoch, it is the modification
+date of the file.
+
+<tag>Permissions<item>
+This is the standard unix permissions in the usual format.
+
+<tag>User
+<tag>Group<item>
+These are the standard unix user/group for the directory. They are indirected
+through the user/group maps described later on. 
+
+<tag>Name<item>
+The name of the item. It should have no pathname components and is relative
+to the last Directory Start record.
+
+<tag>MD5<item>
+This is a MD5 hash of the file.
+
+<tag>Size<item>
+This is the size of the file in bytes.
+</taglist>
+                                                                 <!-- }}} -->
+<!-- Symlink                                                          {{{ -->
+<!-- ===================================================================== -->
+<sect>Symlink
+<p>
+This encodes a normal unix symbolic link. Symlinks do not have permissions
+or size, but do have optional ownership.
+<example>
+   struct Symlink
+   {
+      uint8 Tag;             // 5
+
+      uint32 ModTime;
+      number User;
+      number Group;
+      string Name;
+      uint8 Compression;
+      string To;
+   };
+</example>
+<taglist>
+<tag>Flags [from the header]<item>
+Optional portions of the structure are, user/group
+(1&lt;&lt;0). The bit is set to 1 if they are present.
+
+<tag>ModTime<item>
+This is the number of seconds since the file list epoch, it is the modification
+date of the file.
+
+<tag>User
+<tag>Group<item>
+These are the standard unix user/group for the directory. They are indirected
+through the user/group maps described later on. 
+
+<tag>Name<item>
+The name of the item. It should have no pathname components and is relative
+to the last Directory Start record.
+
+<tag>Compression<item>
+Common use of symlinks makes them very easy to compress, the compression
+byte allows this. It is an 8 bit byte with the first 7 bits representing an 
+unsigned number and the 8th bit as being a flag. The first 7 bits describe
+how many bytes of the last symlink should be prepended to To and if the 8th 
+bit is set then Name is appended to To.
+
+<tag>To<item>
+This is the file the symlink is pointing to. It is an absolute string taken
+as is. The client may perform checking on it if desired. The string is 
+compressed as described in the Compression field.
+</taglist>
+                                                                  <!-- }}} -->
+<!-- Device Special                                                   {{{ -->
+<!-- ===================================================================== -->
+<sect>Device Special
+<p>
+Device Special records encode unix device special files, which have a major
+and a minor number corrisponding to some OS specific attribute. These also
+encode fifo files, anything that can be created by mknod.
+<example>
+   struct DeviceSpecial
+   {
+      uint8 Tag;             // 6
+      
+      uint32 ModTime;
+      uint16 Permissions;
+      number User;
+      number Group;
+      number Dev;
+      string Name;
+   };
+</example>
+<taglist>
+<tag>Flags [from the header]<item>
+Optional portions of the structure areuser/group
+(1&lt;&lt;0). The bit is set to 1 if they are present.
+
+<tag>ModTime<item>
+This is the number of seconds since the file list epoch, it is the modification
+date of the file.
+
+<tag>Permissions<item>
+This non-optional field is used to encode the type of device and the 
+creation permissions.
+
+<tag>Dev<item>
+This is the OS specific 'dev_t' field for mknod.
+
+<tag>Major
+<tag>Minor<item>
+These are the OS dependent device numbers.
+
+<tag>Name<item>
+The name of the item. It should have no pathname components and is relative
+to the last Directory Start record.
+
+<tag>To<item>
+This is the file the symlink is pointing to.
+</taglist>
+                                                                  <!-- }}} -->
+<!-- Include/Exclude                                                  {{{ -->
+<!-- ===================================================================== -->
+<sect>Include and Exclude
+<p>
+The include/exclude list used to generate the file list is encoded after
+the header record. It is stored as an ordered set of include/exclude records
+acting as a filter. If no record matches then the pathname is assumed to 
+be included otherwise the first matching record decides.
+
+<example>
+   struct IncludeExclude
+   {
+      uint8 Tag;             // 8
+      
+      uint8 Type;
+      string Pattern;
+   };
+</example>
+<taglist>
+<tag>Flags [from the header]<item>
+None defined.
+
+<tag>Type<item>
+This is the sort of rule, presently 1 is an include rule and 2 is an exclude
+rule.
+
+<tag>Pattern<item>
+This is the textual pattern used for matching.
+
+</taglist>
+                                                                  <!-- }}} -->
+<!-- User/Group Map                                                   {{{ -->
+<!-- ===================================================================== -->
+<sect>User/Group Map
+<p>
+In order to properly transfer users and groups the names are converted from
+a local number into a file list number and a number to name mapping. When
+the remote side reads the file list it directs all UID/GID translations
+through the mapping to create the real names and then does a local lookup.
+This also provides some compressesion in the file list as large UIDs are
+converted into smaller values through the mapping.
+
+<p>
+The generator is expected to emit these records at any place before the IDs
+are actually used.
+<example>
+   struct NameMap
+   {
+      uint8 Tag;             // 9,10
+      
+      number FileID;
+      number RealID;
+      string Name;
+   };
+</example>
+<taglist>
+<tag>Flags [from the header]<item>
+Optional portions of the structure are RealID (1&lt;&lt;0).
+
+<tag>FileID<item>
+This is the ID used internally in the file list, it should be monotonically
+increasing each time a Map record is created so that it is small and unique.
+
+<tag>RealID<item>
+This is the ID used in the filesystem on the generating end. This information
+maybe used if the user selected to regenerate IDs without translation.
+</taglist>
+                                                                  <!-- }}} -->
+<!-- Hard Link                                                        {{{ -->
+<!-- ===================================================================== -->
+<sect>Hard Link
+<p>
+A hard link record is used to record a file that is participating in a hard
+link. The only information we know about the link is the inode and device 
+on the local machine, so we store this information. The client will have to
+reconstruct the linkages if possible.
+
+<example>
+   struct HardLink
+   {
+      uint8 Tag;             // 11
+      
+      uint32 ModTime;
+      number Serial;
+      uint16 Permissions;
+      number User;
+      number Group;
+      string Name;
+      number Size;
+      uint128 MD5;
+   };
+</example>
+<taglist>
+<tag>Flags [from the header]<item>
+Optional portions of the structure are Permissions (1&lt;&lt;0), user/group
+(1&lt;&lt;1), and MD5 (1&lt;&lt;2). The bit is set to 1 if they are present.
+
+<tag>ModTime<item>
+This is the number of seconds since the file list epoch, it is the modification
+date of the file.
+
+<tag>Serial<item>
+This is the unique ID number for the hardlink. It is composed from the
+device inode pair in a generator dependent way. The exact nature of the
+value is unimportant, only that two hard link records with the same serial
+should be linked together. It is recommended that the generator compress
+hard link serial numbers into small monotonically increasing IDs.
+
+<tag>Permissions<item>
+This is the standard unix permissions in the usual format.
+
+<tag>User
+<tag>Group<item>
+These are the standard unix user/group for the directory. They are indirected
+through the user/group maps described later on. 
+
+<tag>Name<item>
+The name of the item. It should have no pathname components and is relative
+to the last Directory Start record.
+
+<tag>MD5<item>
+This is a MD5 hash of the file.
+
+<tag>Size<item>
+This is the size of the file in bytes.
+</taglist>
+                                                                 <!-- }}} -->
+<!-- End Marker                                                               {{{ -->
+<!-- ===================================================================== -->
+<sect>End Marker
+<p>
+The End Marker is the final record in the stream, if it is missing the stream
+is assumed to be incomplete.
+<example>
+   struct Trailer
+   {
+      uint8 Tag;             // 12 for the header
+      
+      uint32 Signature;
+   };
+</example>
+<taglist>
+<tag>Signature<item>
+This field should contain the hex value 0xBA87E79 which is designed to 
+prevent a correputed stream as begin a legitimate end marker.
+</taglist>
+                                                                 <!-- }}} -->
+
+<!-- RSync Checksums                                                  {{{ -->
+<!-- ===================================================================== -->
+<sect>RSync Checksums
+<p>
+The checksum record contains the list of checksums for a file and represents
+the start of a RSync description block which may contain RSync Checksums,
+a Normal File entry or Aggregate Files records.
+<example>
+   struct RSyncChecksums
+   {
+      uint8 Tag;             // 13
+      
+      number BlockSize;
+      number FileSize;
+      uint160 Sums[ceil(FileSize/BlockSize)];
+   };
+</example>
+<taglist>
+<tag>BlockSize<item>
+The size of each block in the stream in bytes.
+
+<tag>FileSize<item>
+The total size of the the file in bytes.
+
+<tag>Sums<item>
+The actual checksum data. The format has the lower 32 bytes as the weak 
+checksum and the upper 128 as the strong checksum.
+</taglist>
+                                                                 <!-- }}} -->
+<!-- Aggregate File                                                   {{{ -->
+<!-- ===================================================================== -->
+<sect>Aggregate File
+<p>
+If the generator was given a list of included files this record will be 
+emitted after the rsync checksum record, once for each file. The given 
+paths are files that are likely to contain fragments of the larger file.
+<example>
+   struct AggregateFile
+   {
+      uint8 Tag;             // 14 for this record
+      
+      string File;
+   };
+</example>
+<taglist>
+<tag>File<item>
+The stored filename.
+</taglist>
+                                                                 <!-- }}} -->
+<!-- RSync End                                                        {{{ -->
+<!-- ===================================================================== -->
+<sect>RSync End
+<p>
+The purpose of the directory end marker is to signafy that the RSync data
+is finished. RSync blocks begin with the RSync checksum record, then are
+typically followed by a Normal File record describing the name and attributes
+of the file and then optionally followed by a set of Aggregate File records.
+
+<p>
+There are no data members, it is the basic 1 item record. If the data stream
+terminates with an open block it is assumed to be truncated and an error 
+issued.
+                                                                 <!-- }}} -->
+
+<chapt>The Client
+<!-- Handling Compatibility                                            {{{ -->
+<!-- ===================================================================== -->
+<sect>Handling Compatibility
+<p>
+The format has no provision for making backwards compatible changes, even
+minor ones. What was provided is a way to make a generator that is both
+forwards and backwards compatible with clients, this is done by disabling
+generation of unsupported items and masking them off in the flags.
+
+<p>
+To deal with this a client should examine the header and determine if it has
+a suitable major version, the minor version should largely be ignored. The
+client should then examine the flags values and for all records it understands
+ensure that no bits are masked on that it does not understand. Records that
+it cannot handle should be ignored at this point. When the client is
+parsing it should abort if it hits a record it does not support.
+                                                                 <!-- }}} -->
+<!-- Client Requirements                                              {{{ -->
+<!-- ===================================================================== -->
+<sect>Client Requirements
+<p>
+The client attempting to verify syncronisity of a local file tree and a
+tree destribed in a file list must do three things, look for extra local files,
+manage the UID/GID mappings and maintain a map of hardlinks. These items 
+corrispond to the only necessary memory usage on the client.
+
+<p>
+It is expected that the client will use the timestamp, size and possibly
+MD5 hash to match the local file against the remote one to decide if it 
+should be retrieved.
+
+<p>
+Hardlinks are difficult to handle, but represent a very usefull feature. The
+client should track all hard links until they are associated with a local
+file+inode, then all future links to that remote inode can be recreated 
+locally.
+                                                                 <!-- }}} -->
+
+<chapt>RSync Method
+<!-- Overview                                                          {{{ -->
+<!-- ===================================================================== -->
+<sect>Overview
+<p>
+The <em>rsync method</> was invented by Andrew Tridgell and originally 
+implemented in the rsync program. DSync has a provision to make use of the 
+<em>rsync method</> for transfering differences between files effeciently,
+however the implemention is not as bandwidth efficient as what the rsync 
+program uses, emphasis is placed on generator efficiency. 
+
+<p>
+Primarily the <em>rsync method</> makes use of a series of weak and strong
+block checksums for each block in a file. Blocks are a uniform size and
+are uniformly distributed about the source file. In order to minimize server
+loading the checksum data is generated for the file on the server and then 
+sent to the client - this might optionally be done from a cached file. The
+client is responsible for performing the checksumming and searching on its
+end.
+
+<p>
+In contrast rsync has the client send its checksums to the server and the
+server sends back commands to reconstruct the file. This is more bandwidth
+efficient because only one round trip is required and there is a higher chance
+that more blocks will be matched and not need to be sent to the client.
+
+<p>
+Furthermore a feature designed for use by CD images is provided where a file
+can be specified as the aggregation of many smaller files. The aggregated
+files are specified only by giving the file name. The client is expected to 
+read the file (probably from the network) and perform checksum searching 
+against the provided table.
+
+                                                                 <!-- }}} -->
+<!-- CD Images                                                         {{{ -->
+<!-- ===================================================================== -->
+<sect>CD Images
+<p>
+The primary and most complex use of the rsync data is for forming CD images
+on the fly from a mirror and a CD source. This is extremly usefull beacause 
+CD images take up alot of space and bandwidth to mirror, while they are 
+mearly aggregates of (possibly) already mirrored data. Using checksums
+and a file listing allows the CD image to be reconstructed from any mirror
+and reduces the loading on primary CD image servers.
+
+<p>
+The next use of checksums is to 'freshen' a CD image during development. If
+a image is already present that contains a subset of the required data the 
+checksums generally allow a large percentage of that data to be reused.
+
+<p>
+Since the client is responsible for reconstruction and checksum searching it
+is possible to perform in place reconstruction and in place initial generation
+that does not require a (large!) temporary file.
+                                                                 <!-- }}} -->
+
+
+</book>