Bug with binary file backup.

Dan Sturtevant dsturtev@plogic.com
Thu, 7 Feb 2002 14:05:45 -0500 (EST)


Ben,

Thanks for the advice.

4. This was a no-go.  rdiff's output (from the diff of the 2 tarballs) was
~650 MB.  Each of the distros is approximately that same size.  I assume
that this was because of file offsets within the tar file and because
lots of binary info was present.  The algorithm just didn't work for this
case.
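
For reference, the steps behind (4) can be sketched roughly as follows.
This is only an illustration and not the exact script I ran: the helper
names and the signature, delta, and output file names are made up, and
the rdiff commands are only built as argument lists, not executed.

```python
import tarfile


def make_plain_tar(src_dir, tar_path):
    """Write an uncompressed tar of src_dir (mode "w" applies no compression)."""
    with tarfile.open(tar_path, "w") as tar:
        tar.add(src_dir, arcname=".")


def rdiff_steps(old_tar, new_tar, sig="old.sig", delta="old-to-new.delta",
                rebuilt="rebuilt.tar"):
    """Return the three rdiff invocations as argv lists (hypothetical file names)."""
    return [
        # 1. Summarize the old tarball as a signature file.
        ["rdiff", "signature", old_tar, sig],
        # 2. Compute the delta from that signature to the new tarball.
        ["rdiff", "delta", sig, new_tar, delta],
        # 3. On the receiving side, rebuild the new tarball from old + delta.
        ["rdiff", "patch", old_tar, delta, rebuilt],
    ]
```

Each list could be handed to subprocess.run() on a machine with rdiff
installed, e.g. rdiff_steps("Plogic-7.1-5.tar", "Plogic-7.1-6.tar").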

3. I tried rsync 2.5.2, which I downloaded last night.  Although it
presents a pretty interface, it is still broken.  The delta files
generated were about the same size as the directories you create
(~120-150 MB) but failed to apply cleanly in many different cases with
different distributions.

1. With rdiff-backup I had no problem with the exception of files with
identical times, sizes, names, etc.  This problem was worked around by
adding a touch to my scripts.  I will be using this system.  I plan on
rpming it for our needs and writing some utility scripts.
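
In case it helps anyone else, the touch workaround amounts to something
like the sketch below.  Both helpers are hypothetical names of my own:
looks_unchanged() is just a stand-in for a size-and-mtime comparison (I
am assuming that is roughly why the files seemed identical; it is not
rdiff-backup's actual code), and bump_mtime() does what a touch with a
later timestamp would do.

```python
import os


def looks_unchanged(path_a, path_b):
    """Hypothetical quick check: same size and same whole-second mtime."""
    a, b = os.stat(path_a), os.stat(path_b)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def bump_mtime(path, seconds=1):
    """Move a file's mtime forward by `seconds`, leaving atime as it was."""
    st = os.stat(path)
    os.utime(path, (st.st_atime, st.st_mtime + seconds))
```

Calling bump_mtime() on each file in the second tree before the second
rdiff-backup run has the same effect as the sleep between copies.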

Let me know if you want any testing, benchmarking numbers, rpms etc.


Question - Why is Python 2.2 required?  I haven't tried it with Python
2.1, but it would make my life easier if I could just rely on the Python
2.1 rpm.  What new functionality do you use?

Dan Sturtevant





On Wed, 6 Feb 2002, Ben Escoto wrote:

> >>>>> "DS" == Dan Sturtevant <dsturtev@plogic.com>
> >>>>> wrote the following on Wed, 6 Feb 2002 18:33:26 -0500 (EST)
>
>   DS> What I was doing was maintaining a tree of the original
>   DS> distribution (Plogic-7.1-5) that I was working with and then
>   DS> creating deltas by running #rdiff-backup Plogic-7.1-6/ backup/
>   DS> followed by #rdiff-backup Plogic-7.1-5 backup/
>
>   DS> I had hoped that this would create a directory structure that
>   DS> the end user could place into a copy of the original
>   DS> Plogic-7.1-5 tree to get it up to Plogic-7.1-6.
>
>   DS> This creates the opposite problem:  I want older timestamps from
>   DS> Plogic-7.1-5 to be pushed on top of Plogic-7.1-6 to generate the
>   DS> rdiff-backup-data directory.  It will almost always be the case
>   DS> that older files should replace newer ones in my case.
>
> Ok, I can think of 4 things that you could try:
>
> 1.  Use my program.  The problem you ran into was that files seemed
>     identical because:
>
>   DS> My test directories had the same time because they were copied
>   DS> over to my local machine at the same time.
>
>     So, it seems if you waited a second (e.g. cp ...; sleep 1; cp
>     ....)  your immediate problem would go away.
>
> 2.  Instead of using CVS use something like PRCS (?) which is better
>     at taking binary differences.
>
> 3.  I seem to remember someone contributing an extension to rsync
>     called rsync+.  I don't know if it has been adopted yet, or how
>     stable it is, but the idea is to save the information between
>     rsync sessions.  For instance, suppose server 1 always changes,
>     and servers 2 and 3 are supposed to be identical copies of 1.  I
>     think rsync+ could be run on 1 and 2, and then would produce a
>     file that could be transmitted to 3, so 3 could be updated without
>     rereading all the files on 1 again.
>
> 4.  Make a big tar (uncompressed; just tar) file like Plogic-7.1-5.tar
>     or Plogic-7.1-6.tar and then run rdiff on those two files.  You
>     can compress the diff and send it around.
>
> I don't know your situation very well, but I would try (4) first, then
> (3) if it is stable and easy to use, then (1) and then (2).
>
>   DS> You seem to have a great deal of experience in this domain.
>           ^^^^
> Yep, on the internet no one knows that you're a dog.  :-)
>
>
> --
> Ben Escoto
>