xdelta vs. rdiff
Dan Sturtevant
dsturtev@plogic.com
Mon, 11 Feb 2002 14:40:28 -0500 (EST)
OK, here are more realistic numbers. My patch file between the different
distributions is now
On Mon, 11 Feb 2002, Dan Sturtevant wrote:
>
> Ben, I used xdelta to create a diff of the distributions.
> There were approx 50 rpms that were different between the two distros.
>
> distro 1: 610 Megs
> distro 2: 623 Megs
>
> Delta file generated by rdiff on the two tarballs was ~650 Megs.
>
> Delta directory generated by rdiff-backup ~150 Megs.
>
> Delta directory created by rsync+ ~150 Megs. (although this system is
> still beta and very broken.)
>
> All of the above systems are based on the rdiff algorithm. The reason
> rdiff-backup and rsync+ got down to 150 Megs is that they traverse
> directories and make deltas against individual files. The compression
> within each file is still based upon the inefficient rdiff algorithm.
>
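To illustrate the per-file approach described above, here is a minimal
Python sketch of the directory traversal step: it walks two unpacked
distribution trees and reports which files actually differ, which are the
only ones a per-file tool would need to write deltas for. This is not how
rdiff-backup or rsync+ are implemented; the function names (file_digest,
list_files, compare_trees) are made up for illustration, and comparing by
SHA-256 is my own assumption.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def list_files(root):
    """Map each file's path relative to root to its full path."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found[os.path.relpath(full, root)] = full
    return found


def compare_trees(old_root, new_root):
    """Return (changed, added, removed) as sorted lists of relative paths."""
    old, new = list_files(old_root), list_files(new_root)
    common = old.keys() & new.keys()
    # Only files present in both trees with different content need a delta.
    changed = sorted(
        p for p in common if file_digest(old[p]) != file_digest(new[p])
    )
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    return changed, added, removed
```

With roughly 50 changed rpms between the two distros, the "changed" list
would be short, and every unchanged file costs nothing in the delta
directory.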
>
> Here is the impressive part.
>
> running:
> xdelta delta tar1.tar tar2.tar tar.patch
> produced a patch file of 87K
>
> I couldn't believe it.
>
> I moved tar1.tar and tar.patch to a different directory and ran:
> xdelta patch tar.patch tar1.tar tar2-2.tar
>
> I then ran:
> diff tar2.tar tar2-2.tar
>
> No difference.
>
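The round-trip check above (rebuild the second tarball from the patch, then
compare it with the original) can also be done from a script. A minimal
Python sketch, assuming the two tarballs are ordinary files on disk; the
function name files_identical is made up for illustration and is not part
of xdelta:

```python
def files_identical(path_a, path_b, chunk_size=1 << 20):
    """Return True if the two files have exactly the same bytes."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Different content, or one file ended before the other.
                return False
            if not a:
                # Both reads came back empty: end of both files.
                return True
```

For the files named above, the call would be
files_identical("tar2.tar", "tar2-2.tar"). Reading in chunks keeps memory
use flat even for 600 Meg tarballs.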
> xdelta is very computationally intensive. I don't have any hard numbers
> thus far, but my system, running a 2.4.3-12 Red Hat kernel with 512 Megs
> of memory, was swapping.
>
> Needless to say, I recommend looking into using this system in any case
> where binary data represents the majority of the data you are trying to
> compress.
>
> Thanks,
> Dan
>
>
> On Thu, 7 Feb 2002, Ben Escoto wrote:
>
> > >>>>> "DS" == Dan Sturtevant <dsturtev@plogic.com>
> > >>>>> wrote the following on Thu, 7 Feb 2002 14:05:45 -0500 (EST)
> >
> > DS> 4. This was a no-go. rdiff's output (from the diff of the 2
> > DS> tarballs) was ~650 Megs. Each of the distros was approximately
> > DS> the same size. I assume that this was because of file offsets
> > DS> within the tar file and because lots of binary info was present.
> > DS> The algorithm just didn't work for this case.
> >
> > This is a bit disappointing... It seems rdiff isn't as good at
> > finding binary similarities as I thought. Just for my curiosity
> > though, if you still have the tarballs around, could you try the same
> > thing with xdelta v1.x.x? You can find RPMs of it with rpmfind. I'm
> > wondering if it is superior to rdiff for this kind of thing.
>
> _______________________________________________
> Rdiff-backup mailing list
> Rdiff-backup@keywest.Stanford.EDU
> http://keywest.Stanford.EDU/mailman/listinfo/rdiff-backup
>