Another idea

Tobias Polzin polzin_spamprotect_@gmx.de
Wed, 28 Aug 2002 19:54:35 +0200


>
>   TP> b) do not use an additional mirror, but use rsync --dry-run to
>   TP> find out, which files are new and which are deleted.  Copy the
>   TP> new files to some directory NEWFILES.  But what then? Is there a
>   TP> possibility to update only a certain list of files (I tried
>   TP> --include, but this also removes all files that are not in the
>   TP> list). I am looking for something like the inversion of
>   TP> --restrict, a --allow-from-stdin.  This would be cool.
> 
> Sorry, currently rdiff-backup doesn't do what you describe.  But even
> if it did, how would you copy the files to the NEWFILES directory?
> And how would you tell rdiff-backup which directory those files were
> in and which files were hardlinked to which files?

rsync is already installed on all systems, and so is a 1.5 version of
Python.
I wanted to use "rsync --dry-run" to extract the filenames and then run
rsync --include-from=NEWFILELIST --exclude="*" local BACKUPSERVER:NEWFILES
which preserves the directory structure.
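
A rough sketch of the "extract filenames from a dry run" step, under
some assumptions: it expects the output of a dry run made with rsync's
--itemize-changes (-i) option, which only newer rsync releases provide,
and it is written for a current Python 3, not the 1.5 mentioned above.
The function name parse_itemized and the sample paths are made up for
illustration.

```python
def parse_itemized(output):
    """Split itemized rsync dry-run output into (changed, deleted) paths.

    Assumes each line is a flag field, whitespace, then the path, as
    printed by `rsync -a -i --dry-run --delete SRC DEST`.
    """
    changed, deleted = [], []
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank or unexpected line, ignore it
        flags, path = parts
        if flags == "*deleting":
            deleted.append(path)
        elif flags[:1] in ("<", ">") and flags[1:2] == "f":
            # "<" or ">" marks a transfer, "f" marks a regular file
            changed.append(path)
    return changed, deleted


# Hypothetical sample output, not taken from a real run.
SAMPLE = """\
<f+++++++++ home/tp/new.txt
<f.st...... home/tp/changed.txt
.d..t...... home/tp/
*deleting   home/tp/old.txt
"""

changed, deleted = parse_itemized(SAMPLE)
```

The changed list could then be written one path per line to
NEWFILELIST. rsync versions that have --files-from accept such a list
directly.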

Hm, but preserving hard links is not possible through this procedure.

But:

Using the idea from www.mikerubel.org I could use hard-links:

1) BACKUPSERVER:  
cp -al backup/ mirror
rm -rf mirror/rdiff-backup-data
2) LOCAL:
rsync -a -H --delete LOCALFILES BACKUPSERVER:mirror
3) BACKUPSERVER:
rdiff-backup mirror/ backup

I tested it briefly and it worked (hard links were copied correctly, too).
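
A small sketch of why steps 1 and 2 should not disturb the files under
backup/. It assumes rsync's default behaviour of writing a changed file
to a temporary name and renaming it into place, which replaces the hard
link in mirror/ instead of overwriting the shared data. The helpers
link_tree, replace_file and demo are hypothetical names, written for a
current Python 3; symlinks, ownership and directory permissions are not
handled.

```python
import os
import tempfile


def link_tree(src, dst, skip=("rdiff-backup-data",)):
    """Rebuild the directory tree of src under dst, hard-linking files.

    Roughly what `cp -al` does; top-level names listed in skip are
    left out, standing in for the `rm -rf` of step 1.
    """
    for root, dirs, files in os.walk(src):
        if root == src:
            dirs[:] = [d for d in dirs if d not in skip]
        rel = os.path.relpath(root, src)
        target = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target, name))


def replace_file(path, data):
    """Write data to a temporary file and rename it over path.

    The rename breaks the hard link; the old inode keeps its content.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(data)
    os.replace(tmp, path)


def demo():
    """Return (files shared an inode, backup content afterwards,
    whether rdiff-backup-data ended up in the mirror)."""
    base = tempfile.mkdtemp()
    backup = os.path.join(base, "backup")
    mirror = os.path.join(base, "mirror")
    os.makedirs(os.path.join(backup, "rdiff-backup-data"))
    os.makedirs(os.path.join(backup, "sub"))
    original = os.path.join(backup, "sub", "f.txt")
    with open(original, "w") as fh:
        fh.write("old")
    link_tree(backup, mirror)
    linked = os.path.join(mirror, "sub", "f.txt")
    shared = os.stat(original).st_ino == os.stat(linked).st_ino
    replace_file(linked, "new")
    has_data = os.path.exists(os.path.join(mirror, "rdiff-backup-data"))
    with open(original) as fh:
        return shared, fh.read(), has_data
```

If rsync were run with --inplace, the shared data would be modified
directly and the copy under backup/ would change as well.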

Do you see any problems with this?

>   TP> If that is not possible, I have to copy the current rdiff-backup
>   TP> to a new file, add the files from NEWFILES, remove those files
>   TP> that were deleted and then start rdiff-backup.  But this is a
>   TP> waste of cpu-time...
> 
> I guess you could do this, but wouldn't this take up as much space (at
> least temporarily) as the rsync solution?  You might as well copy the
> rdiff-backup destination directory (minus the rdiff-backup-data
> directory) and then use rsync on that.

The system will hold backups of different users. As long as not all
users decide to perform a backup at the same time, a temporary increase
is not a problem (and one could use lock files to take care of
simultaneous access). But if it works, the above solution would of
course be much better.
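
A minimal sketch of the lock-file idea, assuming a current Python 3 and
a local filesystem on the backup server. The names acquire_lock,
release_lock and backup.lock are made up for illustration; a stale lock
left behind by a crashed run is not handled here.

```python
import os
import tempfile


def acquire_lock(path):
    """Create the lock file atomically.

    Returns True on success and False if the file already exists,
    that is, if another backup run holds the lock.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as fh:
        fh.write(str(os.getpid()))  # record the owner for debugging
    return True


def release_lock(path):
    """Remove the lock file so the next run can proceed."""
    os.remove(path)


def demo():
    """Acquire, fail to acquire again, release, then acquire once more."""
    lock = os.path.join(tempfile.mkdtemp(), "backup.lock")
    first = acquire_lock(lock)
    second = acquire_lock(lock)
    release_lock(lock)
    third = acquire_lock(lock)
    release_lock(lock)
    return first, second, third
```

A backup script would call acquire_lock before the copy step, give up
or wait if it returns False, and call release_lock when rdiff-backup
has finished.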

>   TP> By the way: Do you have experience how fast the
>   TP> rdiff-backup-data directory for the diff-files grows?
> 
> Well, I only have much experience with my own backup set (root of
> personal linux system).  Here are statistics for a typical increment:
> 
> ElapsedTime 2051.54 (34 minutes 11.54 seconds)
> SourceFiles 464149
> SourceFileSize 9385011476 (8.74 GB)
[..]
> IncrementFiles 970
> IncrementFileSize 18290106 (17.4 MB)

Thanks for your answer.
It would be interesting to know how long you keep your
diff files.
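
For a rough feel of the growth, the quoted figures can be put into a
short calculation. This is only a sketch: it assumes every increment is
about as large as the one quoted, which a single sample cannot
confirm, and the function names are made up for illustration.

```python
# Figures copied from the quoted statistics.
SOURCE_BYTES = 9385011476
INCREMENT_BYTES = 18290106


def increment_ratio(increment_bytes, source_bytes):
    """Size of one increment as a fraction of the source size."""
    return increment_bytes / source_bytes


def projected_growth(increment_bytes, runs):
    """Bytes of increments after `runs` backups of the same size."""
    return increment_bytes * runs


ratio = increment_ratio(INCREMENT_BYTES, SOURCE_BYTES)
month_bytes = projected_growth(INCREMENT_BYTES, 30)
print("one increment: %.3f%% of the source" % (ratio * 100))
print("30 increments: %.1f MB" % (month_bytes / 2.0 ** 20))
```

With these numbers a single increment is about 0.2 percent of the
source size.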

Tobias