Loading whole filelist into memory

Ben Escoto bescoto@stanford.edu
Thu, 28 Mar 2002 00:12:02 -0800



Another thing I wanted feedback on was how to process filelists.  For
technical reasons it would be a lot easier for me to read the whole
filelist into memory and sort it, instead of, say, reading a line,
backing up that file, reading the next line, etc.  The main
differences would be:

1.  Applications couldn't generate filelists that depended on
    rdiff-backup already having processed an earlier file in the
    list.  This is a stretch; I don't think it would be an issue
    in real life.

2.  More importantly, large filelists may not fit into memory easily.
    For instance, rsync builds whole filelists ahead of time, and for
    that reason often consumes hundreds of megabytes of memory.  Say
    each entry in a filelist takes up 60 bytes.  If the filelist
    contained 10 million files, that's 600 MB.

3.  I guess it could take a long time to sort long filelists, but
    they would probably come mostly pre-sorted, and sorting 10
    million entries shouldn't take long anyway (?).

Also, --exclude-from-filelist wouldn't make much sense if the entire
filelist couldn't be read first.
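To illustrate why --exclude-from-filelist implies reading the whole list first: an exclusion can apply to any file anywhere in the backup set, so the full exclude list has to be in memory before processing begins. A minimal sketch (function name assumed, not actual rdiff-backup code):

```python
def apply_exclusions(entries, exclude_entries):
    """Drop every entry named in the exclude filelist.

    Hypothetical sketch: the entire exclude list is read into a set up
    front, since any exclusion may refer to a file that appears
    earlier or later in the main filelist.
    """
    excluded = set(exclude_entries)  # whole exclude list in memory
    return [e for e in entries if e not in excluded]

print(apply_exclusions(["a", "b", "c"], ["b"]))  # ['a', 'c']
```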


--
Ben Escoto
