Loading whole filelist into memory
Donovan Baarda
abo@minkirri.apana.org.au
Sat, 30 Mar 2002 17:30:56 +1100
On Fri, Mar 29, 2002 at 12:04:01PM -0800, Ben Escoto wrote:
> >>>>> "DB" == Donovan Baarda <abo@minkirri.apana.org.au>
> >>>>> wrote the following on Fri, 29 Mar 2002 21:39:30 +1100
>
> >> Also, --exclude-from-filelist wouldn't make much sense if the
> >> entire filelist couldn't be read first.
>
> DB> not entirely true... a smart scanner can exclude files and skip
> DB> whole directories as it scans.
>
> Well, suppose there is an exclude filelist. rdiff-backup wants to
> start backing up, so it begins with file, say, /bin/ls. Should it
> back it up, or is it somewhere in the exclude list? Unless we require
> the exclude list to be sorted or something like that, we can't process
> a single file until the whole list is read.
Ahh. A misunderstanding.
I thought you meant reading the whole directory tree filelist, not the whole
include/exclude list. I can't really see any way of avoiding reading the
whole include/exclude list into memory.
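As a rough sketch of what I mean (in present-day Python, and with made-up names: load_exclude_list and is_excluded are not rdiff-backup or dirscan.py functions), the exclude list gets read completely into a set, and after that each file can be tested as the scan reaches it. Checking the parent directories as well is an assumption on my part about how an excluded directory should behave:

```python
import os

def load_exclude_list(lines):
    """Read every entry up front; a set does not care about order."""
    return {os.path.normpath(line.strip()) for line in lines if line.strip()}

def is_excluded(path, excluded):
    """Return True if path, or any directory above it, is in the set."""
    path = os.path.normpath(path)
    while True:
        if path in excluded:
            return True
        parent = os.path.dirname(path)
        if parent == path:  # reached the root, or ran out of components
            return False
        path = parent
```

Only the exclude list has to sit in memory this way; the directory tree itself can still be processed one file at a time.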
> DB> Currently my "dirscan.py" module builds and returns a big python
> DB> Currently my "dirscan.py" module builds and returns a big Python
> DB> list of all matching files. This was so you could do things
> DB> like:
>
> DB> for file in scan(startdir,selectlist): do something...
>
> DB> I'm thinking of changing/extending this so that it can be used
> DB> to process files as they are scanned. The simplest approach
> DB> would be to introduce an os.path.walk() style command that applies a
> DB> function to each matching file as it finds them. A probably
> DB> better way would be for me to delve into how things like xrange
> DB> work to see if I could implement something like it.
>
> They are called generators and are a great new feature of python 2.2.
> So you can use the exact same:
>
> for file in scan(startdir,selectlist):
>     do something...
>
> but have scan(..) yield objects as they are requested by the for loop.
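For reference, a generator version of scan() along those lines might look roughly like the sketch below. It is written for present-day Python rather than 2.2, it is not the actual dirscan.py code, and the select argument is a hypothetical predicate (path in, True or False out) standing in for selectlist:

```python
import os

def scan(startdir, select):
    """Yield matching file paths one at a time while walking startdir."""
    for dirpath, dirnames, filenames in os.walk(startdir):
        # Prune in place so the walk never descends into rejected directories.
        dirnames[:] = sorted(
            d for d in dirnames if select(os.path.join(dirpath, d))
        )
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if select(path):
                yield path
```

The calling loop stays exactly "for file in scan(startdir, select):", but no full list of files is ever built.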
I've thus far been avoiding 2.2 features. I thought that since 2.1 had
xrange, there might be a way to make it do the same thing... but maybe not.
I thought maybe you could do something weird with a UserList that builds the
elements as they are referenced...
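Something like the following is what I had in mind. It is only a sketch and relies on the old iteration protocol, where a for loop calls __getitem__ with 0, 1, 2, ... until IndexError is raised. LazyScan and next_item are made-up names, and next_item is assumed to raise IndexError when nothing is left:

```python
class LazyScan:
    """Sequence-like object that produces each element only when indexed."""

    def __init__(self, next_item):
        self._next_item = next_item  # hypothetical callable supplying items
        self._expected = 0           # the only index we can serve next

    def __getitem__(self, index):
        if index != self._expected:
            raise ValueError("LazyScan only supports sequential access")
        self._expected += 1
        # An IndexError raised by next_item() is what ends the for loop.
        return self._next_item()


pending = ["/bin/ls", "/bin/cat"]

def next_item():
    return pending.pop(0)  # pop() on an empty list raises IndexError

for path in LazyScan(next_item):
    print(path)
```

Nothing is cached, so this saves memory, but it only works for a single front-to-back pass; it is not a real list.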
--
----------------------------------------------------------------------
ABO: finger abo@minkirri.apana.org.au for more info, including pgp key
----------------------------------------------------------------------