I have a problem in which I process documents from files using Python generators. The number of files I need to process is not known in advance. Each file contains records that consume a considerable amount of memory, so generators are used to process the records one at a time. Here is a summary of the code I am working on:
    def process_all_records(files):
        for f in files:
            fd = open(f, 'r')
            recs = read_records(fd)
            recs_p = (process_records(r) for r in recs)
            write_records(recs_p)
My process_records function checks the content of each record and only returns the records that have a specific sender. My problem is the following: I want a count of the number of elements returned by read_records. I have been keeping track of the number of records in the process_records function using a list:
    def process_records(r):
        if r.sender('sender_of_interest'):
            records_list.append(1)
        else:
            records_list.append(0)
        ...
The problem with this approach is that records_list could grow without bound depending on the input. I want to be able to consume the contents of records_list once it grows to a certain point and then restart the process. For example, after 20 records have been processed, I want to find out how many are from 'sender_of_interest' and how many are from other sources, and then empty the list. Can I do this without using a lock?
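In rough pseudocode, the flush I have in mind would look like this (flush_counts is just a placeholder for whatever reporting I end up doing):

    if len(records_list) >= 20:
        n_interest = sum(records_list)            # entries are 1 for 'sender_of_interest'
        n_other = len(records_list) - n_interest  # entries are 0 for everything else
        flush_counts(n_interest, n_other)         # placeholder: report the counts
        del records_list[:]                       # empty the list and start over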
Answer
You could make your generator a class with an attribute that contains a count of the number of records it has processed. Something like this:
    class RecordProcessor(object):
        def __init__(self, recs):
            self.recs = recs
            self.processed_rec_count = 0

        def __iter__(self):  # makes the instance itself iterable
            for r in self.recs:
                if r.sender('sender_of_interest'):
                    self.processed_rec_count += 1
                    # process record r...
                    yield r  # processed record

    def process_all_records(files):
        for f in files:
            with open(f, 'r') as fd:
                recs_p = RecordProcessor(read_records(fd))
                write_records(recs_p)
                print('records processed:', recs_p.processed_rec_count)

Note that the class defines __iter__ rather than __call__, so the instance can be passed straight to write_records and iterated like any generator.
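If you also want the periodic reset from your question (report and clear the counts every 20 records), the same class can keep two counters and flush them itself. A minimal sketch of that idea; BatchCountingProcessor and the report callback are illustrative names, not part of your original code:

    class BatchCountingProcessor(object):
        """Counts matching/non-matching records; reports every batch_size records."""
        def __init__(self, recs, report, batch_size=20):
            self.recs = recs
            self.report = report          # callback you supply, e.g. print or a logger
            self.batch_size = batch_size
            self.matched = 0
            self.others = 0

        def __iter__(self):
            for r in self.recs:
                if r.sender('sender_of_interest'):
                    self.matched += 1
                    yield r               # only matching records are passed on
                else:
                    self.others += 1
                if self.matched + self.others >= self.batch_size:
                    self.report(self.matched, self.others)  # consume the counts...
                    self.matched = self.others = 0          # ...and start over

As for the lock: a generator only advances when its consumer asks for the next record, so everything here runs in the consumer's thread. As long as the instance is not shared between threads, no lock is needed.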