
Do I understand os.walk right?

Does the loop for root, dir, file in os.walk(startdir) work through these steps?

for root in os.walk(startdir) 
    for dir in root 
        for files in dir
  1. get root of start dir : C:\dir1\dir2\startdir

  2. get folders in C:\dir1\dir2\startdir and return list of folders “dirlist”

  3. get files in the first dirlist item and return the list of files “filelist” as the first item of a list of filelists.

  4. move to the second item in dirlist and return the list of files in this folder “filelist2” as the second item of a list of filelists. etc.

  5. move to the next root in the folder tree and start from 2. etc.

Right? Or does it just get all roots first, then all dirs second, and all files third?


Answer

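os.walk returns a generator that yields one tuple per directory: (current_path, directories in current_path, files in current_path).

Each step of the generator yields the tuple for a single directory. By default the walk is top-down: the starting directory comes first, then os.walk descends into each of its sub-directories in turn, recursively, until no further sub-directories are available below the directory that walk was called upon. So it does not collect all roots first, then all dirs, then all files.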
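To make the visiting order concrete, here is a minimal sketch that builds a small throwaway tree and records what os.walk yields at each step. The helper names walk_order and build_demo_tree and the directory layout are made up for illustration. The lists are sorted in place only so that the order is predictable, since os.walk itself makes no promise about the order of names within one directory.

```python
import os
import tempfile

def walk_order(start):
    """Return (relative path, dirs, files) tuples in the order os.walk yields them."""
    order = []
    for root, dirs, files in os.walk(start):
        dirs.sort()   # in-place sort also fixes the order in which subdirectories are visited
        files.sort()
        order.append((os.path.relpath(root, start), list(dirs), list(files)))
    return order

def build_demo_tree(base):
    # Hypothetical layout: base/a/a1 and base/b, with one file in each directory
    os.makedirs(os.path.join(base, 'a', 'a1'))
    os.makedirs(os.path.join(base, 'b'))
    for rel in ('top.txt',
                os.path.join('a', 'a.txt'),
                os.path.join('a', 'a1', 'deep.txt'),
                os.path.join('b', 'b.txt')):
        open(os.path.join(base, rel), 'w').close()

with tempfile.TemporaryDirectory() as base:
    build_demo_tree(base)
    for entry in walk_order(base):
        print(entry)
```

In this sketch the start directory is reported first, then a, then a/a1, and only then b, so each tuple carries the dirs and files of exactly one directory.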

As such,

next(os.walk(r'C:\dir1\dir2\startdir'))[0] # returns 'C:\dir1\dir2\startdir'
next(os.walk(r'C:\dir1\dir2\startdir'))[1] # returns all the dirs in 'C:\dir1\dir2\startdir'
next(os.walk(r'C:\dir1\dir2\startdir'))[2] # returns all the files in 'C:\dir1\dir2\startdir'
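Each of the three lines above creates a fresh generator and takes only its first tuple, which is why all three describe the start directory. A small sketch of doing the same with a single call and tuple unpacking follows; the helper name first_level and the temporary directory contents are assumptions for the demo.

```python
import os
import tempfile

def first_level(start):
    """Return (root, dirs, files) for the start directory only."""
    # next() pulls just the first tuple from the generator
    root, dirs, files = next(os.walk(start))
    return root, sorted(dirs), sorted(files)

with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, 'sub'))
    open(os.path.join(base, 'f.txt'), 'w').close()
    root, dirs, files = first_level(base)
    print(root == base, dirs, files)
```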

So

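import os

target = 'notes.txt'  # hypothetical name of the file to look for
for path, directories, files in os.walk(r'C:\dir1\dir2\startdir'):
    if target in files:
        # path is the directory currently being visited
        print('found %s' % os.path.join(path, target))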

or this

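import os

def search_file(directory, file):
    # Return the full path of the first match, or None if there is none
    assert os.path.isdir(directory)
    for cur_path, directories, files in os.walk(directory):
        if file in files:
            # cur_path already starts with directory, so join only path and name
            return os.path.join(cur_path, file)
    return None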

or, if you want to do the recursion by hand, one directory level per call:

import os

def search_file(directory, file):
    assert os.path.isdir(directory)
    # next() takes only the first tuple, i.e. the contents of directory itself
    current_path, directories, files = next(os.walk(directory))
    if file in files:
        return os.path.join(directory, file)
    # Not found here: try each sub-directory in turn (nothing to do if there are none)
    for new_directory in directories:
        result = search_file(directory = os.path.join(directory, new_directory), file = file)
        if result:
            return result
    return None
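The functions above stop at the first match. If every match is wanted, the same os.walk loop can be turned into a generator. This is only a sketch: find_all is a hypothetical name, and the temporary tree is made up for the demo.

```python
import os
import tempfile

def find_all(directory, name):
    """Yield the full path of every file called name below directory."""
    for cur_path, directories, files in os.walk(directory):
        if name in files:
            yield os.path.join(cur_path, name)

with tempfile.TemporaryDirectory() as base:
    # Hypothetical layout: the same file name at the top and two levels down
    os.makedirs(os.path.join(base, 'x', 'y'))
    for rel in ('target.txt', os.path.join('x', 'y', 'target.txt')):
        open(os.path.join(base, rel), 'w').close()
    matches = sorted(os.path.relpath(p, base) for p in find_all(base, 'target.txt'))
    print(matches)
```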
User contributions licensed under: CC BY-SA