Does the loop
    for root, dir, file in os.walk(startdir)
work through these steps?
    for root in os.walk(startdir)
        for dir in root
            for files in dir
1. Get the root of the start dir: C:\dir1\dir2\startdir
2. Get the folders in C:\dir1\dir2\startdir and return the list of folders, "dirlist".
3. Get the files in the first dirlist item and return the list of files, "filelist", as the first item of a list of filelists.
4. Move to the second item in dirlist and return the list of files in this folder, "filelist2", as the second item of a list of filelists, etc.
5. Move to the next root in the folder tree and start from 2, etc.
Right? Or does it just get all roots first, then all dirs second, and all files third?
Answer
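os.walk returns a generator that yields one tuple of values (current_path, directories in current_path, files in current_path) for each directory it visits.
Each step of the generator produces the tuple for a single directory. By default it works top-down: it yields the starting directory first, then descends into each of its sub-directories in turn, recursively, until no further sub-directories are reachable from the directory that walk was called on. It does not collect all roots first, then all dirs, then all files.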
As such,
    import os

    # Each os.walk() call starts a fresh generator, so next() gives its first tuple
    next(os.walk(r'C:\dir1\dir2\startdir'))[0]  # returns the start path, C:\dir1\dir2\startdir
    next(os.walk(r'C:\dir1\dir2\startdir'))[1]  # returns all the dirs in C:\dir1\dir2\startdir
    next(os.walk(r'C:\dir1\dir2\startdir'))[2]  # returns all the files in C:\dir1\dir2\startdir
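To see the order in which the tuples arrive, here is a minimal sketch that builds a throwaway tree and records what os.walk yields. The helper name walk_order and the file and folder names are made up for illustration; the in-place sorting is only there to make the order predictable, since os.walk itself does not promise any particular order among siblings.

```python
import os
import tempfile

def walk_order(startdir):
    """Return the (root, dirs, files) tuples in the order os.walk yields them."""
    visited = []
    for root, dirs, files in os.walk(startdir):
        dirs.sort()   # in top-down mode, sorting in place also fixes the visiting order
        files.sort()
        visited.append((os.path.relpath(root, startdir), list(dirs), list(files)))
    return visited

# Hypothetical tree used only for this demonstration:
#   startdir/top.txt
#   startdir/a/a.txt
#   startdir/a/deep/deep.txt
#   startdir/b/b.txt
with tempfile.TemporaryDirectory() as startdir:
    os.makedirs(os.path.join(startdir, "a", "deep"))
    os.makedirs(os.path.join(startdir, "b"))
    for rel in ("top.txt",
                os.path.join("a", "a.txt"),
                os.path.join("a", "deep", "deep.txt"),
                os.path.join("b", "b.txt")):
        open(os.path.join(startdir, rel), "w").close()
    order = walk_order(startdir)

for entry in order:
    print(entry)
```

The start directory comes first with its own dirs and files, then a, then a/deep, and only then b. Every tuple carries the dirs and files of exactly one directory.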
So you can search the whole tree like this:
    import os
    # .... (file is assumed to hold the name being searched for)
    for path, directories, files in os.walk(r'C:\dir1\dir2\startdir'):
        if file in files:
            print('found %s' % os.path.join(path, file))
or as a function that returns the first match:
    import os

    def search_file(directory=None, file=None):
        assert os.path.isdir(directory)
        for cur_path, directories, files in os.walk(directory):
            if file in files:
                # cur_path already starts with directory, so join only it and the name
                return os.path.join(cur_path, file)
        return None
or, if you want to do the recursion yourself, you can do this:
    import os

    def search_file(directory=None, file=None):
        assert os.path.isdir(directory)
        # Take only the first tuple: the contents of directory itself
        current_path, directories, files = next(os.walk(directory))
        if file in files:
            return os.path.join(directory, file)
        elif not directories:  # no sub-directories left to search
            return None
        else:
            for new_directory in directories:
                result = search_file(directory=os.path.join(directory, new_directory), file=file)
                if result:
                    return result
            return None
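A quick way to check a search like this is to run it against a temporary tree. The sketch below is self-contained: find_first is a hypothetical stand-in for the os.walk-based search, and the folder and file names are invented for the test.

```python
import os
import tempfile

def find_first(directory, filename):
    """Return the path of the first file called filename under directory, else None."""
    for cur_path, directories, files in os.walk(directory):
        if filename in files:
            return os.path.join(cur_path, filename)
    return None

with tempfile.TemporaryDirectory() as base:
    # Put the target two levels below the start directory
    nested = os.path.join(base, "sub", "deeper")
    os.makedirs(nested)
    open(os.path.join(nested, "target.txt"), "w").close()

    expected = os.path.join(nested, "target.txt")
    found = find_first(base, "target.txt")
    missing = find_first(base, "nope.txt")

print(found == expected, missing)
```

The file is found even though it sits two levels down, because os.walk keeps descending on its own; a name that does not exist anywhere in the tree gives None.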