Hey, I am new to Python and I want to write a script that goes through the docx documents in a large directory and prints the name of each file that contains a certain phrase.
Here is my code so far:
import os
import docx2txt

os.chdir('C:/Users/epicr/Desktop/Python Stuff/LAB FILES')
text = ''
files = []
for file in os.listdir('C:/Users/epicr/Desktop/Python Stuff/LAB FILES'):
    if file.endswith('.docx'):
        files.append(file)
for i in range(len(files)):
    text += docx2txt.process(files[i])
    if text == str('VENTILATION RATIO'):
        print(i)
My thought process is to convert each of these docx documents to text, then search that text for the phrase ‘VENTILATION RATIO’. If the phrase exists in a file, the name of that file should be printed.
However, the script doesn’t print anything. I know for a fact that at least one of the Word documents contains the phrase ‘VENTILATION RATIO’ (and yes, it is case-sensitive).
Answer
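There is a logic issue in your code. The line text += ... keeps appending every file's contents to one growing string, and text == 'VENTILATION RATIO' is true only when that whole string is exactly the phrase, which never happens for a real document. Instead, assign each file's text on its own and check whether the phrase occurs in it with the in operator.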
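To see why the original comparison never succeeds, here is a small sketch that uses made-up strings in place of real docx2txt output (the sample text is an assumption, purely for illustration). It contrasts an equality test on accumulated text with a per-file membership test.

```python
phrase = "VENTILATION RATIO"

# Stand-ins for what docx2txt.process() might return (invented sample text).
extracted = ["Lab 1\nVENTILATION RATIO: 0.8\n", "Lab 2\nNo match here\n"]

# Original approach: accumulate everything, then compare the whole string.
accumulated = ""
for text in extracted:
    accumulated += text
equals_whole = accumulated == phrase  # True only if the text is exactly the phrase

# Per-file approach: check whether the phrase occurs anywhere in each text.
contains_phrase = [phrase in text for text in extracted]

print(equals_whole)     # False
print(contains_phrase)  # [True, False]
```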
Try this update:
import os
import docx2txt

os.chdir('C:/Users/epicr/Desktop/Python Stuff/LAB FILES')
text = ''
files = []
for file in os.listdir('C:/Users/epicr/Desktop/Python Stuff/LAB FILES'):
    if file.endswith('.docx'):
        files.append(file)
for i in range(len(files)):
    text = docx2txt.process(files[i])  # text for single file
    if 'VENTILATION RATIO' in text:
        print(i, files[i])  # file index and name
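If you want something reusable, the same idea can be wrapped in a function. This is only a sketch under a few assumptions: the name find_docx_containing and its extract_text parameter are made up for illustration, pathlib replaces os.chdir/os.listdir, and files that fail to open (for example the temporary ~$ lock files Word leaves next to open documents) are skipped rather than stopping the run.

```python
from pathlib import Path


def find_docx_containing(directory, phrase, extract_text=None):
    """Return sorted names of .docx files in `directory` whose text contains `phrase`.

    `extract_text` is a hypothetical hook: any callable taking a file path and
    returning a string. It defaults to docx2txt.process.
    """
    if extract_text is None:
        import docx2txt  # imported lazily, only needed for real .docx files
        extract_text = docx2txt.process

    matches = []
    for path in sorted(Path(directory).glob("*.docx")):
        try:
            text = extract_text(str(path))  # text of this single file
        except Exception as exc:
            # Unreadable file: report it and move on to the next one.
            print(f"Skipping {path.name}: {exc}")
            continue
        if phrase in text:  # case-sensitive substring check
            matches.append(path.name)
    return matches


if __name__ == "__main__":
    folder = "C:/Users/epicr/Desktop/Python Stuff/LAB FILES"
    if Path(folder).is_dir():
        for name in find_docx_containing(folder, "VENTILATION RATIO"):
            print(name)
```

Passing the extractor in as a parameter is just a design choice that lets you try the search logic on plain text files without needing docx2txt or real Word documents.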