Currently I’m reading all excel files and appending the “data” sheet to an array from a given directory. The issue is that some of the excel files in the directory shouldn’t actually be there and don’t contain a “data” sheet. The program then breaks whenever it encounters such a case. My workaround is printing the name of the file before it’s added and then manually deleting the file and rerunning the script. I tried creating a Sheet_check function as shown further below, although this significantly slowed down the run time and also didn’t work as intended.
Q: If there is a simple/efficient method to only add an excel file given that a specific sheet exists
JavaScript
x
13
13
1
read_files = glob.glob(os.path.join(file_path, "*.xlsx"))
2
3
file_array = []
4
for files in read_files:
5
#Remove_password_xlsx(files, password)
6
7
#if Sheet_check(files) == False:
8
#read_files.remove(files)
9
10
print(files)
11
data = pd.read_excel(files, 'Data')
12
file_array.append(data)
13
Doesn’t work:
JavaScript
1
10
10
1
def Sheet_check(filename):
2
bAvailable = False
3
book = load_workbook(filename)
4
sheets = book.sheetnames
5
for sheet in sheets:
6
if sheet.upper() == "DATA":
7
bAvailable = True
8
break
9
return bAvailable
10
Advertisement
Answer
Using exceptions should work:
JavaScript
1
17
17
1
from xlrd import XLRDError
2
read_files = glob.glob(os.path.join(file_path, "*.xlsx"))
3
4
file_array = []
5
for files in read_files:
6
#Remove_password_xlsx(files, password)
7
8
#if Sheet_check(files) == False:
9
#read_files.remove(files)
10
11
print(files)
12
try:
13
data = pd.read_excel(files, sheet_name='Data')
14
file_array.append(data)
15
except XLRDError:
16
print('No "Data" sheet')
17