I have files with a lot of weird formatting and I’m trying to create a function that removes any formatting from an xlsx file.
Some guys in here suggested to use “cell.fill = PatternFill(fill_type=None)” to clear any format from a given cell.
path = r'C:DesktopPythonOpenpyxlFormatted.xlsx wb = xl.load_workbook(filename = path) def removeFormatting(file): ws = wb[file] for row in ws.iter_rows(): for cell in row: if cell.value is None: cell.fill = PatternFill(fill_type=None) wb.save(path) for s in wb.sheetnames: removeFormatting(s)
But this won’t change anything. If the cells are empty but colored, then openpyxl still sees them as non empty.
Following this post: Openpyxl check for empty cell
The problem with ws.max_column and ws.max_row is that it will count blank columns as well, thus defeating the purpose.”
@bhaskar was right. When I’m trying to get the max column, I get for all the sheets, the same value as from the first sheet.
col = [] for sheet in wb.worksheets: col.append(sheet.max_column)
So even if there are different sheet dimensions, if the cell has a background color or any other formatting, it will take it as valid non empty cell.
Does anyone know how to solve this?
Thanks in advance!
Advertisement
Answer
This function removes styles from all cells in a given worksheet (passed as ws
object).
So you can open your file, iterate over all worksheets and apply this function to each one:
def removeFormatting(ws): # ws is not the worksheet name, but the worksheet object for row in ws.iter_rows(): for cell in row: cell.style = 'Normal'
If you also want to check info about how to define and apply named styles, take a look here: https://openpyxl.readthedocs.io/en/stable/styles.html#cell-styles-and-named-styles