Skip to content
Advertisement

Majority of my column headers are dates in my dataframe, not able to use the loc function – how do I fix this?

I have a dataframe that shows the number of downloads for each show, where every month is a column, with the actual start of each month being the data column name.

import pandas as pd
df = pd.read_excel(r'C:/Users/TestUser/Documents/Folder/Test.xlsx', sheet_name='Downloads', header=2)
df

df looks like this below:

Show 2017-08-01 00:00:00 2017-09-01 00:00:00 2017-10-01 00:00:00
Show 1 23004 50320 450320
Show 2 30418 74021 92103

However, when I try to access a column using the loc function, I run into an error:

df.loc[:, 2017-08-01 00:00:00]

File “”, line 1 df.loc[:, 2017-07-01 00:00:00] ^ SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

When I put single quotes before the date, I get another error: KeyError: ‘2017-07-01 00:00:00’

The data type for the date column headers are float64, if it helps.

Advertisement

Answer

Convert headers to string

df.columns = df.columns.astype(str)

or

df.columns = df.columns.map(str)

then, if you want to access all the column

df['2017-08-01 00:00:00']

or if you want to access for example cell 5 of this column

df.at[4, '2017-08-01 00:00:00']
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement