How do I separate this dataframe column by month?

A few rows of my dataframe

The third column shows the completion time of each entry. Ideally, I’d want this column to show just the date, dropping the second half (the time) of each element, but I’m not sure how to change the elements. I was able to convert the second column from strings to floats, without the pound symbol, in order to find the sum of costs. However, this column has no specific keyword that I can select across all of the elements to remove.

The second part of my question is whether it is possible to easily create another dataframe that contains only the 2021-05-xx or 2021-06-xx rows. I know there’s a way to make another dataframe by selecting certain rows, like the top 15 or bottom 7, but I don’t know if there’s a way to select rows by the condition I mentioned. I’m thinking it involves Series.str.contains(), but when I put ‘2021-05’ in the (), I get an entire dataframe of Falses.

Answer

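To keep just the date and drop the time, parse the column as datetimes and take its .dt.date accessor (the column is assumed here to be named 'date'). Note that this leaves the column holding plain Python date objects, so its dtype becomes object rather than a pandas datetime type.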

df['date'] = pd.to_datetime(df['date']).dt.date
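As a rough illustration of what that conversion produces, here is a minimal sketch on made-up data. The sample values and the `cost` column are hypothetical; only the `date` column name comes from the line above. It also shows `.dt.normalize()`, an alternative (not part of the original answer) that zeroes out the time while keeping pandas datetime values, which may be more convenient if you plan to compare or filter dates afterwards.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe was only shown as an image.
df = pd.DataFrame({
    "cost": [12.5, 8.0, 20.0],
    "date": ["2021-05-03 14:22:10", "2021-06-17 09:05:00", "2021-07-01 00:30:45"],
})

# Parse the strings into pandas datetimes first.
parsed = pd.to_datetime(df["date"])

# Option 1: plain Python date objects (what .dt.date returns).
as_dates = parsed.dt.date

# Option 2: pandas timestamps with the time set to midnight.
as_midnight = parsed.dt.normalize()

print(as_dates.tolist())
print(as_midnight.dtype)
```

Either result can be assigned back to `df['date']`, depending on whether you need Python dates or pandas datetimes downstream.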

For the second part of the question, creating a new dataframe that contains only the rows dated 2021-05-xx or 2021-06-xx, we can use pandas boolean filtering.

dates = pd.to_datetime(df['date']).dt.normalize()  # pandas datetimes with the time zeroed out
df_filtered = df[(dates >= pd.to_datetime('2021-05-01')) & (dates <= pd.to_datetime('2021-06-30'))]

Here we take advantage of two things: 1) pandas makes it easy to compare dates chronologically using the ordinary comparison operators. 2) Any date of the form 2021-05-xx or 2021-06-xx must fall on or after the first day of May and on or before the last day of June.
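Since the question's title asks about separating the dataframe by month, a year-month key is another option worth considering. The following is only a sketch on hypothetical data (the values and the `cost` column are invented): it formats each date as a 'YYYY-MM' string with `.dt.strftime`, which is close in spirit to the `str.contains('2021-05')` idea from the question, then uses that key both to select May and June and to split the dataframe into one piece per month.

```python
import pandas as pd

# Hypothetical sample data; replace with the real dataframe.
df = pd.DataFrame({
    "cost": [12.5, 8.0, 20.0, 5.0],
    "date": ["2021-05-03 14:22:10", "2021-05-30 23:59:59",
             "2021-06-30 18:00:00", "2021-07-01 00:30:45"],
})

# Build a 'YYYY-MM' string for every row.
month_key = pd.to_datetime(df["date"]).dt.strftime("%Y-%m")

# Rows from May or June 2021 only.
may_june = df[month_key.isin(["2021-05", "2021-06"])]

# One dataframe per month, keyed by the 'YYYY-MM' string.
by_month = {key: group for key, group in df.groupby(month_key)}

print(may_june)
print(list(by_month))
```

With this layout, `by_month["2021-05"]` would hold just the May rows, and the same key works for any month without hard-coding the last day of the month.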

There are also a few GUIs that make it easy to change the formatting of columns and to filter data without having to write the code yourself. I’m the creator of one of these tools, Mito. To filter dates in Mito, you can enter the dates using its calendar input fields, and Mito will generate the equivalent pandas code for you.
