The file contains data by date and time. I want to drop all rows that fall between these dates, including the start and end dates:
data_file = pd.read_csv(r"MyFile.csv", header = None)
start_date = '01/08/2017'
end_date = '29/8/2017'
my_dataframe = my_dataframe.drop([start_date : end_date])
data_file = data_file.to_csv('summary.csv', index = False, header = False)
Any ideas?
Answer
Sample:
rng = pd.date_range('2017-07-02', periods=10, freq='10D')
df = pd.DataFrame({'Date': rng, 'a': range(10)})
print (df)
        Date  a
0 2017-07-02  0
1 2017-07-12  1
2 2017-07-22  2
3 2017-08-01  3
4 2017-08-11  4
5 2017-08-21  5
6 2017-08-31  6
7 2017-09-10  7
8 2017-09-20  8
9 2017-09-30  9
Use boolean indexing to filter by condition, chaining the two conditions with | (bitwise OR) so that only rows before the start date or after the end date are kept:
start_date = '2017-08-01'
end_date = '2017-08-29'
df1 = df[(df['Date'] < start_date) | (df['Date'] > end_date)]
print (df1)
        Date  a
0 2017-07-02  0
1 2017-07-12  1
2 2017-07-22  2
6 2017-08-31  6
7 2017-09-10  7
8 2017-09-20  8
9 2017-09-30  9
Or filter with Series.between, which includes both endpoints by default, and invert the mask with ~:
df1 = df[~df['Date'].between(start_date, end_date)]
print (df1)
        Date  a
0 2017-07-02  0
1 2017-07-12  1
2 2017-07-22  2
6 2017-08-31  6
7 2017-09-10  7
8 2017-09-20  8
9 2017-09-30  9
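Applied to a file like the one in the question, the same idea might look like the sketch below. It rests on a few assumptions that are not stated in the question: the timestamps sit in the first column, they are written day-first as `dd/mm/YYYY HH:MM`, and the inline `csv_text` is only a made-up stand-in for `MyFile.csv`. Because the column also carries a time, comparing against `'2017-08-29'` alone would keep rows later on 29 August, so the sketch uses an exclusive upper bound one day after the end date.

```python
import io
import pandas as pd

# Hypothetical stand-in for MyFile.csv: no header row,
# day-first timestamps assumed to be in column 0.
csv_text = """31/07/2017 23:50,1
01/08/2017 00:00,2
15/08/2017 12:30,3
29/08/2017 18:45,4
30/08/2017 00:00,5
"""

data_file = pd.read_csv(io.StringIO(csv_text), header=None)

# Parse explicitly as day/month/year so 01/08/2017 is 1 August, not 8 January.
# The format string is an assumption about how the file is written.
dates = pd.to_datetime(data_file[0], format='%d/%m/%Y %H:%M')

start = pd.Timestamp('2017-08-01')
# Exclusive bound one day after the end date, so every time on 29 August is dropped too.
stop = pd.Timestamp('2017-08-29') + pd.Timedelta(days=1)

# Keep only rows strictly before the start date or on/after the day following the end date.
kept = data_file[(dates < start) | (dates >= stop)]

# Without a path, to_csv returns the CSV text; pass 'summary.csv' to write a file instead.
print(kept.to_csv(index=False, header=False))
```

The original text column is left untouched and only the parsed `dates` series is used for the mask, so the rows written out keep the same date formatting as the input.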