Skip to content
Advertisement

find days between 2 dates in python but only number

I was trying to find difference of a series of dates and a date. for example, the series is from may1 to june1 which is

date = pd.DataFrame()

In [0]: date['test'] = pd.date_range("2021-05-01", "2021-06-01", freq = "D")

Out[0]: date
    test
0   2021-05-01 00:00:00
1   2021-05-02 00:00:00
2   2021-05-03 00:00:00
3   2021-05-04 00:00:00
4   2021-05-05 00:00:00
5   2021-05-06 00:00:00
6   2021-05-07 00:00:00
7   2021-05-08 00:00:00
8   2021-05-09 00:00:00
9   2021-05-10 00:00:00

In[1]
date['test'] = date['test'].dt.date

Out[1]:
    test
0   2021-05-01
1   2021-05-02
2   2021-05-03
3   2021-05-04
4   2021-05-05
5   2021-05-06
6   2021-05-07
7   2021-05-08
8   2021-05-09
9   2021-05-10

In[2]:date['base'] = dt.strptime("2021-05-01",'%Y-%m-%d')

Out[2]:
0   2021-05-01 00:00:00
1   2021-05-01 00:00:00
2   2021-05-01 00:00:00
3   2021-05-01 00:00:00
4   2021-05-01 00:00:00
5   2021-05-01 00:00:00
6   2021-05-01 00:00:00
7   2021-05-01 00:00:00
8   2021-05-01 00:00:00
9   2021-05-01 00:00:00

In[3]:date['base'] = date['base'].dt.date

Out[3]:
    base
0   2021-05-01
1   2021-05-01
2   2021-05-01
3   2021-05-01
4   2021-05-01
5   2021-05-01
6   2021-05-01
7   2021-05-01
8   2021-05-01
9   2021-05-01

In[4]:date['test']-date['base']

Out[4]: 
    diff
0   0 days 00:00:00.000000000
1   1 days 00:00:00.000000000
2   2 days 00:00:00.000000000
3   3 days 00:00:00.000000000
4   4 days 00:00:00.000000000
5   5 days 00:00:00.000000000
6   6 days 00:00:00.000000000
7   7 days 00:00:00.000000000
8   8 days 00:00:00.000000000
9   9 days 00:00:00.000000000
10  10 days 00:00:00.000000000

the only thing i could get is this. I don’t want anything other than the number 1-10 cuz i need them for further numerical calculation but i can’t get rid of those. Also how could i construct a time series which just outputs the date not the hms after it? i don’t want to manually .dt.date for all of those and it sometimes mess things up

Advertisement

Answer

You don’t need to create a column base for this, simply do:

>>> (date['test'] - pd.to_datetime("2021-05-01", format='%Y-%m-%d')).dt.days
0      0
1      1
2      2
3      3
4      4
...
27    27
28    28
29    29
30    30
31    31
Name: test, dtype: int64
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement