I was trying to find the difference between a series of dates and a single date. For example, the series runs from May 1 to June 1:
# imports assumed (not shown in the original post)
import pandas as pd
from datetime import datetime as dt

date = pd.DataFrame()

In [0]: date['test'] = pd.date_range("2021-05-01", "2021-06-01", freq="D")

Out[0]: date
   test
0  2021-05-01 00:00:00
1  2021-05-02 00:00:00
2  2021-05-03 00:00:00
3  2021-05-04 00:00:00
4  2021-05-05 00:00:00
5  2021-05-06 00:00:00
6  2021-05-07 00:00:00
7  2021-05-08 00:00:00
8  2021-05-09 00:00:00
9  2021-05-10 00:00:00

In [1]: date['test'] = date['test'].dt.date

Out[1]:
   test
0  2021-05-01
1  2021-05-02
2  2021-05-03
3  2021-05-04
4  2021-05-05
5  2021-05-06
6  2021-05-07
7  2021-05-08
8  2021-05-09
9  2021-05-10

In [2]: date['base'] = dt.strptime("2021-05-01", '%Y-%m-%d')

Out[2]:
0  2021-05-01 00:00:00
1  2021-05-01 00:00:00
2  2021-05-01 00:00:00
3  2021-05-01 00:00:00
4  2021-05-01 00:00:00
5  2021-05-01 00:00:00
6  2021-05-01 00:00:00
7  2021-05-01 00:00:00
8  2021-05-01 00:00:00
9  2021-05-01 00:00:00

In [3]: date['base'] = date['base'].dt.date

Out[3]:
   base
0  2021-05-01
1  2021-05-01
2  2021-05-01
3  2021-05-01
4  2021-05-01
5  2021-05-01
6  2021-05-01
7  2021-05-01
8  2021-05-01
9  2021-05-01

In [4]: date['test'] - date['base']

Out[4]:
    diff
0   0 days 00:00:00.000000000
1   1 days 00:00:00.000000000
2   2 days 00:00:00.000000000
3   3 days 00:00:00.000000000
4   4 days 00:00:00.000000000
5   5 days 00:00:00.000000000
6   6 days 00:00:00.000000000
7   7 days 00:00:00.000000000
8   8 days 00:00:00.000000000
9   9 days 00:00:00.000000000
10  10 days 00:00:00.000000000

This is the only result I could get. I only want the plain day numbers, because I need them for further numerical calculation, but I can’t get rid of the "days 00:00:00.000000000" part. Also, how can I construct a time series that outputs just the date, without the hours, minutes, and seconds after it? I don’t want to call .dt.date manually on every column, and it sometimes messes things up.
Answer
You don’t need to create a base column for this; simply do:
>>> (date['test'] - pd.to_datetime("2021-05-01", format='%Y-%m-%d')).dt.days
0      0
1      1
2      2
3      3
4      4
      ..
27    27
28    28
29    29
30    30
31    31
Name: test, dtype: int64
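
For completeness, here is a minimal end-to-end sketch of the same idea. It assumes the test column is left as datetime64 (that is, the .dt.date step from the question is skipped), since .dt.date turns the column into plain Python date objects. The names base, diff_days and labels are only illustrative, and the strftime line is one possible way to get date-only text for display, not something from the original answer.

```python
import pandas as pd

# Daily range from the question. The column is kept as datetime64,
# i.e. no .dt.date conversion is applied (assumption of this sketch).
date = pd.DataFrame({"test": pd.date_range("2021-05-01", "2021-06-01", freq="D")})

# Reference date as a pandas Timestamp (illustrative name).
base = pd.Timestamp("2021-05-01")

# Subtracting a Timestamp yields a timedelta column;
# .dt.days extracts the whole-day count as integers.
date["diff_days"] = (date["test"] - base).dt.days

print(date["diff_days"].head().tolist())  # [0, 1, 2, 3, 4]

# Optional: date-only strings for display, leaving the
# underlying datetime column untouched for arithmetic.
labels = date["test"].dt.strftime("%Y-%m-%d")
print(labels.iloc[0])  # 2021-05-01
```

Keeping the column as datetime64 and formatting only when displaying means the date arithmetic keeps working, while diff_days is an ordinary integer column that can be used in further numerical calculations.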