I have a dataframe similar to the following:
          business_id             date
    0     --0r8K_AQ4FZfLsX3ZYRDA  [2017-09-03 17:13:59]
    1     --0zrn43LEaB4jUWTQH_Bg  [2010-10-08 22:21:20, 2010-11-01 21:29:14, 2...
    2     --164t1nclzzmca7eDiJMw  [2010-02-26 02:06:53, 2010-02-27 08:00:09, 2...
    3     --2aF9NhXnNVpDV0KS3xBQ  [2014-11-03 16:35:35, 2015-01-30 18:16:03, 2...
    4     --2mEJ63SC_8_08_jGgVIg  [2010-12-15 17:10:46, 2013-12-28 00:27:54, 2...
    ...
    997   -SjRCXID7eXewqloY3V86w  [2015-12-13 02:48:00, 2016-01-21 22:31:31, 2...
    998   -Sjrz1Mt9RY4r6ibxzGs0Q  [2016-08-08 19:23:27, 2016-08-15 16:03:29, 2...
    999   -Sk9ZND7V2x8RuauMH0FRw  [2010-09-05 02:04:25, 2010-10-15 22:48:00, 2...
    1000  -SkNedh2bJHPOcKfoFlTvg  [2013-09-01 02:54:45, 2013-10-22 16:59:13, 2...
    1001  -SkwKPbo5oK1-NtKkupNvw  [2010-09-11 20:23:45, 2011-05-26 16:24:35, 2...
What I am trying to do is:
- Convert all the values in each list to datetimes
- Keep only the values that are later than 2018-01-01
For the first step, I tried using apply with a function so that it covers every element in each list:
    def convert_to_date(d):
        pd.to_datetime(d, format='%Y-%m-%d %H:%M:%S')

    checkin_data['date'].apply(convert_to_date)
However, the result looked like this:
    0       None
    1       None
    2       None
    3       None
    4       None
    ...
    997     None
    998     None
    999     None
    1000    None
    1001    None
    Name: date, Length: 1002, dtype: object
How should I fix it? Thank you for your help!
Answer
Add return to the function so that it no longer returns None (the missing values), and keep only the later dates with boolean indexing:
    print(checkin_data)
                  business_id                                        date
    0  --0r8K_AQ4FZfLsX3ZYRDA  [2022-09-03 17:13:59]
    1  --0zrn43LEaB4jUWTQH_Bg  [2018-10-08 22:21:20, 2010-11-01 21:29:14]
    2  --164t1nclzzmca7eDiJMw  [2019-02-26 02:06:53, 2030-02-27 08:00:09]
    3  --2aF9NhXnNVpDV0KS3xBQ  [2014-11-03 16:35:35, 2015-01-30 18:16:03]
    4  --2mEJ63SC_8_08_jGgVIg  [2010-12-15 17:10:46, 2013-12-28 00:27:54]

    def convert_to_date(d):
        x = pd.to_datetime(d)
        return x[x > '2018-01-01'].tolist()


    checkin_data['date'] = checkin_data['date'].apply(convert_to_date)
    print(checkin_data)
                  business_id                                        date
    0  --0r8K_AQ4FZfLsX3ZYRDA  [2022-09-03 17:13:59]
    1  --0zrn43LEaB4jUWTQH_Bg  [2018-10-08 22:21:20]
    2  --164t1nclzzmca7eDiJMw  [2019-02-26 02:06:53, 2030-02-27 08:00:09]
    3  --2aF9NhXnNVpDV0KS3xBQ  []
    4  --2mEJ63SC_8_08_jGgVIg  []
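To see the difference side by side, here is a minimal self-contained sketch. It assumes each cell of the date column holds a list of date strings, as in the question. The sample frame, its shortened business_id values, the helper name convert_without_return, and the cutoff parameter are made up for illustration and are not part of the original data or code.

```python
import pandas as pd

# Hypothetical sample standing in for checkin_data (ids shortened for illustration).
checkin_data = pd.DataFrame({
    "business_id": ["a", "b", "c"],
    "date": [
        ["2022-09-03 17:13:59"],
        ["2018-10-08 22:21:20", "2010-11-01 21:29:14"],
        ["2014-11-03 16:35:35", "2015-01-30 18:16:03"],
    ],
})


def convert_without_return(d):
    # The conversion happens, but the result is discarded,
    # so Python implicitly returns None for every row.
    pd.to_datetime(d)


def convert_to_date(d, cutoff=pd.Timestamp("2018-01-01")):
    # Convert the list of strings to a DatetimeIndex.
    x = pd.to_datetime(d)
    # Boolean indexing keeps only the timestamps later than the cutoff.
    return x[x > cutoff].tolist()


broken = checkin_data["date"].apply(convert_without_return)
fixed = checkin_data["date"].apply(convert_to_date)

print(broken)  # every row is None
print(fixed)   # lists of Timestamps later than the cutoff; may be empty
```

Rows with no date after the cutoff end up as empty lists rather than missing values, which matches the output shown above.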