Now I have a dataframe similar as follows
business_id date 0 --0r8K_AQ4FZfLsX3ZYRDA [2017-09-03 17:13:59] 1 --0zrn43LEaB4jUWTQH_Bg [2010-10-08 22:21:20, 2010-11-01 21:29:14, 2... 2 --164t1nclzzmca7eDiJMw [2010-02-26 02:06:53, 2010-02-27 08:00:09, 2... 3 --2aF9NhXnNVpDV0KS3xBQ [2014-11-03 16:35:35, 2015-01-30 18:16:03, 2... 4 --2mEJ63SC_8_08_jGgVIg [2010-12-15 17:10:46, 2013-12-28 00:27:54, 2... ... ... ... 997 -SjRCXID7eXewqloY3V86w [2015-12-13 02:48:00, 2016-01-21 22:31:31, 2... 998 -Sjrz1Mt9RY4r6ibxzGs0Q [2016-08-08 19:23:27, 2016-08-15 16:03:29, 2... 999 -Sk9ZND7V2x8RuauMH0FRw [2010-09-05 02:04:25, 2010-10-15 22:48:00, 2... 1000 -SkNedh2bJHPOcKfoFlTvg [2013-09-01 02:54:45, 2013-10-22 16:59:13, 2... 1001 -SkwKPbo5oK1-NtKkupNvw [2010-09-11 20:23:45, 2011-05-26 16:24:35, 2...
What I am trying to do is to
- Convert all the values in the list to date
- Filter the value which are only later than 2018-01-01
In the first step, what I tried to do is to use a apply function so that I can cover all elements in the list:
def convert_to_date(d): pd.to_datetime(d, format='%Y-%m-%d %H:%M:%S') checkin_data['date'].apply(convert_to_date)
However, the result was like this
0 None 1 None 2 None 3 None 4 None ... 997 None 998 None 999 None 1000 None 1001 None Name: date, Length: 1002, dtype: object
How should I fix it? Thank you for your help!
Advertisement
Answer
Add return
for avoid missing values and filter greater values in boolean indexing
:
print (checkin_data) business_id date 0 --0r8K_AQ4FZfLsX3ZYRDA [2022-09-03 17:13:59] 1 --0zrn43LEaB4jUWTQH_Bg [2018-10-08 22:21:20, 2010-11-01 21:29:14] 2 --164t1nclzzmca7eDiJMw [2019-02-26 02:06:53, 2030-02-27 08:00:09] 3 --2aF9NhXnNVpDV0KS3xBQ [2014-11-03 16:35:35, 2015-01-30 18:16:03] 4 --2mEJ63SC_8_08_jGgVIg [2010-12-15 17:10:46, 2013-12-28 00:27:54] def convert_to_date(d): x = pd.to_datetime(d) return x[x > '2018-01-01'].tolist() checkin_data['date'] = checkin_data['date'].apply(convert_to_date) print (checkin_data) business_id date 0 --0r8K_AQ4FZfLsX3ZYRDA [2022-09-03 17:13:59] 1 --0zrn43LEaB4jUWTQH_Bg [2018-10-08 22:21:20] 2 --164t1nclzzmca7eDiJMw [2019-02-26 02:06:53, 2030-02-27 08:00:09] 3 --2aF9NhXnNVpDV0KS3xBQ [] 4 --2mEJ63SC_8_08_jGgVIg []