After exploring one row of an API response, I found that it contains all of the information I need:
df['items'][0]
{'tags': ['perl'],
'owner': {'reputation': 93,
'user_id': 6536089,
'user_type': 'registered',
'accept_rate': 0,
'profile_image': 'https://www.gravatar.com/avatar/f8b30a65d171e2a305745589dc02caba?s=256&d=identicon&r=PG&f=1',
'display_name': 'andy',
'link': 'https://stackoverflow.com/users/6536089/andy'},
'score': 0,
'last_activity_date': 1658173974,
'creation_date': 1658110836, # <----
'last_edit_date': 1658173974, # <----
'question_id': 73016722}
I have been using this code to obtain the creation_date values:
df['items'].apply(lambda value : value['creation_date'] if isinstance(value, dict) else np.nan)
Here is where I got stuck: I found that some rows don't have a last_edit_date value.
When I try to run the same code using the name last_edit_date I get an error.
df['items'].apply(lambda value : value['last_edit_date'] if isinstance(value, dict) else np.nan)
KeyError: 'last_edit_date'
Answer
You can simplify your code a lot by using Series.str.get, which looks the key up in each dict and does not raise a KeyError when the key is absent:
Given:
items
0 {'tags': ['perl'], 'owner': {'reputation': 93,...
Doing:
df['last_edit_date'] = df['items'].str.get('last_edit_date')
print(df)
Output:
items last_edit_date
0 {'tags': ['perl'], 'owner': {'reputation': 93,... 1658173974
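The example above only covers a row where last_edit_date exists, so here is a minimal sketch of the missing-key case. The three-row frame is made-up sample data, not the original API response, and the names via_str_get and via_apply are only for illustration. It assumes a pandas version in which Series.str.get accepts dict elements. Whether the missing entry comes back as None or NaN may depend on the pandas version, so the sketch checks it with pd.isna instead of assuming one or the other.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 1 has no 'last_edit_date', row 2 is not a dict at all.
df = pd.DataFrame({
    "items": [
        {"creation_date": 1658110836, "last_edit_date": 1658173974},
        {"creation_date": 1658110900},
        None,
    ]
})

# Series.str.get looks the key up in each dict; an absent key gives a missing value.
via_str_get = df["items"].str.get("last_edit_date")

# Explicit alternative: keep the original lambda, but use dict.get with a default
# instead of value['last_edit_date'], which is what raised the KeyError.
via_apply = df["items"].apply(
    lambda value: value.get("last_edit_date", np.nan) if isinstance(value, dict) else np.nan
)

print(via_str_get.isna().tolist())
print(via_apply.isna().tolist())
```

Both versions leave the rows without last_edit_date as missing values, so they can be filtered afterwards with isna() or dropna().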