I need to scrape hundreds of pages, and instead of storing the whole JSON of each page, I want to store just a few fields from each page in a pandas DataFrame. The problem is at the start, when the DataFrame is empty: I need to fill a DataFrame that has no columns or rows yet. The loop below does not work correctly:
import pandas as pd
import requests

cids = [4100,4101,4102,4103,4104]
df = pd.DataFrame()

for i in cids:
    url_info = requests.get(f'myurl/{i}/profile')
    jdata = url_info.json()
    df['Customer_id'] = i
    df['Name'] = jdata['user']['profile']['Name']

In this case, what should I do?
Answer
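You can solve this by using enumerate() together with loc. Assigning a scalar with df['Customer_id'] = i sets the value on every existing row, and an empty DataFrame has none, so no rows are ever added. df.loc[index, column] = value creates the row (and the column) when it does not exist yet: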
for index, i in enumerate(cids):
    url_info = requests.get(f'myurl/{i}/profile')
    jdata = url_info.json()
    df.loc[index, 'Customer_id'] = i
    df.loc[index, 'Name'] = jdata['user']['profile']['Name']
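
As an alternative to growing the DataFrame one row at a time, a common pattern is to collect each page's fields in a plain list of dicts and build the DataFrame once at the end. This is only a sketch: fetch_profile is a hypothetical stand-in for requests.get(f'myurl/{i}/profile').json() and returns made-up data, so the example runs without network access. The JSON layout is assumed to match the one in the question.

```python
import pandas as pd


def fetch_profile(cid):
    # Hypothetical stand-in for requests.get(f'myurl/{cid}/profile').json().
    # It returns made-up data shaped like the JSON in the question.
    return {'user': {'profile': {'Name': f'name-{cid}'}}}


cids = [4100, 4101, 4102, 4103, 4104]

# Collect one dict per page; no DataFrame is touched inside the loop.
rows = []
for cid in cids:
    jdata = fetch_profile(cid)
    rows.append({
        'Customer_id': cid,
        'Name': jdata['user']['profile']['Name'],
    })

# Build the DataFrame in a single step from the collected rows.
df = pd.DataFrame(rows, columns=['Customer_id', 'Name'])
```

With hundreds of pages this should avoid repeatedly enlarging the DataFrame, and it lets pandas infer each column's dtype from the complete data. In real use, replace the body of fetch_profile with the actual request.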