I need to scrape hundreds of pages, and instead of storing the whole JSON of each page I only want to store a few columns from each page in a pandas DataFrame. However, at the start the DataFrame is empty, with no columns or rows, and that is where I run into a problem: I need to fill it from scratch. The loop below does not work correctly:
import pandas as pd
import requests

cids = [4100, 4101, 4102, 4103, 4104]
df = pd.DataFrame()

for i in cids:
    url_info = requests.get(f'myurl/{i}/profile')
    jdata = url_info.json()
    df['Customer_id'] = i
    df['Name'] = jdata['user']['profile']['Name']
    ...
In this case, what should I do?
Answer
You can solve this by using enumerate() together with loc. In your loop, df['Customer_id'] = i assigns a scalar to the whole column; on an empty frame that creates a zero-row column, so the DataFrame never gains any rows, and each iteration just overwrites the previous one. Writing with df.loc[index, ...] instead enlarges the frame by one row per iteration:
for index, i in enumerate(cids):
    url_info = requests.get(f'myurl/{i}/profile')
    jdata = url_info.json()
    df.loc[index, 'Customer_id'] = i
    df.loc[index, 'Name'] = jdata['user']['profile']['Name']
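To see the pattern work without live requests, here is a minimal self-contained sketch; sample_responses is a made-up stand-in for the JSON payloads that the real requests.get(...).json() calls would return:

```python
import pandas as pd

# Hypothetical payloads standing in for the JSON returned per customer id
sample_responses = {
    4100: {'user': {'profile': {'Name': 'Alice'}}},
    4101: {'user': {'profile': {'Name': 'Bob'}}},
}

cids = [4100, 4101]
df = pd.DataFrame()  # starts with no columns or rows

for index, i in enumerate(cids):
    jdata = sample_responses[i]  # real loop: requests.get(f'myurl/{i}/profile').json()
    # .loc with a new row label enlarges the frame by one row
    df.loc[index, 'Customer_id'] = i
    df.loc[index, 'Name'] = jdata['user']['profile']['Name']

print(df)
```

One design note: enlarging a DataFrame row by row with .loc is fine for a few hundred pages, but each assignment can copy data, so for larger scrapes it is usually faster to collect each page's fields as a dict in a plain Python list and build the DataFrame once at the end with pd.DataFrame(rows).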