I am scraping Glassdoor companies. There is a benefits section which will be different for every company, so I am grabbing all the elements in the div. That returns a list. Is there any way to convert these into key:value pairs while scraping? They need to be written to a CSV inside a loop.
    dict1 = {
        "Company URL": company_url,
        "Over All Rating": over_all_rating,
        "Most Commented": final,
        "Benefits": benefits,
        "Benefits List": all_ben_list,
    }
    with open('Glassdoor.csv', 'a+', encoding='utf-8-sig') as f:
        w = csv.DictWriter(f, dict1.keys())
        if not header_added:
            w.writeheader()
As of now, I am writing the list in 1 column like so.
    Benefits
    ['Health Care & Insurance (2747)', 'Life Insurance (2423)', 'Disability Insurance (674)', 'Dental..]
Ideally it should be like this.
    Health Care & Insurance    Life Insurance    Disability Insurance    ..
    2747                       2423              674
I have tried but cannot think of a way to do this, because the characters are returned one at a time. I cannot find a way to make the headline text the key and the number after it the value, so I cannot come up with a way to create a dict out of this. Any help is much appreciated.
Answer
You can use the string find() method to locate the parentheses, slice out the name and the number, and save them into a dictionary:
    benefits = ['Health Care & Insurance (2747)', 'Life Insurance (2423)', 'Disability Insurance (674)']
    info_dict = {}
    for benefit in benefits:
        # Everything before "(" is the name; strip the trailing space
        key = benefit[:benefit.find("(")].strip()
        # The text between "(" and ")" is the count
        value = benefit[benefit.find("(") + 1:benefit.find(")")]
        info_dict[key] = int(value)
    print(info_dict)

Output is:

    {'Health Care & Insurance': 2747, 'Life Insurance': 2423, 'Disability Insurance': 674}
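Since the benefits differ per company, the CSV columns will differ per row too, so writing the header once while appending inside the loop may not line up. One possible approach, as a rough sketch only: parse each list into a dict, collect the rows, build the union of all keys, and write everything in one pass. The names parse_benefits, write_rows and scraped are made up for illustration, the example.com URLs are placeholders, and the sketch assumes every entry ends in "(number)".

```python
import csv
import io

def parse_benefits(items):
    """Turn ['Name (123)', ...] into {'Name': 123, ...} (hypothetical helper)."""
    parsed = {}
    for item in items:
        # rpartition splits on the LAST "(", so a name that itself
        # contains parentheses is kept intact.
        name, sep, count = item.rpartition("(")
        if not sep:
            continue  # no "(" found: skip entries without a count
        # Assumption: the count is digits, possibly with thousands commas.
        parsed[name.strip()] = int(count.rstrip(") ").replace(",", ""))
    return parsed

# Placeholder data standing in for what the scraper collects per company.
scraped = [
    {"Company URL": "https://example.com/a",
     "Benefits List": ["Health Care & Insurance (2747)", "Life Insurance (2423)"]},
    {"Company URL": "https://example.com/b",
     "Benefits List": ["Life Insurance (12)", "Disability Insurance (674)"]},
]

rows = []
fieldnames = ["Company URL"]
for company in scraped:
    row = {"Company URL": company["Company URL"]}
    row.update(parse_benefits(company["Benefits List"]))
    rows.append(row)
    # Grow the header with any benefit name not seen before, keeping order.
    for key in row:
        if key not in fieldnames:
            fieldnames.append(key)

def write_rows(f, rows, fieldnames):
    # restval fills the cells of benefits a company does not list.
    w = csv.DictWriter(f, fieldnames=fieldnames, restval="")
    w.writeheader()
    w.writerows(rows)

# Written to an in-memory buffer here; for a real file, pass
# open('Glassdoor.csv', 'w', newline='', encoding='utf-8-sig') instead.
buffer = io.StringIO()
write_rows(buffer, rows, fieldnames)
csv_text = buffer.getvalue()
print(csv_text)
```

Collecting the rows first trades a little memory for a consistent header. If the file really has to be appended to row by row, the full set of benefit names would need to be known up front.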