I would like to know how to save data from my Python dictionary to a CSV file while the dictionary is being built, i.e. as soon as a dictionary row is created, it should be written directly to the CSV file.
I'm using the following code:
    data = []
    with open('urls.txt', 'r') as inf:
        for row in inf:
            url = row.strip()
            response = requests.get(url, headers={'User-agent': 'Mozilla/5.0'})
            if response.ok:
                try:
                    soup = BeautifulSoup(response.text, "html.parser")
                    text = soup.select_one('div.para_content_text').get_text(strip=True)
                    topic = soup.select_one('div.article_tags_topics').get_text(strip=True)
                    tags = soup.select_one('div.article_tags_tags').get_text(strip=True)
                except AttributeError:
                    print(" ")
                data.append(
                    {
                        'text': text,
                        'topic': topic,
                        'tags': tags
                    }
                )
                pd.DataFrame(data).to_csv('text.csv', index=False, header=True)
                time.sleep(3)
I would like to obtain one column each for text, topic and tags. Do you have an idea how to change my two-step code (build the dictionary, then convert it to CSV) into a dynamic one?
Answer
I reshuffled your code a bit:
1. I moved data.append into the try block, so a row is only appended when all three fields were parsed successfully.
2. I moved pd.DataFrame(data).to_csv into the try block as well, so the CSV is re-saved every time new data is appended to the list.
    import time

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    data = []
    with open('urls.txt', 'r') as inf:
        for row in inf:
            url = row.strip()
            response = requests.get(url, headers={'User-agent': 'Mozilla/5.0'})
            if response.ok:
                try:
                    soup = BeautifulSoup(response.text, "html.parser")
                    text = soup.select_one('div.para_content_text').get_text(strip=True)
                    topic = soup.select_one('div.article_tags_topics').get_text(strip=True)
                    tags = soup.select_one('div.article_tags_tags').get_text(strip=True)
                    # Append only when all three elements were found
                    data.append(
                        {
                            'text': text,
                            'topic': topic,
                            'tags': tags
                        }
                    )
                    # Re-save the whole CSV after every new row
                    pd.DataFrame(data).to_csv('text.csv', index=False, header=True)
                except AttributeError:
                    # select_one() returned None: an element is missing on this page
                    print(" ")
                time.sleep(3)
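The code above rewrites the entire CSV on every iteration. If you would rather append each row to the file as soon as it is created, one possible approach is the standard library's csv.DictWriter with the file opened in append mode. This is only a sketch: the helper name append_row and the FIELDNAMES constant are made up for illustration and are not part of the original code.

```python
import csv
import os

# Column order taken from the question; the constant name is hypothetical.
FIELDNAMES = ["text", "topic", "tags"]


def append_row(path, row, fieldnames=FIELDNAMES):
    """Append one dict as a CSV row; write the header only if the file is new or empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" is what the csv module documentation recommends for csv writers
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

Inside the try block, the data.append(...) and pd.DataFrame(data).to_csv(...) calls could then be replaced by a single append_row('text.csv', {'text': text, 'topic': topic, 'tags': tags}). Because the file is opened in append mode, running the script again adds rows to an existing text.csv instead of overwriting it, so delete the file first if you want a fresh start.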