My code seems to be outputting the list I want. However, when I write the list to CSV, the .csv file does not contain the same result. I am sure something is not right at the end of my code. Could anyone please shed some light? Thanks in advance.
import pandas as pd

df = pd.read_csv('microRuleSet-row.csv')

deduplicated_list = list()
for index, row in df.iterrows():
    for item in row:
        if item not in deduplicated_list:
            deduplicated_list.append(item)

print(deduplicated_list)
df.to_csv('microRuleSet-row-noDupes.csv', index=False)
Answer
I have not used pandas before, but it looks like you are writing df, the original data you loaded from microRuleSet-row.csv, back out to CSV. You have to export the deduplicated list instead. If the goal is that each row must have no duplicated items, the code below will do that. The first (header) row of the output is now numbered 0 to 5. This can be changed back to the original headings by adding placeholders for the extra empty CSV cells.
import pandas as pd

df = pd.read_csv('microRuleSet-row.csv')

no_duplicates_list = []
for index, row in df.iterrows():
    # Collect each value of this row only the first time it appears
    new_row = []
    for item in row:
        if item not in new_row:
            new_row.append(item)
    no_duplicates_list.append(new_row)

print(no_duplicates_list)

# Build a new DataFrame from the deduplicated rows and export that one
df2 = pd.DataFrame(no_duplicates_list)
df2.to_csv('microRuleSet-row-noDupes.csv', index=False)
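To keep the original headings rather than 0 to 5, one option is to pad each deduplicated row back to the original width and reuse df.columns. The following is only a rough sketch: the helper name dedupe_rows, the empty-string placeholder, and the sample data are assumptions made for illustration, not part of the original question. Missing values (NaN) are not treated specially here.

```python
import pandas as pd


def dedupe_rows(df: pd.DataFrame, placeholder: str = "") -> pd.DataFrame:
    """Drop repeated values within each row, keeping first occurrences.

    Hypothetical helper: the name and the placeholder default are
    illustrative assumptions.
    """
    rows = []
    for values in df.itertuples(index=False, name=None):
        # dict.fromkeys keeps the first occurrence of each value, in order
        unique = list(dict.fromkeys(values))
        # Pad so every row still has one cell per original column
        unique += [placeholder] * (len(df.columns) - len(unique))
        rows.append(unique)
    # Reuse the original headings instead of the default 0..n-1 labels
    return pd.DataFrame(rows, columns=df.columns)


# Made-up sample data standing in for microRuleSet-row.csv
sample = pd.DataFrame(
    [["a", "b", "a"], ["c", "c", "d"]],
    columns=["col1", "col2", "col3"],
)
result = dedupe_rows(sample)
print(result)
```

With the real file, the same helper could be applied to the DataFrame returned by pd.read_csv('microRuleSet-row.csv'), and the result written out with to_csv('microRuleSet-row-noDupes.csv', index=False).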