Joining two CSV files on a common column in Python without pandas

Question

I would like to know how to merge two CSV files so that I can run queries on the result. Note that I am not allowed to use the "pandas" library. As an example, I have these two CSV files: data.csv: enti.csv: What I'm looking for is to join them through cod_enti and thus be able to evaluate the

Accepted Answer

You can do this in a pretty straightforward way with just the csv module.

I create a map from each cod_enti value to its row in data.csv. Then, for every row in enti.csv that has a matching cod_enti, I update the row in the map:

```python
import csv
import pprint

# Map each cod_enti to its data row, e.g.:
# {208: {cod_pers:2317422, cod_enti:208, fec_venc:04/12/2022}, ...}
cod_enti_row_map = {}
with open("data.csv", newline="") as f:
    reader = csv.DictReader(f, skipinitialspace=True)  # because your header row has leading spaces
    for row in reader:
        cod_enti = row["cod_enti"]
        cod_enti_row_map[cod_enti] = row

print("Map before join")
pprint.pprint(cod_enti_row_map, width=100, sort_dicts=False)

# Now, update each row in the map with cod_market for the key, cod_enti
with open("enti.csv", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        cod_enti = row["cod_enti"]
        # skip cod_enti in enti.csv that is not in data.csv, like 209
        if cod_enti not in cod_enti_row_map:
            continue
        cod_enti_row_map[cod_enti].update(row)

print("Map after join")
pprint.pprint(cod_enti_row_map, width=100, sort_dicts=False)
```

Here's what I get when I run that:

```
Map before join
{'208': {'cod_pers': '2317422', 'cod_enti': '208', 'fec_venc': '04/12/2022'},
 '210': {'cod_pers': '2392115', 'cod_enti': '210', 'fec_venc': '04/02/2022'},
 '211': {'cod_pers': '2086638', 'cod_enti': '211', 'fec_venc': '31/03/2022'},
 '212': {'cod_pers': '2086638', 'cod_enti': '212', 'fec_venc': '03/13/2022'}}
Map after join
{'208': {'cod_pers': '2317422', 'cod_enti': '208', 'fec_venc': '04/12/2022', 'cod_market': '40'},
 '210': {'cod_pers': '2392115', 'cod_enti': '210', 'fec_venc': '04/02/2022', 'cod_market': '16'},
 '211': {'cod_pers': '2086638', 'cod_enti': '211', 'fec_venc': '31/03/2022', 'cod_market': '40'},
 '212': {'cod_pers': '2086638', 'cod_enti': '212', 'fec_venc': '03/13/2022', 'cod_market': '50'}}
```

From there, you can extract the rows into a normal list and do all your filtering by key-value, or however else.
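
As a rough sketch of that last step, the snippet below repeats the same join and then filters the resulting list. The helper name `join_by_cod_enti`, the choice to filter on `cod_market == "40"`, and the inlined sample CSV text (copied from the output above, minus the unmatched 209 row) are my assumptions for illustration, not part of your files:

```python
import csv
import io

# Sample rows copied from the output shown above; inlined via io.StringIO
# so the sketch runs without data.csv and enti.csv on disk.
DATA_CSV = """cod_pers,cod_enti,fec_venc
2317422,208,04/12/2022
2392115,210,04/02/2022
2086638,211,31/03/2022
2086638,212,03/13/2022
"""
ENTI_CSV = """cod_enti,cod_market
208,40
210,16
211,40
212,50
"""


def join_by_cod_enti(data_file, enti_file):
    """Hypothetical helper: same join as above, returned as a list of dicts."""
    cod_enti_row_map = {}
    for row in csv.DictReader(data_file, skipinitialspace=True):
        cod_enti_row_map[row["cod_enti"]] = row
    for row in csv.DictReader(enti_file):
        if row["cod_enti"] in cod_enti_row_map:
            cod_enti_row_map[row["cod_enti"]].update(row)
    # Extract the joined rows into a normal list
    return list(cod_enti_row_map.values())


rows = join_by_cod_enti(io.StringIO(DATA_CSV), io.StringIO(ENTI_CSV))

# Filter by key-value. row.get() is used because a data.csv row with no
# match in enti.csv would have no cod_market key at all.
market_40 = [row for row in rows if row.get("cod_market") == "40"]
for row in market_40:
    print(row["cod_pers"], row["cod_enti"], row["fec_venc"])
```

With real files you would pass the open file objects instead of the `io.StringIO` wrappers. One thing to keep in mind: because the map is keyed by cod_enti, only the last data.csv row per code survives, so this only fits if each cod_enti appears once in data.csv.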

Answer