Nice file parsing required

I need to process a file like this:

keyword,synonym,bidirectional
5487500j,54875,false
76x76,76 x 76,true
feuille,"papier,ramette",false
7843000j,78430,false

and I need to transform it into a dict:

{'5487500j':'54875', '76x76':'76 x 76','feuille':['papier','ramette'], '7843000j':'78430'}

I haven't found a fast and elegant way to deal with it.

Answer

Let me first specify what I have understood from your requirement.

  • Your input is a csv file with optionally quoted fields: fine, the csv module can parse it
  • The first field of each record will be used as a key in a dictionary
  • The third field is ignored
  • The second field will be the value in the dictionary. If it does not contain a comma, it is used as is; otherwise it is split on the commas and the value is the resulting list

You should always write down, in plain English (or whatever your first language is), a detailed specification of what you want to do before trying to code anything, because the coding part should only be a translation of the spec.

Here, the Python code (for my spec…) could be:

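import csv

with open(inputfile, newline='') as fd:  # inputfile: path to your csv file
    rd = csv.reader(fd)  # your file uses the default quoting and delimiter
    next(rd)             # skip header line
    result = {}
    for row in rd:
        # split the second field into a list only if it contains a comma
        result[row[0]] = row[1].split(',') if ',' in row[1] else row[1]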

In fact, a comprehension replacing the loop would be more Pythonic:

    result = {row[0]: row[1].split(',') if ',' in row[1] else row[1]
              for row in rd}
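
To check the result against the dict asked for in the question, here is a minimal self-contained sketch. It assumes the sample data from the question and uses io.StringIO in place of a real file; the names SAMPLE and parse_synonyms are hypothetical and only there for illustration.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for a real file.
SAMPLE = '''keyword,synonym,bidirectional
5487500j,54875,false
76x76,76 x 76,true
feuille,"papier,ramette",false
7843000j,78430,false
'''


def parse_synonyms(lines):
    """Hypothetical helper: build {keyword: synonym or list of synonyms}."""
    rd = csv.reader(lines)
    next(rd)  # skip the header line
    # The third field (bidirectional) is ignored, as in the spec above.
    return {row[0]: row[1].split(',') if ',' in row[1] else row[1]
            for row in rd}


result = parse_synonyms(io.StringIO(SAMPLE))
print(result)
```

With a real file, you would pass the file object opened with open(inputfile, newline='') instead of the StringIO. Note that this sketch assumes every record has at least two fields; a blank line in the file would make row[0] fail.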
User contributions licensed under: CC BY-SA