Pandas Multiindex to CSV without duplicate Index

Question

For the life of me I can't figure out how to remove the duplicate index items created when writing a multiindex dataframe to CSV. While there is this answer out there, it doesn't apply to me per se because my second level has all different values. This is a chunk of the dataframe I have, it just goes on for

Accepted Answer

It seems pandas does not provide directly such functionality, perhaps to ensure that the generated csv files can be read back, as mentioned in the comments above. I faced with the same problem when generating a table for report in rst format. Finally, I decided to process the generated csv by an external script. The script replaces the repeating values in columns with spaces, thus achieving desired result. Note: this script assumes commas as csv separators.# Clean csv with multiindicesfrom sys import argv# Positional command line arguments:#   1: original csv file#   2: number of columns to check (optional)with open(argv[1], 'r') as f:  if len(argv) > 2:    n = int(argv[2])    i0 = ('', ) * n  else:    n = None    i0 = None  for l in f:    if n is None:        # define n from the 1-st line        i1 = l.split(',')        n = len(i1)        i0 = ('', ) * n    i1 = l.split(',', n)    nl = ()    for s0, s1 in zip(i0, i1):        if s0 == s1:            e = ' ' * len(s0)        else:            e = s1        nl += (e, )    nl += tuple(i1[n:])    print ','.join(nl),    i0 = i1[:n]

Advertisement

Answer