I have an info.txt file that looks like this:
B 19960331 00100000 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960430 00099100 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960531 00098500 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19971000 20 31
And when I use pandas to read it:
import pandas as pd
import numpy as np

df = pd.read_csv(r'C:\Users\Petter\Desktop\info.txt', sep=r"\s", header=None, dtype=str, engine="python")
df
the error is:
ParserError: Expected 10 fields in line 153, saw 14. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.
Is there any way to automatically pad the rows that have fewer columns than the others? The output should look like this:
   0         1         2               3               4               5         6               7               8               9
0  B  19960331  00100000  00000000000000  00000000000000  00000000000000  00000000  00000000000000  00000000000000  00000000000000
1  B  19960430  00099100  00000000000000  00000000000000  00000000000000  00000000  00000000000000  00000000000000  00000000000000
2  B  19971000        20              31            None            None            None      None            None            None            None
In other words, every missing column should be filled with None.
Answer
This works on an in-memory copy of the data, and should(?) behave the same as reading the file from disk:
import pandas as pd
import io

# In-memory stand-in for info.txt
my_file = io.StringIO("""B 19960331 00100000 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960430 00099100 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960531 00098500 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19971000 20 31""")

# \s+ treats any run of whitespace as a single delimiter
df = pd.read_csv(my_file, sep=r"\s+", header=None)
output:
   0         1       2   3    4    5    6    7    8    9
0  B  19960331  100000   0  0.0  0.0  0.0  0.0  0.0  0.0
1  B  19960430   99100   0  0.0  0.0  0.0  0.0  0.0  0.0
2  B  19960531   98500   0  0.0  0.0  0.0  0.0  0.0  0.0
3  B  19971000      20  31  NaN  NaN  NaN  NaN  NaN  NaN
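Two things in that output differ from what the question asked for: the leading zeros are lost because the columns are parsed as numbers, and the gaps are NaN rather than None. The error in the question also reports a line with 14 fields, which is more than the first row has, and short-row padding alone does not cover that case. The following is a minimal sketch of one way to handle all three, assuming the widest row can be found with a quick first pass over the text. The 14-field row in the sample is hypothetical, added only to reproduce the situation in the error message; the real line 153 is not shown in the question.

```python
import io
import pandas as pd

# Sample data: rows from the question plus one HYPOTHETICAL 14-field row,
# included only to mimic "Expected 10 fields ... saw 14".
raw = """B 19960331 00100000 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960430 00099100 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19971000 20 31
C 1 2 3 4 5 6 7 8 9 10 11 12 13
"""

# First pass: count whitespace-separated fields on each line and keep the maximum.
max_fields = max(len(line.split()) for line in raw.splitlines())

df = pd.read_csv(
    io.StringIO(raw),
    sep=r"\s+",               # any run of whitespace is one delimiter
    header=None,
    names=range(max_fields),  # declare enough columns for the widest row
    dtype=str,                # keep values as text so leading zeros survive
)

# Short rows are padded with NaN; on object columns this should swap NaN for None.
df = df.astype(object).where(df.notna(), None)
print(df)
```

For a file on disk, the same first pass can be done by opening the file and iterating over its lines before calling `pd.read_csv` with the path. If None is not strictly required, the last step can be dropped and the NaN padding left in place.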