How to automatically fill blank column with None in pandas

I have an info.txt file that looks like this:

B 19960331 00100000 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960430 00099100 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960531 00098500 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19971000 20 31

And when I use pandas to read it:

import pandas as pd
import numpy as np
df = pd.read_csv(r'C:\Users\Petter\Desktop\info.txt', sep=r"\s", header=None, dtype=str, engine="python")
df

the error is:

ParserError: Expected 10 fields in line 153, saw 14. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

Is there any way to automatically fill the rows that do not have the same number of columns? The output should look like this:

   0  1         2         3               4               5               6         7               8               9
0  B  19960331  00100000  00000000000000  00000000000000  00000000000000  00000000  00000000000000  00000000000000  00000000000000
1  B  19960430  00099100  00000000000000  00000000000000  00000000000000  00000000  00000000000000  00000000000000  00000000000000
2  B  19971000  20        31              None            None            None      None            None            None
 

In other words, every missing column should be filled with None.

Answer

This works. It reads an in-memory copy of the data, but reading the file from disk should behave the same way:

import pandas as pd
import io

my_file = io.StringIO("""B 19960331 00100000 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960430 00099100 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19960531 00098500 00000000000000 00000000000000 00000000000000 00000000 00000000000000 00000000000000 00000000000000
B 19971000 20 31""")

# r"\s+" splits on runs of whitespace; rows shorter than the first are padded with NaN
df = pd.read_csv(my_file, sep=r"\s+", header=None)
print(df)

output:

   0         1       2   3    4    5    6    7    8    9
0  B  19960331  100000   0  0.0  0.0  0.0  0.0  0.0  0.0
1  B  19960430   99100   0  0.0  0.0  0.0  0.0  0.0  0.0
2  B  19960531   98500   0  0.0  0.0  0.0  0.0  0.0  0.0
3  B  19971000      20  31  NaN  NaN  NaN  NaN  NaN  NaN
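
Note that the output above loses the leading zeros, and the error in the question mentions a line with more fields than the first one. A minimal sketch that tries to cover both, assuming the sample data below (which is made up for illustration) and that scanning the input once to find the widest row is acceptable:

```python
import io
import pandas as pd

# Hypothetical sample: row 3 is longer than the first row, row 4 is shorter.
raw = """B 19960331 00100000 00000000000000
B 19960430 00099100 00000000000000
B 19971000 20 31 7 8
B 19980101 5
"""

# pandas infers the column count from the first row, so a longer row later on
# raises ParserError. Passing explicit names for the widest row avoids that.
n_cols = max(len(line.split()) for line in raw.splitlines())

df = pd.read_csv(
    io.StringIO(raw),
    sep=r"\s+",                   # split on runs of whitespace
    header=None,
    names=list(range(n_cols)),    # one column label per field of the widest row
    dtype=str,                    # keep values such as "00100000" as text
)

# Replace missing values with None; this needs object dtype, and the exact
# behaviour of where(..., None) may vary between pandas versions.
df = df.astype(object).where(df.notna(), None)
print(df)
```

For a file on disk, the same idea should apply: open the file once to compute n_cols, then pass the path to pd.read_csv instead of the io.StringIO object.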
User contributions licensed under: CC BY-SA