Replace rows with different number of characters

I have a column containing strings with different numbers of characters. Most rows have strings with the following number of characters:

xx.xx.xxxx xx-xx-xx

but there are also rows with a different number of characters, for instance

xxx.xxx.xxxx
xxxx
xxxxxxxxxxxxxxx

I would like to replace the values in rows whose number of characters differs from that of xx.xx.xxxx xx-xx-xx with a null value (e.g. NA). My approach would be to calculate the length of xx.xx.xxxx xx-xx-xx and then filter the rows that have a different number of characters: df[df['Char']!=len('xx.xx.xxxx xx-xx-xx')]. But I also need to replace the values in those rows. Can you please tell me how to do it?

My column looks like

Char 
xx.xx.xxxx xx-xx-xx
xxx.xxx.xxxx
xxxx
xxxxxxxxxxxxxxx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx

and my expected output would be

Char 
xx.xx.xxxx xx-xx-xx
NA
NA
NA
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx

Answer

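Use loc with a boolean mask that compares the length of each string, obtained with str.len(), to the expected length (this assumes numpy has been imported as np):

df.loc[df['Char'].str.len() != len('xx.xx.xxxx xx-xx-xx'), 'Char'] = np.nan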
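For reference, here is a minimal self-contained sketch of this approach. The sample values are copied from the question. The names values, expected_len and cleaned are made up for illustration, and the Series.where variant is an alternative that is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample values taken from the question
values = [
    "xx.xx.xxxx xx-xx-xx",
    "xxx.xxx.xxxx",
    "xxxx",
    "xxxxxxxxxxxxxxx",
    "xx.xx.xxxx xx-xx-xx",
]
df = pd.DataFrame({"Char": values})

# Length of the pattern that should be kept
expected_len = len("xx.xx.xxxx xx-xx-xx")

# Compare the length of each string (not the string itself) to the
# expected length, and overwrite the non-matching rows with NaN in place
df.loc[df["Char"].str.len() != expected_len, "Char"] = np.nan

# Alternative (assumption, not from the answer): Series.where keeps the
# values where the condition is True and fills the rest with NaN
cleaned = pd.Series(values, name="Char")
cleaned = cleaned.where(cleaned.str.len() == expected_len)

print(df)
```

The filter in the question compares the strings themselves to an integer, so it matches every row. Calling str.len() first is what makes the comparison meaningful.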
