I have a column of strings with varying numbers of characters. Most rows follow this pattern:
xx.xx.xxxx xx-xx-xx
but some rows have a different length, for instance
xxx.xxx.xxxx xxxx xxxxxxxxxxxxxxx
I would like to replace the values in rows whose number of characters differs from that of xx.xx.xxxx xx-xx-xx
with a null value (e.g. NA).
My approach would be to calculate the length of xx.xx.xxxx xx-xx-xx
and then filter the rows that have a different number of characters: df[df['Char']!=len('xx.xx.xxxx xx-xx-xx')]. However, I would also need to replace the values in those rows.
Can you please tell me how to do it?
My column looks like
Char
xx.xx.xxxx xx-xx-xx
xxx.xxx.xxxx xxxx xxxxxxxxxxxxxxx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
and my expected output would be
Char
xx.xx.xxxx xx-xx-xx
NA NA NA
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
xx.xx.xxxx xx-xx-xx
Answer
Try with loc, using a boolean mask on the string length:
import numpy as np
df.loc[df['Char'].str.len() != len('xx.xx.xxxx xx-xx-xx'), 'Char'] = np.nan
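For anyone who wants to try this end to end, here is a minimal sketch of the same idea on a small made-up frame. The sample rows and the names expected_len, bad and alt are assumptions for illustration, not part of the original question. It also shows Series.where as one possible non-mutating alternative, which keeps values where the condition holds and puts NaN elsewhere.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the column described in the question
df = pd.DataFrame({
    "Char": [
        "xx.xx.xxxx xx-xx-xx",
        "xxx.xxx.xxxx xxxx xxxxxxxxxxxxxxx",
        "xx.xx.xxxx xx-xx-xx",
        "xx.xx.xxxx xx-xx-xx",
    ]
})

# Length of the reference pattern (19 characters, including the space)
expected_len = len("xx.xx.xxxx xx-xx-xx")

# Alternative: build a new Series, keeping values only where the length matches
alt = df["Char"].where(df["Char"].str.len() == expected_len)

# Boolean mask: True for rows whose string length differs from the pattern
bad = df["Char"].str.len() != expected_len

# Overwrite only the offending rows of the 'Char' column in place
df.loc[bad, "Char"] = np.nan
```

The loc version modifies df directly, while where returns a new Series and leaves the original untouched, so which one fits depends on whether you want to keep the raw values around.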