I have a column of strings with varying numbers of characters. Most rows have the following number of characters:
    xx.xx.xxxx xx-xx-xx
but there are also rows with a different number of characters, for instance:
    xxx.xxx.xxxx
    xxxx
    xxxxxxxxxxxxxxx
I would like to replace the values in rows whose number of characters differs from that of xx.xx.xxxx xx-xx-xx with a null value (e.g. NA).
My approach would be to calculate the length of xx.xx.xxxx xx-xx-xx and then filter the rows that have a different number of characters: df[df['Char']!=len('xx.xx.xxxx xx-xx-xx')]. But I would also need to replace the values in those rows.
Can you please tell me how to do it?
My column looks like
    Char
    xx.xx.xxxx xx-xx-xx
    xxx.xxx.xxxx
    xxxx
    xxxxxxxxxxxxxxx
    xx.xx.xxxx xx-xx-xx
    xx.xx.xxxx xx-xx-xx
    xx.xx.xxxx xx-xx-xx
    xx.xx.xxxx xx-xx-xx
and my expected output would be
    Char
    xx.xx.xxxx xx-xx-xx
    NA
    NA
    NA
    xx.xx.xxxx xx-xx-xx
    xx.xx.xxxx xx-xx-xx
    xx.xx.xxxx xx-xx-xx
    xx.xx.xxxx xx-xx-xx
Answer
Try with loc, using a boolean mask built from str.len() and assigning np.nan to the rows it selects:
    import numpy as np
    df.loc[df['Char'].str.len() != len('xx.xx.xxxx xx-xx-xx'), 'Char'] = np.nan
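For reference, here is a minimal, self-contained sketch of this approach on the sample column from the question. The names expected_len, mask and alt are only illustrative, and the Series.where line is an alternative I am assuming gives the same result without modifying df in place.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the column shown in the question
df = pd.DataFrame({
    "Char": [
        "xx.xx.xxxx xx-xx-xx",
        "xxx.xxx.xxxx",
        "xxxx",
        "xxxxxxxxxxxxxxx",
        "xx.xx.xxxx xx-xx-xx",
        "xx.xx.xxxx xx-xx-xx",
        "xx.xx.xxxx xx-xx-xx",
        "xx.xx.xxxx xx-xx-xx",
    ]
})

# Length of the reference format
expected_len = len("xx.xx.xxxx xx-xx-xx")

# Boolean mask: True where the string length differs from the reference
mask = df["Char"].str.len() != expected_len

# Alternative (illustrative): keep matching values, NaN elsewhere, as a new Series
alt = df["Char"].where(df["Char"].str.len() == expected_len)

# In-place replacement with loc, as in the answer
df.loc[mask, "Char"] = np.nan

print(df)
```

Note that the comparison in the question, df['Char'] != len(...), compares each string with an integer, so it is true for every row; going through .str.len() compares the lengths instead.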