I'm running the code below to clean text:
    import pandas as pd

    def not_regex(pattern):
        return r"((?!{}).)".format(pattern)

    tmp = pd.DataFrame(['No one has a European accent either @',
                        'That the kid reminds me of Kevin'])

    tmp[0].str.replace(not_regex('(\b[-/]\b|[a-zA-Z0-9])'), ' ')
It then emits a warning:
    <ipython-input-8-ef8a43f91dbd>:9: FutureWarning: The default value of regex will change from True to False in a future version.
      tmp[0].str.replace(not_regex('(\b[-/]\b|[a-zA-Z0-9])'), ' ')
Could you please explain the reason for this warning?
Answer
See the pandas 1.2.0 release notes:
The default value of regex for Series.str.replace() will change from True to False in a future release. In addition, single character regular expressions will not be treated as literal strings when regex=True is set (GH24804)
In other words, pass regex=True explicitly whenever the pattern is a regular expression:
    dframe['colname'] = dframe['colname'].str.replace(r'\D+', '', regex=True)
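Applied to the code in the question, a minimal sketch could look like the following. It assumes that `\b` is meant as a regex word boundary, so the pattern is written as a raw string (in a normal Python string literal, `'\b'` is a backspace character). The names `pattern`, `cleaned`, and `literal` are illustrative only and not part of the original code.

```python
import pandas as pd


def not_regex(pattern):
    # Match any single character at a position where `pattern` does not match.
    return r"((?!{}).)".format(pattern)


tmp = pd.DataFrame(['No one has a European accent either @',
                    'That the kid reminds me of Kevin'])

# Assumption: \b is intended as a word boundary, hence the raw string.
pattern = not_regex(r'(\b[-/]\b|[a-zA-Z0-9])')

# regex=True states the intent explicitly, so the result does not
# depend on which default the installed pandas version uses.
cleaned = tmp[0].str.replace(pattern, ' ', regex=True)
print(cleaned.tolist())

# For comparison, regex=False treats the pattern as plain text:
# the two-character sequence "\d" never occurs, so nothing changes.
literal = pd.Series(['a1b22']).str.replace(r'\d', '#', regex=False)
print(literal.tolist())
```

If the pattern is meant to be taken literally, passing `regex=False` is the equivalent way to state that intent and also avoids the warning.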