
Wrong result with regular expressions

Any idea why the regular expression below cuts the ‘fl’ out of the sentence?

re.sub('[^a-zA-Z]', ' ', '\nFor a this river, the ﬂow becomes complicated in the ﬂoodplain')

This is the result I get:

'For a this river  the  ow becomes complicated in the  oodplain'


Answer

You’re replacing all non-alphabetical characters with whitespace.

In your code, the ‘fl’ is actually the ligature ‘ﬂ’ (U+FB02, LATIN SMALL LIGATURE FL): a single Unicode character that falls outside a-z and A-Z, so it is replaced with a space just like the punctuation.
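If you want to keep those letters, one possible fix (not part of the original answer, just a sketch) is to expand the ligature into plain ‘f’ + ‘l’ before applying the regex. The example below assumes NFKC normalization is acceptable for your text. It uses the standard-library unicodedata module, and the variable names are only illustrative.

```python
import re
import unicodedata

# The ligature is written as an escape so it is visible in the source.
text = '\nFor a this river, the \ufb02ow becomes complicated in the \ufb02oodplain'

# Confirm what the suspicious character really is.
print(unicodedata.name('\ufb02'))  # LATIN SMALL LIGATURE FL

# NFKC normalization replaces compatibility characters such as U+FB02
# with their plain equivalents ("f" followed by "l").
normalized = unicodedata.normalize('NFKC', text)

# Same substitution as in the question, now applied to the normalized text.
cleaned = re.sub('[^a-zA-Z]', ' ', normalized)
print(repr(cleaned))
# ' For a this river  the flow becomes complicated in the floodplain'
```

Note that NFKC also rewrites other compatibility characters (full-width letters, superscript digits, and so on), so check that this suits your data before applying it everywhere.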

User contributions licensed under: CC BY-SA