
Python regular expression: how to deal with multiple backslashes

I’m dealing with text data and having trouble removing multiple backslashes. I found that .sub works quite well, so I wrote the code below to remove a backslash followed by r, n, t, f or v.

JavaScript

However, the code above can’t deal with the string below.

JavaScript

So I wrote this instead:

JavaScript

But the result is not what I expected: it erases every v, f, n and so on, and I don’t know why.

I found out that using .replace("\\r", " ") works! However, this way I would have to repeat it for every sequence, like:

JavaScript

I’m pretty sure there’s a better way..


Answer

You can’t define a sequence of characters inside a character class. A character class matches a single character. So, [\\t\\n\\r\\f\\v] is equal to [\\tnrfv] and matches either a backslash or one of the letters t, n, r, f or v.
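To see the effect of the character class described above, here is a minimal sketch. The sample string and the "_" replacement are made up for illustration and are not from the question.

```python
import re

# The regex engine sees [\\t\\n\\r\\f\\v]: a class holding a backslash
# plus the letters t, n, r, f and v (repeated backslashes add nothing).
char_class = re.compile(r'[\\t\\n\\r\\f\\v]')

# Raw string: a literal backslash followed by "n", not a real newline.
sample = r'five\nfruits'

# Each backslash, t, n, r, f or v is replaced one character at a time,
# so ordinary letters in the words are erased as well.
result = char_class.sub('_', sample)
print(result)  # _i_e____ui_s

# The shorter class behaves the same way.
shorter = re.sub(r'[\\tnrfv]', '_', sample)
print(shorter == result)  # True
```

This is likely why the v, f and n letters disappeared in the question: the class matched them individually, whether or not a backslash came before them.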

To match a sequence of chars, you need to write them one after another. To match the two-char string \n (a backslash followed by n), you need the \\n pattern (r'\\n'). If you need to match either the \n or the \v text, you can use \\n|\\v, (?:\\n|\\v) or, better, \\[nv].
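As a quick check that the three alternatives above behave the same, here is a small sketch. The sample text is hypothetical and contains only literal backslash sequences, no real control characters.

```python
import re

# Raw string: backslash + n and backslash + v as two-char texts.
text = r'line one\nline two\vend'

# Three ways to match the two-char sequences described above.
patterns = [r'\\n|\\v', r'(?:\\n|\\v)', r'\\[nv]']

results = [re.sub(p, ' ', text) for p in patterns]
print(results[0])  # line one line two end

# All three patterns give the same output, and the plain letters
# n and v inside the words are left alone.
print(len(set(results)) == 1)  # True
```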

So, if you want to match a backslash followed by a letter from the rtnfv set, or an actual "\t" (TAB), "\n" (line feed), "\r" (carriage return), "\f" (form feed) or "\v" (vertical tab) char, you can use

JavaScript

The last one matches one or more consecutive occurrences of the patterns that may be mixed with each other.
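The code from the original answer did not survive on this page, so the following is only a sketch of patterns that fit the description above, not necessarily the exact ones the answer used. The names single and repeated and the sample string are assumptions made for illustration.

```python
import re

# A backslash followed by one of r, t, n, f, v,
# or a real whitespace control character.
single = re.compile(r'\\[rtnfv]|[\r\n\t\f\v]')

# Same alternatives, but a run of one or more becomes a single match.
repeated = re.compile(r'(?:\\[rtnfv]|[\r\n\t\f\v])+')

# Mixes literal two-char sequences (\\r, \\n, \\t, \\v, \\f)
# with a real newline and a real tab.
sample = 'first\\r\\n\\tsecond\n\tthird\\v\\ffourth'

# One space per match: runs leave several spaces behind.
print(single.sub(' ', sample))    # first   second  third  fourth

# One space per run: consecutive matches collapse together.
print(repeated.sub(' ', sample))  # first second third fourth
```

Letters such as the f, r and t in "first" stay intact because they are only matched when a backslash comes right before them.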
