I have the following function to keep only ASCII characters:
import re def remove_special_chars(txt): txt = re.sub(r'[^x00-x7F]+', '', txt) return txt
But now I also want to keep the copyright symbol (©). What should I add to the pattern?
Advertisement
Answer
Add the copyright symbol’s hex xA9
(source) to your match group:
txt = re.sub(r'[^x00-x7FxA9]+', '', txt)