I have the following function to keep only ASCII characters:
import re def remove_special_chars(txt): txt = re.sub(r'[^x00-x7F]+', '', txt) return txt
But now I also want to keep the copyright symbol (Â©). What should I add to the pattern?
Add the copyright symbol’s hex
xA9 (source) to your match group:
txt = re.sub(r'[^x00-x7FxA9]+', '', txt)