I have directories that contains ‘x..’ characters such as ‘x00’ :
#ls cx00mb
and I want to rename them without these, since when I copy those files to windows they become unusable. So my python script is going through those directories and detecting the problematic characters the following way:
if '\x' in dir: # dir is the name of the current directory
First I thought I could get rid of this problem by using the re
module in python:
new_dir_name = re.sub('x00', r'', dir) # I am using x00 as an example
But this didn’t work. Is there a way I can replace thoses characters using python?
EDIT :
in order to understand the char, when I pipe ls
to xxd
the ” character appears in the ascii representation. In hexadecimal it shows ‘5c’
Advertisement
Answer
This string.replace worked for me:
dir = r'foox00bar' print dir dir.replace(r'x00', '') print dir
Output is:
foox00bar foobar
string.replace(s, old, new[, maxreplace])
Return a copy of string s with all occurrences of substring old replaced by new. If the optional argument maxreplace is given, the first maxreplace occurrences are replaced.
A regular expression could also work for the general case, but you’ll have to escape the backslash so that x
itself isn’t interpreted as a regular expression escape.
For the general case of removing x
followed by two hexadecimal digits:
import re dir = r'foox1Dbar' print dir re.sub(r'\x[0-9A-F]{2}', '', dir) print dir
Output is:
foox1Dbar foobar