Replacing a literal ‘\x..’ string in Python

I have directories whose names contain literal ‘\x..’ sequences, such as ‘\x00’:

#ls
c\x00mb

I want to rename them without these sequences, because the files become unusable when I copy them to Windows. My Python script walks through those directories and detects the problematic characters like this:

if '\\x' in dir:  # dir is the name of the current directory

At first I thought I could solve this with Python's re module:

new_dir_name = re.sub('\x00', r'', dir)  # I am using \x00 as an example

But this didn’t work. Is there a way I can replace these characters using Python?

EDIT: to help identify the character: when I pipe ls to xxd, the ‘\’ character appears in the ASCII representation, and the hexadecimal column shows ‘5c’.


Answer

A plain string replace worked for me:

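dir = r'foo\x00bar'  # raw string: a literal backslash followed by x00
print(dir)
dir = dir.replace(r'\x00', '')  # replace returns a new string, so reassign it
print(dir)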

Output is:

foo\x00bar
foobar

str.replace(old, new[, count])

Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
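
As a small illustration of the optional count argument described above, here is a sketch; the sample name is made up for the example and does not come from the question.

```python
# Hypothetical sample name; in a raw string each \x00 is four literal characters.
name = r'a\x00b\x00c'

# Without a count, every occurrence is removed.
all_removed = name.replace(r'\x00', '')

# With a count of 1, only the first occurrence is removed.
first_removed = name.replace(r'\x00', '', 1)

print(all_removed)    # abc
print(first_removed)  # ab\x00c
```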

A regular expression could also work for the general case, but you’ll have to escape the backslash so that \x itself isn’t interpreted as a regular expression escape.
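
To make the escaping issue concrete, here is a minimal sketch that uses the directory name from the question. It likely explains why the original re.sub attempt had no effect: without a raw string and a doubled backslash, the pattern means a single NUL character rather than the four literal characters. The variable names are illustrative only.

```python
import re

# The name from the question: backslash, 'x', '0', '0' as four literal characters.
name = r'c\x00mb'

# '\x00' (not raw) is a single NUL character, which the name does not contain,
# so nothing is replaced.
unchanged = re.sub('\x00', '', name)

# r'\\x00' reaches the regex engine as \\x00: an escaped backslash, then "x00".
cleaned = re.sub(r'\\x00', '', name)

print(unchanged == name)  # True
print(cleaned)            # cmb
```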

For the general case of removing \x followed by two hexadecimal digits:

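import re
dir = r'foo\x1Dbar'
print(dir)
# \\ matches one literal backslash; a-f also accepts lowercase hex digits
dir = re.sub(r'\\x[0-9A-Fa-f]{2}', '', dir)  # re.sub returns a new string
print(dir)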

Output is:

foo\x1Dbar
foobar
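
To tie this back to the original goal of renaming the directories, the following is only a sketch under some assumptions: the helper names clean_name and rename_dirs and the constant HEX_ESCAPE are hypothetical, the directories live under a single root folder, and a name that would collide with an existing entry is simply skipped. It is worth trying on a copy of the data first.

```python
import os
import re

# Hypothetical pattern name: a literal backslash, 'x', then two hex digits.
HEX_ESCAPE = re.compile(r'\\x[0-9A-Fa-f]{2}')

def clean_name(name):
    """Return name with every literal backslash-x-hex-hex sequence removed."""
    return HEX_ESCAPE.sub('', name)

def rename_dirs(root):
    """Rename directories under root whose names contain such sequences."""
    # topdown=False visits children before parents, so renaming a parent
    # never invalidates a path that is still waiting to be processed.
    for parent, dirnames, _files in os.walk(root, topdown=False):
        for old in dirnames:
            new = clean_name(old)
            if not new or new == old:
                continue  # nothing to change, or the name would become empty
            src = os.path.join(parent, old)
            dst = os.path.join(parent, new)
            if os.path.exists(dst):
                continue  # assumption: skip rather than overwrite an existing entry
            os.rename(src, dst)
```

Calling rename_dirs('/path/to/root') (a placeholder path) would then rename c\x00mb to cmb, along with any nested directories that match.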
User contributions licensed under: CC BY-SA