I have a list of file directories that looks similar to this:
path/new/stuff/files/morefiles/A/file2.txt
path/new/stuff/files/morefiles/B/file7.txt
path/new/stuff/files/morefiles/A/file1.txt
path/new/stuff/files/morefiles/C/file5.txt
I am trying to find the leading part of the path that every entry in the list shares, and then remove that part from each entry.
The list can be any length, and in the example I would be trying to change the list into:
A/file2.txt
B/file7.txt
A/file1.txt
C/file5.txt
Methods like re.sub(r'.*I', 'I', filepath) and filepath.split('_', 1)[-1] can be used for the replacing, but I’m not sure how to find the common part of the file paths in the list.
Note: I am using Windows and Python 3.
Answer
The first part of the answer is here: Python: Determine prefix from a set of (similar) strings
Use os.path.commonprefix() to find the longest common leading part of the strings.
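As a minimal sketch of that call, assuming the paths are plain strings with forward slashes as in the question (the variable names paths and prefix are made up for illustration):

```python
import os.path

# Example data copied from the question
paths = [
    "path/new/stuff/files/morefiles/A/file2.txt",
    "path/new/stuff/files/morefiles/B/file7.txt",
    "path/new/stuff/files/morefiles/A/file1.txt",
    "path/new/stuff/files/morefiles/C/file5.txt",
]

# commonprefix compares the strings character by character,
# not directory by directory
prefix = os.path.commonprefix(paths)
print(prefix)  # path/new/stuff/files/morefiles/
```

Because the comparison is purely character based, the result is not guaranteed to end on a directory boundary for other inputs.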
The code from that answer for finding the part that is the same in every item of the list is:
# Return the longest prefix of all list elements.
def commonprefix(m):
    "Given a list of pathnames, returns the longest common leading component"
    if not m:
        return ''
    s1 = min(m)
    s2 = max(m)
    for i, c in enumerate(s1):
        if c != s2[i]:
            return s1[:i]
    return s1
Now all you have to do is use slicing to remove the resulting string from each item in the list.
This results in:
# Remove the longest common prefix from all list elements.
def strip_commonprefix(m):
    "Given a list of pathnames, returns them without the longest common leading component"
    if not m:
        return []
    s1 = min(m)
    s2 = max(m)
    # If the loop never finds a mismatch, all of s1 is the common prefix
    ans = s1
    for i, c in enumerate(s1):
        if c != s2[i]:
            ans = s1[:i]
            break
    # Every item starts with ans, so slicing off len(ans) characters removes it
    return [each[len(ans):] for each in m]
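One caveat: a character-wise prefix can stop in the middle of a directory name, for example when the sibling folders are AB and AC instead of A and B. The sketch below is one possible way to handle that, not part of the linked answer. It trims the prefix back to the last separator before slicing. The function name strip_common_dirs and its sep parameter are hypothetical, and it assumes every path uses the same separator (pass sep="\\" for backslash-style Windows paths).

```python
import os.path


def strip_common_dirs(paths, sep="/"):
    """Return paths with their shared leading directories removed."""
    if not paths:
        return []
    # Character-wise common prefix of all the strings
    prefix = os.path.commonprefix(paths)
    # Cut back to just after the last separator so a directory
    # name is never split; rfind returns -1 when sep is absent,
    # which makes cut 0 and leaves the paths unchanged
    cut = prefix.rfind(sep) + 1
    return [p[cut:] for p in paths]


example = [
    "path/new/stuff/files/morefiles/A/file2.txt",
    "path/new/stuff/files/morefiles/B/file7.txt",
    "path/new/stuff/files/morefiles/A/file1.txt",
    "path/new/stuff/files/morefiles/C/file5.txt",
]
print(strip_common_dirs(example))
# ['A/file2.txt', 'B/file7.txt', 'A/file1.txt', 'C/file5.txt']
```

Returning a new list rather than modifying the input in place keeps the original paths available to the caller.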