I wrote a function to split the sample line below and remove the standalone numeric values (such as 123). However, it also strips the trailing digits from other tokens, which I need to keep. I also can’t figure out how to remove the “0.0” token.
ABC/0.0/123/TT1/1TT//
import re

# Sample input taken from the line shown above.
lines = ["ABC/0.0/123/TT1/1TT//"]
cleaned_data = []

def split_lines(lines, delimiter, remove=r'[0-9]+$'):
    for line in lines:
        tokens = line.split(delimiter)
        # Strip whatever the pattern matches from each token.
        tokens = [re.sub(remove, "", token) for token in tokens]
        # Drop tokens that are empty after the substitution.
        clean_list = list(filter(lambda e: e.strip(), tokens))
        cleaned_data.append(clean_list)
        print(clean_list)

split_lines(lines, "/")
The current output is below. Notice the leftover “0.” and the “TT” that is missing its trailing 1.
['ABC', '0.', 'TT', '1TT']
Answer
Try including the start of line anchor (^) as well.
import re

# Sample input taken from the question.
lines = ["ABC/0.0/123/TT1/1TT//"]
cleaned_data = []

def split_lines(lines, delimiter, remove=r'^[0-9.]+$'):
    for line in lines:
        tokens = line.split(delimiter)
        # Blank out only tokens made up entirely of digits and periods.
        tokens = [re.sub(remove, "", token) for token in tokens]
        # Drop tokens that are empty after the substitution.
        clean_list = list(filter(lambda e: e.strip(), tokens))
        cleaned_data.append(clean_list)
        print(clean_list)

split_lines(lines, "/")
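I simply changed the default value of the remove parameter to ‘^[0-9.]+$’. With both anchors in place, the pattern matches only when the entire token consists of digits and periods, so “123” and “0.0” are removed while “TT1” and “1TT” are left untouched.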
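To see the difference between the two patterns side by side, here is a minimal sketch. It is a variation on the function above, not the original code: it returns the cleaned lists instead of appending to a global and printing, and the `sample` name is only used for illustration.

```python
import re

def split_lines(lines, delimiter, remove=r'^[0-9.]+$'):
    """Split each line and drop tokens that the pattern blanks out.

    Variation for illustration: returns the result instead of
    appending to a global list and printing.
    """
    cleaned_data = []
    for line in lines:
        tokens = [re.sub(remove, "", token) for token in line.split(delimiter)]
        # Keep only tokens that are non-empty after the substitution.
        cleaned_data.append([token for token in tokens if token.strip()])
    return cleaned_data

# Hypothetical variable name; the value is the sample line from the question.
sample = ["ABC/0.0/123/TT1/1TT//"]

# Unanchored pattern from the question: strips trailing digits from any token.
print(split_lines(sample, "/", remove=r'[0-9]+$'))  # [['ABC', '0.', 'TT', '1TT']]

# Anchored pattern from the answer: removes only all-numeric tokens.
print(split_lines(sample, "/"))                     # [['ABC', 'TT1', '1TT']]
```

Note that ‘^[0-9.]+$’ also matches tokens such as “1.2.3” or a lone “.”, which may or may not be what you want for your data.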