I'm trying to use Python and regex to get the last set of integers in a filename (string). The method below does what I need, but I also want to return the inverse, i.e. the remaining parts of the string that the regex did not match. How can I do that?
Here is the regex: ([0-9]+|#+)(?!.*([0-9]+|#+))

    import re

    values = [
        'image.0001',
        'image###',
        '###image###',
        'image001',
        'image_001',
        '001',
        '0001.image',
        '001image',
        '001_image',
        'image',
        '01_image01',
        '03_image01',
    ]

    pattern = '([0-9]+|#+|@+)'
    regex = '{0}(?!.*{0})'.format(pattern)

    for v in values:
        result = re.search(regex, v)
        if result:
            print(result.groups())

Currently it returns something like ('01', None).
I'd like it to return something like ('image', '0001').
Updated
Optionally, is there a way to split the strings by groups of numbers? For example:

    'image.0001' > ['image.', '0001']
    'image###' > ['image', '###']
    '###image###' > ['###', 'image', '###']
    'image001' > ['image', '001']
    'image_001' > ['image_', '001']
    '001' > ['001']
    '0001.image' > ['0001', '.image']
    '001image' > ['001', 'image']
    '001_image' > ['001', '_image']
    'image' > ['image']
    '01_image01' > ['01', '_image', '01']
    '03_image01' > ['03', '_image', '01']

Answer
EDIT:
Use re.findall:

    re.findall(r'\d+|#+|@+|[^#@\d]+', v)

See proof.
Explanation

    --------------------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times (matching
                               the most amount possible))
    --------------------------------------------------------------------------------
     |                        OR
    --------------------------------------------------------------------------------
      #+                       '#' (1 or more times (matching the most
                               amount possible))
    --------------------------------------------------------------------------------
     |                        OR
    --------------------------------------------------------------------------------
      @+                       '@' (1 or more times (matching the most
                               amount possible))
    --------------------------------------------------------------------------------
     |                        OR
    --------------------------------------------------------------------------------
      [^#@\d]+                 any character except: '#', '@', digits (0-
                               9) (1 or more times (matching the most
                               amount possible))

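As a minimal sketch of the findall approach above, the snippet below applies the same pattern to a few of the sample names from the question. The names TOKEN_RE and split_tokens are hypothetical, chosen only for this illustration.

```python
import re

# Pattern from the answer: a run of digits, a run of '#', a run of '@',
# or a run of anything that is none of those.
TOKEN_RE = re.compile(r'\d+|#+|@+|[^#@\d]+')


def split_tokens(name):
    """Split a name into consecutive placeholder and non-placeholder runs.

    Hypothetical helper for illustration; it simply wraps findall.
    """
    return TOKEN_RE.findall(name)


for v in ['image.0001', '###image###', '001', 'image', '01_image01']:
    print(v, '>', split_tokens(v))
```

Because the four alternatives together cover every character, joining the returned tokens gives back the original string. Note that adjacent runs of different kinds, such as '12##', come back as separate tokens.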
ORIGINAL:
Use re.split and add a capturing group so that the captured part is kept in the result:

    import re

    values = [
        'image.0001',
        'image###',
        '###image###',
        'image001',
        'image_001',
        '001',
        '0001.image',
        '001image',
        '001_image',
        'image',
        '01_image01',
        '03_image01',
    ]

    pattern = '[0-9]+|#+|@+'
    regex = re.compile(r'({0})(?!.*(?:{0}))'.format(pattern))
    for v in values:
        print(regex.split(v))

See Python proof
Results:

    ['image.', '0001', '']
    ['image', '###', '']
    ['###image', '###', '']
    ['image', '001', '']
    ['image_', '001', '']
    ['', '001', '']
    ['', '0001', '.image']
    ['', '001', 'image']
    ['', '001', '_image']
    ['image']
    ['01_image', '01', '']
    ['03_image', '01', '']

See regex proof.
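The split results contain empty strings whenever the matched run sits at the start or end of the name. If those are unwanted, one option is to filter them out. The sketch below assumes that empty parts carry no meaning for the caller; split_on_last_run and LAST_RUN_RE are hypothetical names used only here.

```python
import re

pattern = '[0-9]+|#+|@+'
# Same regex as in the answer: a captured run that is not followed,
# anywhere later in the string, by another run.
LAST_RUN_RE = re.compile(r'({0})(?!.*(?:{0}))'.format(pattern))


def split_on_last_run(name):
    """Split around the last run and drop empty strings (illustrative helper)."""
    return [part for part in LAST_RUN_RE.split(name) if part]


for v in ['image.0001', '001', '0001.image', 'image', '01_image01']:
    print(v, '>', split_on_last_run(v))
```

Filtering loses the information about whether the run was at the start or the end, so keep the unfiltered list if the position matters.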
Explanation

    --------------------------------------------------------------------------------
      (                        group and capture to \1:
    --------------------------------------------------------------------------------
        [0-9]+                   any character of: '0' to '9' (1 or more
                                 times (matching the most amount
                                 possible))
    --------------------------------------------------------------------------------
       |                        OR
    --------------------------------------------------------------------------------
        #+                       '#' (1 or more times (matching the most
                                 amount possible))
    --------------------------------------------------------------------------------
       |                        OR
    --------------------------------------------------------------------------------
        @+                       '@' (1 or more times (matching the most
                                 amount possible))
    --------------------------------------------------------------------------------
      )                        end of \1
    --------------------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
    --------------------------------------------------------------------------------
        .*                       any character except \n (0 or more times
                                 (matching the most amount possible))
    --------------------------------------------------------------------------------
        (?:                      group, but do not capture:
    --------------------------------------------------------------------------------
          [0-9]+                   any character of: '0' to '9' (1 or
                                   more times (matching the most amount
                                   possible))
    --------------------------------------------------------------------------------
         |                        OR
    --------------------------------------------------------------------------------
          #+                       '#' (1 or more times (matching the
                                   most amount possible))
    --------------------------------------------------------------------------------
         |                        OR
    --------------------------------------------------------------------------------
          @+                       '@' (1 or more times (matching the
                                   most amount possible))
    --------------------------------------------------------------------------------
        )                        end of grouping
    --------------------------------------------------------------------------------
      )                        end of look-ahead

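The explanation above also accounts for the None in the question's output: there the second group sits inside the negative look-ahead, and a negative look-ahead only succeeds when its contents fail to match, so that group is never filled. Making it non-capturing, as in the answer, removes it from groups(). The sketch below shows this, and shows one possible way to get the "(rest, last run)" pair the question asks for by using the match position. The helper last_run_and_rest is a hypothetical name, and returning None when no run is found is an assumption about the desired behaviour.

```python
import re

pattern = '[0-9]+|#+|@+'
LAST_RUN_RE = re.compile(r'({0})(?!.*(?:{0}))'.format(pattern))

# The question's original regex: group 2 lives inside the negative look-ahead.
QUESTION_RE = re.compile(r'([0-9]+|#+)(?!.*([0-9]+|#+))')


def last_run_and_rest(name):
    """Return (rest_of_name, last_run), or None if the name has no run.

    Illustrative helper: the 'rest' is everything outside the matched span.
    """
    m = LAST_RUN_RE.search(name)
    if m is None:
        return None
    rest = name[:m.start()] + name[m.end():]
    return rest, m.group(1)


print(QUESTION_RE.search('03_image01').groups())  # second group is unset
print(LAST_RUN_RE.search('03_image01').groups())  # only the captured run
for v in ['image.0001', '0001.image', '01_image01', 'image']:
    print(v, '>', last_run_and_rest(v))
```

Unlike re.split, this joins the text before and after the run into a single string, which fits the ('image', '0001') shape requested in the question but discards where the run was located.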