I am having a string in which some binary numbers are mentioned. I want to count number of occurrence of given pattern, but I want set my pattern above 7 digits of character, so the result should show only more than 7 characters. it means how I can set my pattern selection, so it should count only 7 digits and above (pattern = r”(0+1+0+1+0+1+)” {<7} ). please note: it should select 7 digits and above, not just fixed 7 digits.
JavaScript
x
16
16
1
import re
2
from collections import Counter
3
4
5
6
pattern = r"0+1+0+1+0+1+"
7
8
test_str = '01010100110011001100011100011110000101101110100001101011000111011001010011001001001101000011'
9
'00110011001100110011010101001100110001111110010100100111010110001001100010011010110011'
10
11
cnt = Counter(re.findall(pattern, test_str))
12
print(cnt.most_common())
13
14
# result [('010101', 2), ('00110001110001111', 1), ('011101000011', 1), ('0111011001', 1), ('010010011', 1), ('001100110011', 1), ('00011111100101', 1), ('010110001', 1)]
15
16
result should be show only more than 7 character it no supposed to show (‘010101’, 2)
Advertisement
Answer
The simplest solution is to filter the list of regex matches.
JavaScript
1
11
11
1
import re
2
from collections import Counter
3
4
pattern = r"0+1+0+1+0+1+"
5
6
test_str = '01010100110011001100011100011110000101101110100001101011000111011001010011001001001101000011'
7
'00110011001100110011010101001100110001111110010100100111010110001001100010011010110011'
8
9
cnt = Counter([p for p in re.findall(pattern, test_str) if len(p) > 6])
10
print(cnt.most_common())
11
Output:
JavaScript
1
2
1
[('001100110011', 2), ('000111000111100001', 1), ('011011101', 1), ('00001101011', 1), ('000111011001', 1), ('010011001', 1), ('001001101', 1), ('00001100110011', 1), ('00110011000111111', 1), ('00101001', 1), ('0011101011', 1), ('000100110001', 1), ('001101011', 1)]
2