I am trying to extract the text between two words. The pattern below repeats throughout a text document, with variations between the start keyword and the end keyword. The document also contains paragraphs and other text before and after these patterns, which I don't want to extract. Can anyone help me with a regex that extracts all occurrences?
Start keyword: RIASWIX
End keyword: Sky Access
----Document Start-------

Paragraph*

RIASWIX.* ABCDEF1 NONE
WORKING: HELLO(READ)
BOOLEAN Access: SADGRE3, VJFKES3, JGJKEWW, IS4DWF44(A), DFEAWE2(G),
DW4444W, IHFK3MF3
BAZAAR Access: No resource with BAZAAR Access
GHAR Access: No resource with GHAR Access
WATER Access: ADMINDDD(A), GEDDE33
SKY None: No Resource with Sky Access

RIASWIX.@7483NFJ.* HFDFDF3 NONE
WORKING: BYE(READ)
BOOLEAN Access: GRREGGG, GREFEFF, GFGGGG, FDFDFDF(A), RERERE3(G),
GFFWEF44, FFRF44F
BAZAAR Access: No resource with BAZAAR Access
GHAR Access: No resource with GHAR Access
WATER Access: ADMINEWW(A), FFRFRGR
SKY None: No Resource with Sky Access

RIASWIX.@7483KXX.* HFDFDF3 NONE
WORKING: TATA(READ)
BOOLEAN Access: GRDSD33, FASDE, GFGGGG, RWERW33(A), NMUYHT4(G),
BAZAAR Access: XCDFEFE3, FREFE33R
GHAR Access: No resource with GHAR Access
WATER Access: DASDEFG(A), SJMFEIOE(P)
SKY None: No Resource with Sky Access

*Text

----Document End-------
Answer
Use the (?s) flag so that . also matches newline characters; for details, see regex-match-all-characters-between-two-strings.
import re

# str1 holds the document text. (?s) goes at the start of the pattern so that "." also matches newlines.
print(re.findall(r'(?s)RIASWIX(.*?)Sky Access', str1))
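As a self-contained sketch of the approach in this answer, the snippet below runs the pattern on a shortened stand-in for the document. The sample text, the names SAMPLE and BLOCK_RE, and the extract_blocks helper are assumptions made for illustration; they are not from the original post. It passes re.DOTALL as a flag argument instead of embedding (?s), which has the same effect and sidesteps the error that recent Python versions raise when an inline flag sits in the middle of a pattern.

```python
import re

# Shortened stand-in for the real document (illustrative data only).
SAMPLE = """----Document Start-------
Paragraph*

RIASWIX.* ABCDEF1 NONE
WORKING: HELLO(READ)
WATER Access: ADMINDDD(A), GEDDE33
SKY None: No Resource with Sky Access

RIASWIX.@7483NFJ.* HFDFDF3 NONE
WORKING: BYE(READ)
SKY None: No Resource with Sky Access

*Text
----Document End-------
"""

# Non-greedy (.*?) stops at the first "Sky Access" after each "RIASWIX";
# re.DOTALL lets "." cross line breaks.
BLOCK_RE = re.compile(r"RIASWIX(.*?)Sky Access", re.DOTALL)


def extract_blocks(text, include_keywords=False):
    """Return every span between the start and end keywords."""
    if include_keywords:
        # group(0) is the whole match, keywords included.
        return [m.group(0) for m in BLOCK_RE.finditer(text)]
    # group(1) is only the text between the keywords.
    return [m.group(1) for m in BLOCK_RE.finditer(text)]


if __name__ == "__main__":
    for block in extract_blocks(SAMPLE, include_keywords=True):
        print(block)
        print("-" * 20)
```

Whether the keywords themselves belong in the result depends on what you do with the blocks afterwards; the include_keywords switch is only there to show both options. Text outside the RIASWIX ... Sky Access spans, such as the leading paragraph, is never returned.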