The code below looks up a dictionary and replaces each matching string in a sentence with the value stored under that key.
```python
import re

d = {"lh": "left hand"}
sentence = "lh l.h. lh. -lh- l.h .lh plh phli lhp 1lh lh1"
pattern_replace = r'(?<!(\w|\d))(\.)?({})(\.)?(?!(\w|\d))'.format('|'.join(sorted(re.escape(k) for k in d)))
sentence = re.sub(pattern_replace, lambda m: d.get(m.group(0)), sentence, flags=re.IGNORECASE)
sentence
```
Can someone help me understand why my code omits certain words? It removes "lh" when it is preceded or followed by a ".", i.e., "lh." and ".lh". How can I overcome this?
I get the output: left hand l.h. -left hand- l.h plh phli lhp 1lh lh1
Answer
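Because the dictionary lookup needs capture group 3, m.group(3), instead of the whole match, m.group(0). For "lh." and ".lh" the whole match includes the dot, so d.get() finds no key and the match is dropped.
Note that \w also matches \d, so the alternation (\w|\d) can be reduced to \w.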
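As a quick check of that remark, this small sketch confirms that a digit matches \w and that dropping the \d alternative leaves the matches unchanged. The sample text and variable names are made up for illustration.

```python
import re

# A digit is already a word character, so \w covers \d.
digit_is_word_char = re.fullmatch(r"\w", "7") is not None

text = "lh 1lh lh1 _lh lh"  # hypothetical sample text
old = r"(?<!(\w|\d))lh(?!(\w|\d))"
new = r"(?<!\w)lh(?!\w)"

# Compare where each pattern matches in the sample text.
spans_old = [m.span() for m in re.finditer(old, text)]
spans_new = [m.span() for m in re.finditer(new, text)]

print(digit_is_word_char)      # True
print(spans_old == spans_new)  # True
```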
Now your pattern looks like:
```
(?<!(\w|\d))(\.)?(lh)(\.)?(?!(\w|\d))
```
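To see why the lookup fails, it can help to print the whole match next to group 3 for every match of this pattern. This is only an illustrative sketch; the variable names `matches` and `missing` are not from the original post.

```python
import re

d = {"lh": "left hand"}
sentence = "lh l.h. lh. -lh- l.h .lh plh phli lhp 1lh lh1"
pattern = r'(?<!(\w|\d))(\.)?(lh)(\.)?(?!(\w|\d))'

# Collect (whole match, group 3) for every match in the sentence.
matches = [(m.group(0), m.group(3))
           for m in re.finditer(pattern, sentence, flags=re.IGNORECASE)]

for whole, key in matches:
    # d.get(whole) is None whenever the whole match carries a dot.
    print(repr(whole), repr(key), d.get(whole), d.get(key))

# Whole matches that are not dictionary keys.
missing = [whole for whole, _ in matches if whole not in d]
print(missing)  # ['lh.', '.lh']
```

Group 3 is "lh" in every case, while the whole match is "lh." or ".lh" for exactly the two occurrences that disappear from the output.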
But you can restructure the pattern so that the dict key ends up in group 1, m.group(1):
```
(?<!\w)\.?(lh)\.?(?!\w)
           ^^
           dict key
```
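Putting it together, one possible version of the replacement could look like the sketch below. It assumes the dictionary keys are lowercase (hence the `.lower()` call, since `re.IGNORECASE` can match other casings) and it falls back to the unchanged match if a key is missing. Both of those, and the helper name `expand`, are additions for illustration and not part of the question's code.

```python
import re

d = {"lh": "left hand"}
sentence = "lh l.h. lh. -lh- l.h .lh plh phli lhp 1lh lh1"

# Build the alternation from the escaped dictionary keys.
keys = "|".join(sorted(re.escape(k) for k in d))
pattern = r"(?<!\w)\.?({})\.?(?!\w)".format(keys)

def expand(m):
    # Group 1 holds only the key; the optional dots stay outside the group.
    # Assumption: keys are stored in lowercase.
    return d.get(m.group(1).lower(), m.group(0))

result = re.sub(pattern, expand, sentence, flags=re.IGNORECASE)
print(result)
# left hand l.h. left hand -left hand- l.h left hand plh phli lhp 1lh lh1
```

Note that the optional dots are part of the match, so "lh." and ".lh" are both replaced by "left hand" without the dot.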