If I have an input like this:
input = 'AB. Hello word.' the output should be output = 'Hello word.'
Another example is:
input = 'AB′. Hello word' output = 'Hello word'
I want to write code that generalizes to any group of letters in any language. This is my code:
text = 'A. Hello word.'
text = re.sub(r'A. \w{1,2}.*', '', text)
text
output = llo word.
So I can replace 'A' with any other letter, but for some reason it isn't working well.
I also tried this one:
text = 'Ab. Hello word.'
text = re.sub(r'A+. \w{1,2}.*', '', text)
text
output = Ab. Hello word.
but it isn't working either.
Answer
Try this:
import re
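# Up to three non-dot characters at the start of a line, then a literal dot and any whitespace
regex = r"^[^.]{1,3}\.\s*"
test_str = ("AB. Hello word.\n"
            "AB′. Hello word.\n"
            "A. Hello word.\n"
            "Ab. Hello word.\n")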
subst = ""
# You can limit the number of replacements with the count argument (0 means replace all)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)
if result:
    print(result)
Output:
Hello word.
Hello word.
Hello word.
Hello word.
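The attempts in the question fail for two reasons: the letter 'A' is hard-coded, and an unescaped . matches any character rather than a literal dot. If you want to apply the same pattern to one string at a time, a minimal sketch could look like the following. The names LABEL and strip_label are made up for illustration, and the 1 to 3 character limit is just the assumption carried over from the pattern above.

```python
import re

# Same pattern as above: 1 to 3 non-dot characters at the start,
# then a literal dot and any trailing whitespace.
# LABEL and strip_label are hypothetical names for this sketch.
LABEL = re.compile(r"^[^.]{1,3}\.\s*")

def strip_label(text: str) -> str:
    """Remove a short leading label such as 'AB. ' if one is present."""
    return LABEL.sub("", text, count=1)

print(strip_label("AB. Hello word."))  # Hello word.
print(strip_label("AB′. Hello word"))  # Hello word
print(strip_label("Hello word."))      # Hello word. (no label, unchanged)
```

Because [^.] accepts any character except a dot, this works for letters in any script, but it will also strip a very short first sentence such as 'Hi. ', so adjust the {1,3} bound to fit your data.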