Using Python and regex, could you please help me convert
First sentence.Second sentence. Third 3.4 sentence3.Fourth sentence. Fifth5. Sixth.
to
First sentence. Second sentence. Third 3.4 sentence3. Fourth sentence. Fifth5. Sixth.
i.e., a space needs to be inserted after a period when at least one of the characters on either side of it is a letter. A space need NOT be inserted if both sides of the period are digits. Please help.
Thanks in advance.
Answer
You can use a negative lookahead assertion to make sure the period is not followed by whitespace, the end of the string ($), or a digit.
You can then use re.sub to replace each such period with a period followed by a space.
>>> import re
>>> text = 'First sentence.Second sentence. Third 3.4 sentence3.Fourth sentence. Fifth5. Sixth.'
>>> re.sub(r'\.(?!\s|\d|$)', '. ', text)
'First sentence. Second sentence. Third 3.4 sentence3. Fourth sentence. Fifth5. Sixth.'
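Note that the lookahead in the answer skips every period that is followed by a digit, even when the character before the period is a letter (for example 'a.5'). If the rule from the question should be applied strictly, so that a space is left out only when both neighbours are digits, a variant along these lines might work. This is a minimal sketch, not part of the original answer; the names PERIOD_NEEDING_SPACE and add_space_after_periods are made up for illustration.

```python
import re

# Hypothetical names, introduced only for this sketch.
# Alternative 1: a period NOT preceded by a digit and followed by any
#                non-whitespace character.
# Alternative 2: a period followed by a character that is neither a digit
#                nor whitespace.
# Together they leave alone periods that sit between two digits, that are
# already followed by whitespace, or that end the string.
PERIOD_NEEDING_SPACE = re.compile(r'(?<!\d)\.(?=\S)|\.(?=[^\d\s])')

def add_space_after_periods(text):
    """Insert a space after each period matched by PERIOD_NEEDING_SPACE."""
    return PERIOD_NEEDING_SPACE.sub('. ', text)

text = 'First sentence.Second sentence. Third 3.4 sentence3.Fourth sentence. Fifth5. Sixth.'
print(add_space_after_periods(text))
print(add_space_after_periods('Version a.5 costs 3.4 units.'))
```

On the sample text from the question this should give the same result as the answer's pattern. The two differ only for inputs such as 'a.5', where this variant inserts a space and the original leaves the text unchanged. Runs of periods such as '...' are not handled specially by either pattern.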