Question on regex not performing as expected

I am trying to normalize company suffixes to a common form, for example converting both Limited and Limiteed to LTD.

Here is my code:

re.sub(r"\s+?(CORPORATION|CORPORATE|CORPORATIO|CORPORATTION|CORPORATIF|CORPORATI|CORPORA|CORPORATN)", r" CORP", 'ABC CORPORATN')

I’m trying 'ABC CORPORATN' and it’s not converting it to CORP. I can’t see what the issue is. Any help would be great.

Edit: I have tried the other endings that I included in the regex and they all work except for CORPORATN (the one I mentioned above).

Answer

I see that all the patterns begin with "CORPORA", so we can just match that prefix followed by any remaining word characters:

import re
print(re.sub(r"CORPORA\w+", "CORP", 'ABC CORPORATN'))

Output:

ABC CORP
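
As for why the original pattern misses CORPORATN: a likely explanation, assuming the pattern is meant to start with \s+?, is that alternation is tried left to right. CORPORA is listed before CORPORATN, so it matches first and the trailing TN is left behind. The sketch below shows that behaviour and one possible fix, a trailing \b that forces the match to end at a word boundary. The ALTERNATIVES name is only for illustration.

```python
import re

# The alternatives exactly as listed in the question (name is illustrative).
ALTERNATIVES = (
    "CORPORATION|CORPORATE|CORPORATIO|CORPORATTION|"
    "CORPORATIF|CORPORATI|CORPORA|CORPORATN"
)

# Original ordering: CORPORA is tried before CORPORATN and wins,
# so only " CORPORA" is replaced and "TN" stays in the string.
original = re.sub(r"\s+?(" + ALTERNATIVES + ")", " CORP", "ABC CORPORATN")
print(original)  # ABC CORPTN

# With \b the match must end at a word boundary. CORPORA is followed by "T",
# so it fails there and the engine moves on to CORPORATN.
fixed = re.sub(r"\s+?(" + ALTERNATIVES + r")\b", " CORP", "ABC CORPORATN")
print(fixed)  # ABC CORP
```

Reordering the alternatives so longer ones come first would also work, but the word boundary does not depend on the order of the list.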

The same goes for the possible spellings of limited; if they all begin with "Limit", you can do:

import re
print(re.sub(r"Limit\w+", "LTD", 'Shoe Shop Limited.'))

Output:

Shoe Shop LTD.
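
If several suffixes need the same treatment, the two substitutions can be folded into one small helper. This is only a sketch: SUFFIX_RULES and normalize_suffix are hypothetical names, and the use of \b, \w* and re.IGNORECASE is an assumption about what the data needs (whole-word prefixes, a bare prefix such as CORPORA, and mixed case).

```python
import re

# Hypothetical rule table: (compiled prefix pattern, normalized suffix).
# \b anchors the prefix at the start of a word; \w* also accepts the bare prefix.
SUFFIX_RULES = [
    (re.compile(r"\bCORPORA\w*", re.IGNORECASE), "CORP"),
    (re.compile(r"\bLIMIT\w*", re.IGNORECASE), "LTD"),
]

def normalize_suffix(name):
    """Apply every rule in turn and return the normalized company name."""
    for pattern, replacement in SUFFIX_RULES:
        name = pattern.sub(replacement, name)
    return name

print(normalize_suffix("ABC CORPORATN"))       # ABC CORP
print(normalize_suffix("Shoe Shop Limiteed"))  # Shoe Shop LTD
```

Note that a prefix match is greedy about meaning as well as characters: a name containing a word such as Corporal or Limitless would also be rewritten, so an explicit list of spellings may be safer for messy data.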
User contributions licensed under: CC BY-SA