Not finding a good regex pattern to substitute strings in the correct order (Python)

I have a list of column names that are in string format like below:

lst = ["plug", "[plug+wallet]", "(wallet-phone)"]

Now I want to wrap each column name in df['...'] using regex. My attempt works for plain names, but when the list contains a string like (wallet-phone), it produces df['(wallet']-df['phone)'] instead of (df['wallet']-df['phone']). Is my pattern wrong? Please see the code below:

import re
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
x=[]
y=[]
for l in lst:
    x.append(re.sub(r"([^+\-*/'\d]+)", r"'\1'", l))
for f in x:
    y.append(re.sub(r"('[^+\-*/'\d]+')", r'df[\1]', f))

print(x)
print(y)

gives:

x:["'plug'", "'[plug'+'wallet]'", "'(wallet'-'phone)'"]
y:["df['plug']", "df['[plug']+df['wallet]']", "df['(wallet']-df['phone)']"]

Is the pattern wrong? Expected output:

x:["'plug'", "['plug'+'wallet']", "('wallet'-'phone')"]
y:["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]

I also tried the pattern ([^+\-*/()[]'\d]+), but it still doesn't exclude () or [].

Answer

It might be easier to match the words themselves and wrap each one in a df['...'] lookup, which leaves the brackets and operators untouched:

import re
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]

z = [re.sub(r"(\w+)", r"df['\1']", w) for w in lst]

print(z)
["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]
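
One caveat: \w+ also matches digits, so a numeric constant in an expression such as plug*2 would be wrapped as well. The sketch below is one possible variation, not part of the original answer. It assumes column names look like Python identifiers (a letter or underscore followed by word characters). The NAME pattern and the extra "plug*2" entry are added here purely for illustration. It also produces both the quoted list x and the df[...] list y from the question.

```python
import re

# "plug*2" is an extra, hypothetical entry added to show the numeric case.
lst = ["plug", "[plug+wallet]", "(wallet-phone)", "plug*2"]

# Assumed rule: a name starts with a letter or underscore, then word characters.
# The \b anchors keep the pattern from matching inside tokens such as 2e5.
NAME = re.compile(r"\b[A-Za-z_]\w*\b")

# \g<0> in the replacement refers to the whole match.
x = [NAME.sub(r"'\g<0>'", s) for s in lst]
y = [NAME.sub(r"df['\g<0>']", s) for s in lst]

print(x)
print(y)
```

With this input, the brackets, operators and the constant 2 are left as they are, and only the names are wrapped. If real column names contain spaces or other punctuation, the pattern would need to be adjusted.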
User contributions licensed under: CC BY-SA