
How to remove all non-numeric characters (except operators) from a string in Python?

I would like to remove all non-numeric characters from a string, except operators such as +, -, *, /, and then evaluate the result. For example, if the input is 'What is 2+2?', the result should be '2+2', keeping only the operator symbols and the digits.

How can I do this in Python? I tried this so far, but can it be improved?

def evaluate(splitted_cm):
    try:
        splitted_cm = splitted_cm.replace('x', '*').replace('?', '')
        digs = [x.isdigit() for x in splitted_cm]
        t = [i for i, x in enumerate(digs) if x]
        answer = eval(splitted_cm[t[0]:t[-1] + 1])
        return str(answer)

    except Exception as err:
        print(err)


Answer

You can use regex and re.sub() to make substitutions.

For example:

import re
expression = re.sub(r"[^\d+\-*/÷%]", "", text)

will eliminate every character that is not a digit or one of + - * / ÷ %. It is up to you to make a comprehensive list of the operators you want to keep.
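
As a minimal sketch of that substitution, assuming you only want to keep digits and the operators + - * / %, something like the following should work. The helper name strip_to_expression is made up for illustration. Note that inside a character class the hyphen is escaped (\-) so it is treated as a literal character and not as a range, and \d is written in a raw string so the backslash survives.

```python
import re

# Assumed operator set: + - * / % (extend the class if you need more).
# Everything that is NOT a digit or one of these operators gets removed.
NOT_ALLOWED = re.compile(r"[^\d+\-*/%]")

def strip_to_expression(text):
    """Hypothetical helper: keep only digits and the listed operators."""
    return NOT_ALLOWED.sub("", text)

print(strip_to_expression("What is 2+2?"))  # 2+2
print(strip_to_expression("12 x 3 = ?"))    # 123
```

The second call shows a limitation: the x is dropped like any other letter, so a replacement such as x to * has to happen before stripping. Stray hyphens in ordinary words also survive as minus signs, so the result is not guaranteed to be a valid expression.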

That said, I’m going to quote @KarlKnechtel’s comment verbatim:

Do not use eval() for anything that could possibly receive input from outside the program in any form. It is a critical security risk that allows the creator of that input to execute arbitrary code on your computer.
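
One possible way to follow that advice is to parse the expression with the standard ast module and walk the tree yourself, accepting only numbers and a fixed set of operators. This is a rough sketch under those assumptions, not a complete calculator; the name safe_eval and the chosen operator set are illustrative.

```python
import ast
import operator

# Assumed whitelist: map supported AST operator nodes to plain functions.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_eval(expression):
    """Hypothetical helper: evaluate numbers combined with + - * / % only."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Accept only int and float literals (this excludes bool and str).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Names, calls, attribute access, etc. are rejected outright.
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval"))

print(safe_eval("2+2"))    # 4
print(safe_eval("7-3*2"))  # 1
```

Malformed input raises SyntaxError from ast.parse, and anything outside the whitelist (for example a function call) raises ValueError, so the caller can catch both and report a bad expression.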
