How to rename values in a column that contain a specific separator?

Question

The id column of my DataFrame contains values such as big_val_167 or color_60/write_10. I want to remove everything from the first underscore followed by a number onward, so that these become big_val and color.

How can I do that? I know that str.replace() can be used, but I don't understand how to write the regular expression part of it.

Accepted Answer

You can use a regular expression (re.search) to find the first occurrence of _ followed by a digit, and then keep only the part of the string before that match.

Code:

    import re
    import pandas as pd

    def fix_id(value):
        # Find the first occurrence of "_" followed by a digit in the id.
        digit_search = re.search(r"_\d", value)
        # Leave the value unchanged if it has no such suffix.
        if digit_search is None:
            return value
        # Keep everything before the match.
        return value[:digit_search.start()]

    # Your df
    df = pd.DataFrame({"id": ["big_val_167", "renv_100", "color_100", "color_60/write_10"],
                       "val": [80, 100, 200, 200]})
    df["id"] = df["id"].apply(fix_id)
    print(df)

Output:

            id  val
    0  big_val   80
    1     renv  100
    2    color  200
    3    color  200
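Since the question asks specifically about str.replace(), here is a minimal sketch of that route. It assumes the same sample data as above. The pattern `_\d.*$` is my own suggestion, not part of the accepted answer: it matches from the first underscore followed by a digit through the end of the string, and replaces that whole span with an empty string.

```python
import pandas as pd

# Same sample data as in the accepted answer.
df = pd.DataFrame({"id": ["big_val_167", "renv_100", "color_100", "color_60/write_10"],
                   "val": [80, 100, 200, 200]})

# "_\d"  : an underscore immediately followed by a digit
# ".*$"  : everything after it, up to the end of the string
# regex=True makes pandas treat the pattern as a regular expression.
df["id"] = df["id"].str.replace(r"_\d.*$", "", regex=True)

print(df)
```

Values without an underscore-plus-digit part are left untouched, because the pattern simply does not match them.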

