Convert multiple units to kg in pandas

Let’s say I have this code (which is obviously wrong):

def conditions(df):
    value = df["weight"]
    unit = df["weight_unit"]
    if(unit.lower() == "pound"):
        return value / 2.2
    elif(unit.lower() == "metric ton"):
        return value * 1000
    elif(unit.lower() == "long ton"):
        return value * 1016
    elif(unit.lower() in ("measurement ton", "short ton")):
        return value * 907

def convert_to_kilo(df):
    func = np.vectorize(conditions)
    to_kilo = func(df)
    df["weight"] = to_kilo
    df["weight_unit"] = "Kilograms"

I want to apply these conditions to each value in one column (weight) based on another column (weight_unit). Is there an efficient way to do this? Preferably one that accepts a function, so it is easy to modify.

Answer

Don’t use a row-wise function; it will be slow. numpy.vectorize does not run at C speed: it “pseudo-vectorizes” by calling the Python function once per element in an internal loop.
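
To make the point about numpy.vectorize concrete, here is a small sketch that counts how often the wrapped function runs. The helper to_kg, the call_log list and the sample arrays are made up for illustration and are not part of the question's code.

```python
import numpy as np

call_log = []  # records every Python-level call made by np.vectorize

def to_kg(value, unit):
    # Hypothetical scalar helper: handles one (value, unit) pair at a time.
    call_log.append(unit)
    return value / 2.2 if unit.lower() == "pound" else value

# otypes is given so NumPy does not make an extra call to infer the output type.
vec = np.vectorize(to_kg, otypes=[float])
out = vec(np.array([2.2, 4.4, 5.0]), np.array(["pound", "Pound", "kg"]))

# One Python call per element: the loop is still there, just hidden.
print(len(call_log), out)
```

With three input elements the function is called three times, so the cost grows with the number of rows just as it would with an explicit Python loop.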

Use map instead:

units = {'pound': 1/2.2, 'metric ton': 1000, 'long ton': 1016,
         'measurement ton': 907, 'short ton': 907,
        }

df['weight'] *= df['weight_unit'].str.lower().map(units)
df['weight_unit'] = 'Kilograms'
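
If you want something that is easy to modify, one option is to wrap the same map approach in a small function that takes the conversion table as a parameter. This is only a sketch: the function name convert_to_kilo is borrowed from the question, the factors parameter, the unknown-unit check and the sample data are assumptions added for illustration.

```python
import pandas as pd

# Conversion factors to kilograms, copied from the dict above.
UNITS_TO_KG = {
    "pound": 1 / 2.2,
    "metric ton": 1000,
    "long ton": 1016,
    "measurement ton": 907,
    "short ton": 907,
}

def convert_to_kilo(df, factors=UNITS_TO_KG):
    """Return a copy of df with 'weight' expressed in kilograms."""
    out = df.copy()
    factor = out["weight_unit"].str.lower().map(factors)
    # map() yields NaN for units missing from the table; fail loudly
    # here instead of silently writing NaN weights (an added assumption).
    if factor.isna().any():
        unknown = out.loc[factor.isna(), "weight_unit"].unique().tolist()
        raise ValueError(f"unknown weight units: {unknown}")
    out["weight"] = out["weight"] * factor
    out["weight_unit"] = "Kilograms"
    return out

# Made-up sample data, mixed case on purpose.
df = pd.DataFrame({
    "weight": [22.0, 3.0, 1.0, 2.0],
    "weight_unit": ["Pound", "metric ton", "Long Ton", "short ton"],
})
result = convert_to_kilo(df)
print(result)
```

Adding a unit then only means adding an entry to the dict, and the multiplication itself stays vectorized.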
User contributions licensed under: CC BY-SA