Let’s say I have this code (which is obviously wrong):
def conditions(df):
    value = df["weight"]
    unit = df["weight_unit"]
    if(unit.lower() == "pound"):
        return value / 2.2
    elif(unit.lower() == "metric ton"):
        return value * 1000
    elif(unit.lower() == "long ton"):
        return value * 1016
    elif(unit.lower() in ("measurement ton", "short ton")):
        return value * 907

def convert_to_kilo(df):
    func = np.vectorize(conditions)
    to_kilo = func(df)
    df["weight"] = to_kilo
    df["weight_unit"] = "Kilograms"
I want to apply a condition like this to each value in one column (the weights) based on another column (the weight unit). Is there an efficient way to do it? Preferably one that lets me pass in a function, so it is easy to modify.
Answer
Don’t use a per-value function; this will be slow. numpy.vectorize does not vectorize at C speed, but rather “pseudo-vectorizes” using an internal loop. Use map instead:
units = {
    'pound': 1/2.2,
    'metric ton': 1000,
    'long ton': 1016,
    'measurement ton': 907,
    'short ton': 907,
}

df['weight'] *= df['weight_unit'].str.lower().map(units)
df['weight_unit'] = 'Kilograms'
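Since the question asks for something easy to modify, the same map approach could be wrapped in a small helper that takes the factor table as an argument. This is only a sketch: the names `UNITS_TO_KG` and `factors` are made up for illustration, and leaving rows with an unrecognised unit untouched is an assumption of mine (a plain map would turn their weights into NaN instead).

```python
import pandas as pd

# Conversion factors to kilograms, copied from the answer above.
# UNITS_TO_KG is a hypothetical name, not part of the original answer.
UNITS_TO_KG = {
    "pound": 1 / 2.2,
    "metric ton": 1000,
    "long ton": 1016,
    "measurement ton": 907,
    "short ton": 907,
}


def convert_to_kilo(df, factors=UNITS_TO_KG):
    """Return a copy of df with known units converted to kilograms."""
    # One factor per row; units missing from `factors` map to NaN.
    factor = df["weight_unit"].str.lower().map(factors)
    known = factor.notna()

    out = df.copy()
    # Assumption: rows with an unknown unit keep their weight (factor 1).
    out["weight"] = out["weight"] * factor.fillna(1)
    # Relabel only the rows that were actually converted.
    out.loc[known, "weight_unit"] = "Kilograms"
    return out


df = pd.DataFrame({
    "weight": [22.0, 3.0, 2.0, 5.0],
    "weight_unit": ["Pound", "metric ton", "Short Ton", "stone"],
})
result = convert_to_kilo(df)
print(result)
```

To change or add a unit, edit the dictionary or pass a different one as `factors`; the lookup still runs as a single vectorized map rather than a Python call per row.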