I am attempting to apply the sympy.solve function to each row in a DataFrame, using each column as a different variable. I have managed to solve for the undefined variable I need to calculate, λ, using the following code:
from sympy import symbols, Eq, solve
import math

Tmin = 33.2067
Tmax = 42.606
D = 19.5526
tmin = 6
tmax = 14
pi = math.pi

λ = symbols('λ')

lhs = D
# the right-hand side must be named rhs, since Eq(lhs, rhs) below refers to it
rhs = (((Tmax-Tmin)/2) * (λ-tmin) * (1-(2/pi))) + (((Tmax-Tmin)/2)*(tmax-λ)*(1+(2/pi))) - (((Tmax-Tmin)/2)*(tmax-tmin))
eq1 = Eq(lhs, rhs)

lam = solve(eq1)
print(lam)
However, I need to apply this function to every row in a DataFrame and output the result as its own column. The DataFrame is formatted as follows:
import pandas as pd

data = [[6, 14, 33.2067, 42.606, 19.5526],
        [6, 14, 33.4885, 43.0318, -27.9222]]
df = pd.DataFrame(data, columns=['tmin', 'tmax', 'Tmin', 'Tmax', 'D'])
I have searched for how to do this, but am not sure how to proceed. I managed to find similar questions wherein the answers discussed lambdifying the equation, but I wasn’t sure how to lambdify a two-sided equation and then apply it to my DataFrame, and my math skills aren’t strong enough to isolate λ and place it on the left side of the equation so that I don’t have to lambdify a two-sided equation. Any help here would be appreciated.
Answer
You can use the apply method of a DataFrame to apply a numerical function to each row. But first we need to create the numerical function, which we are going to do with lambdify:
from sympy import symbols, Eq, solve, pi, lambdify
import pandas as pd

# create the necessary symbols
Tmin, Tmax, D, tmin, tmax = symbols("T_min, T_max, D, t_min, t_max")
λ = symbols('λ')

# create the equation and solve for λ
lhs = D
rhs = (((Tmax-Tmin)/2) * (λ-tmin) * (1-(2/pi))) + (((Tmax-Tmin)/2)*(tmax-λ)*(1+(2/pi))) - (((Tmax-Tmin)/2)*(tmax-tmin))
eq1 = Eq(lhs, rhs)

# solve eq1 for λ: take the first (and only) solution
λ_expr = solve(eq1, λ)[0]
print(λ_expr)
# out: (-pi*D + T_max*t_max + T_max*t_min - T_min*t_max - T_min*t_min)/(2*(T_max - T_min))

# convert λ_expr to a numerical function so that it can
# be quickly evaluated. Essentially, this creates:
#     λ_func(tmin, tmax, Tmin, Tmax, D)
# NOTE: for simplicity, let's order the symbols like the
# columns of the dataframe
λ_func = lambdify([tmin, tmax, Tmin, Tmax, D], λ_expr)

# We are going to use df.apply to apply the function to each row
# of the dataframe. However, pandas will pass the current row
# as a single argument. For example:
#     row = [val_tmin, val_tmax, val_Tmin, val_Tmax, val_D]
# Hence, we need a wrapper function to unpack the row into the
# arguments required by λ_func
wrapper_func = lambda row: λ_func(*row)

# create the dataframe
data = [[6, 14, 33.2067, 42.606, 19.5526],
        [6, 14, 33.4885, 43.0318, -27.9222]]
df = pd.DataFrame(data, columns=['tmin', 'tmax', 'Tmin', 'Tmax', 'D'])

# apply the function to each row
print(df.apply(wrapper_func, axis=1))
# 0     6.732400044759731
# 1    14.595903848357743
# dtype: float64
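Since the question asks for the result as its own column, here is a possible variation rather than the only way to do it. It is a minimal sketch, assuming the lambdified function is built on the NumPy backend so it can take whole columns at once instead of one row at a time. The names lam, lam_expr, lam_func, half_range and the column name "lam" are my own choices for illustration and are not from the code above.

```python
import pandas as pd
from sympy import symbols, Eq, solve, pi, lambdify

# symbols for the five DataFrame columns plus the unknown (named "lam" here)
tmin, tmax, Tmin, Tmax, D, lam = symbols("t_min t_max T_min T_max D lam")

# same equation as above, with the repeated factor pulled out for readability
half_range = (Tmax - Tmin) / 2
rhs = (half_range * (lam - tmin) * (1 - 2 / pi)
       + half_range * (tmax - lam) * (1 + 2 / pi)
       - half_range * (tmax - tmin))

# the equation is linear in lam, so solve returns a single solution
lam_expr = solve(Eq(D, rhs), lam)[0]

# build a NumPy-backed function; argument order matches the column order
lam_func = lambdify([tmin, tmax, Tmin, Tmax, D], lam_expr, modules="numpy")

data = [[6, 14, 33.2067, 42.606, 19.5526],
        [6, 14, 33.4885, 43.0318, -27.9222]]
df = pd.DataFrame(data, columns=["tmin", "tmax", "Tmin", "Tmax", "D"])

# pass whole columns; the arithmetic is applied element-wise,
# and the result is stored as a new column
df["lam"] = lam_func(df["tmin"], df["tmax"], df["Tmin"], df["Tmax"], df["D"])
print(df)
```

This should give the same values as the row-wise apply. On a large DataFrame it will likely be faster too, because the expression is evaluated once over entire columns rather than once per row.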