
Extract Value From Pandas Dataframe Based On Condition in Another Column

I am trying to write code that extracts the power price at the moment a power plant starts up. As an example, consider the following data frame.

import pandas as pd

data = {
  'Power_Price':  [10, 11,15, 33, 50, 10, 12, 20, 17],
  'Plant_Ops_1': [0, 0, 10, 10, 10, 0, 0, 10, 10],
  'Plant_Ops_2': [0, 0, 0, 50, 50, 0, 0, 0, 0]
}

df = pd.DataFrame (data, columns = ['Power_Price','Plant_Ops_1','Plant_Ops_2'])

Based on this, I am aiming to write code that stores in a dataframe the power price at each point where a plant ops column transitions from 0 to a number greater than 0 (i.e. when the power plant starts). For the data above, the output would look something like:

data_out = {
  'Plant': ['Plant_Ops_1', 'Plant_Ops_1', 'Plant_Ops_2'],
  'Power_price': [15, 20, 33]
}

df_out = pd.DataFrame (data_out, columns = ['Plant','Power_price'])

Hopefully this makes sense. Certainly welcome any advice or guidance you are able to provide.


Answer

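Use DataFrame.melt to reshape the plant columns into long format, then use boolean indexing to keep only the rows where the previous value within the same plant (obtained with a per-group shift) equals 0 and the current value is greater than 0: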

df = df.melt('Power_Price', var_name='Plant')

df = df[df.groupby('Plant')['value'].shift().eq(0) & df['value'].gt(0)].drop('value',axis=1)
print (df)
    Power_Price        Plant
2            15  Plant_Ops_1
7            20  Plant_Ops_1
12           33  Plant_Ops_2
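
To make the filter above easier to follow, here is a minimal sketch that rebuilds the question's data and splits the condition into named steps. The names wide, melted, prev_value, was_off, is_on and start_mask are illustrative only and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
data = {
    'Power_Price': [10, 11, 15, 33, 50, 10, 12, 20, 17],
    'Plant_Ops_1': [0, 0, 10, 10, 10, 0, 0, 10, 10],
    'Plant_Ops_2': [0, 0, 0, 50, 50, 0, 0, 0, 0],
}
wide = pd.DataFrame(data)

# Long format: one row per (price, plant) pair, with the plant output in 'value'.
melted = wide.melt('Power_Price', var_name='Plant')

# Previous reading for the same plant. The first row of each plant
# has no predecessor, so its shifted value is NaN.
prev_value = melted.groupby('Plant')['value'].shift()

was_off = prev_value.eq(0)      # plant was at 0 in the previous row
is_on = melted['value'].gt(0)   # plant is above 0 in the current row
start_mask = was_off & is_on    # True only on a 0 -> positive transition

# Show the intermediate columns side by side for inspection.
print(melted.assign(prev_value=prev_value, start=start_mask))
```

Because the shifted value of each plant's first row is NaN, and NaN is not equal to 0, a plant that is already running in the very first row is not reported as a start-up.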

Finally, if necessary, change the order of the columns:

df = df[["Plant", "Power_Price"]]
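
The steps above could be combined into one reusable helper, sketched below. The function name startup_prices, its price_col parameter, and the choice to reset the index are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd


def startup_prices(df, price_col='Power_Price'):
    """Return the price at each 0 -> positive transition of every plant column.

    Hypothetical helper: assumes every column other than price_col
    holds plant output readings.
    """
    long_df = df.melt(price_col, var_name='Plant')
    # Previous reading within the same plant (NaN for each plant's first row).
    prev = long_df.groupby('Plant')['value'].shift()
    starts = long_df[prev.eq(0) & long_df['value'].gt(0)]
    # Put 'Plant' first and renumber the rows from 0.
    return starts[['Plant', price_col]].reset_index(drop=True)


# Sample data copied from the question.
df_wide = pd.DataFrame({
    'Power_Price': [10, 11, 15, 33, 50, 10, 12, 20, 17],
    'Plant_Ops_1': [0, 0, 10, 10, 10, 0, 0, 10, 10],
    'Plant_Ops_2': [0, 0, 0, 50, 50, 0, 0, 0, 0],
})

result = startup_prices(df_wide)
print(result)
```

Resetting the index discards the row labels from the melted frame (2, 7 and 12 in the output above); leave out reset_index if those labels are needed to trace rows back.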
User contributions licensed under: CC BY-SA