I have a dataframe with two columns. The power column holds the power consumption of the system. The component_status column splits the data in two, depending on whether the component is OFF or ON: a value of 153 means the component is ON, and a value of 150 means it is OFF.
The result I am looking for is a single plot with three boxes, drawn with sns.boxplot. One shows the power consumption for all the data, called “TOTAL”. The other two show the power consumption when the component was ON or OFF, called “COMPONENT = ON” and “COMPONENT = OFF”.
The data frame example is as follows:
power | component_status
0.5 | 150
1.5 | 150
2.5 | 150
0.3 | 153
0.5 | 153
1.5 | 153
2.5 | 150
0.3 | 153
Thanks for the help.
Answer
Your first step is to build your data frame with the conditions. There are a few ways to go about this.
- Let’s start with an initial df1 (dataframe #1) as you have given. Then, let’s add a condition column that says “Total” for every row. You can use print(df1) to see what this looks like.
- Then let’s copy that dataframe into df2, and replace the condition values with the off/on criteria from component_status.
- Our final dataframe df is just a concatenation of df1 and df2.
- Now we have a dataframe df that is ready to go in Seaborn.
Imports and DataFrame
# Set up
import pandas as pd
import numpy as np
import seaborn as sns
power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
component_status = [150, 150, 150, 153, 153, 153, 150, 153]
df1 = pd.DataFrame(
    data=zip(power, component_status), columns=["power", "component_status"]
)
# Step 1
df1["condition"] = "Total"
# print(df1)
# Step 2
df2 = df1.copy()
df2["condition"] = np.where(df2["component_status"] == 153, "On", "Off")
# If you have several criteria, it can be easier to use np.select
# ... or just use Pandas directly:
# df2.loc[(df2['component_status'] == 153), 'condition'] = 'On'
# df2.loc[(df2['component_status'] == 150), 'condition'] = 'Off'
# Step 3
df = pd.concat([df1, df2])
df view
   power  component_status condition
0    0.5               150     Total
1    1.5               150     Total
2    2.5               150     Total
3    0.3               153     Total
4    0.5               153     Total
5    1.5               153     Total
6    2.5               150     Total
7    0.3               153     Total
0    0.5               150       Off
1    1.5               150       Off
2    2.5               150       Off
3    0.3               153        On
4    0.5               153        On
5    1.5               153        On
6    2.5               150       Off
7    0.3               153        On
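The comment in Step 2 mentions np.select for the case where you have several criteria. A minimal sketch of what that could look like on the same sample data is below. The "Unknown" fallback label is my own assumption for rows that match none of the masks; it is not something the question asks for.

```python
import numpy as np
import pandas as pd

power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
component_status = [150, 150, 150, 153, 153, 153, 150, 153]
df2 = pd.DataFrame({"power": power, "component_status": component_status})

# One boolean mask per label; the first mask that matches a row wins
conditions = [
    df2["component_status"] == 153,
    df2["component_status"] == 150,
]
labels = ["On", "Off"]

# Rows matching none of the masks get the (assumed) fallback label
df2["condition"] = np.select(conditions, labels, default="Unknown")
```

With only two status codes this is equivalent to the np.where line; it mainly pays off once a third or fourth status value shows up.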
Plotting
# Step 4
ax = sns.boxplot(data=df, x='condition', y='power')
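If you want the boxes labelled exactly as in the question and in a fixed left-to-right order, one option is to write those names into the condition column and pass them to the order parameter of sns.boxplot. This is only a sketch of that idea: the names base, total, by_status and box_order are mine, and the Agg backend line is there so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
component_status = [150, 150, 150, 153, 153, 153, 150, 153]
base = pd.DataFrame({"power": power, "component_status": component_status})

# Every row once under "TOTAL", and once under its ON/OFF label
total = base.assign(condition="TOTAL")
by_status = base.assign(
    condition=np.where(
        base["component_status"] == 153, "COMPONENT = ON", "COMPONENT = OFF"
    )
)
df = pd.concat([total, by_status], ignore_index=True)

# order fixes the left-to-right position of the three boxes
box_order = ["TOTAL", "COMPONENT = ON", "COMPONENT = OFF"]
ax = sns.boxplot(data=df, x="condition", y="power", order=box_order)
```

Without order, Seaborn places the boxes in the order the labels first appear in the data, which here would put OFF before ON.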
 
						