My DataFrame has the following columns, which hold the pressure and the corresponding volume measured for different samples, e.g. s1_p: pressure for sample 1 and s1_nv: corresponding volume for the same sample. I want to plot all volume columns on the x-axis and all pressure columns on the y-axis of a single plot (not subplots), with the legend labelled by sample number.
df =
        s1_p     s1_nv      s9_p     s9_nv     s21_p    s21_nv     s26_p    s26_nv     s32_p    s32_nv     s37_p    s37_nv     s49_p    s49_nv     s52_p    s52_nv    s105_p   s105_nv    s118_p   s118_nv
 0  0.977966  0.000544  0.928902  0.000000  1.140129  0.000000  1.002083  0.000000  0.958008  0.000000  1.301460  0.000000  0.964661  0.000000  0.976303  0.001193  1.002914  0.000246  1.008736  0.000129
 1  1.022041  0.001087  0.953850  0.000000  1.175056  0.000153  1.079422  0.000208  0.980461  0.001955  1.328903  0.000000  0.986282  0.000000  1.004578  0.003279  1.034515  0.000246  1.038673  0.000385
 2  1.050316  0.001268  0.984619  0.000000  1.204163  0.000153  1.140961  0.000208  1.012062  0.002557  1.357178  0.000000  1.015388  0.000125  1.031189  0.004621  1.056137  0.000246  1.061127  0.000513
 3  1.082748  0.001268  1.010399  0.000261  1.224953  0.000153  1.249901  0.000208  1.029526  0.002557  1.382958  0.000191  1.033684  0.000125  1.062790  0.004770  1.085243  0.000493  1.094391  0.000513
 4  1.109360  0.001268  1.031189  0.000261  1.247406  0.000153  1.314766  0.000208  1.075264  0.003159  1.407074  0.000381  1.066948  0.000125  1.097717  0.004770  1.136803  0.000493  1.130981  0.000513
 5  1.127655  0.001268  1.056969  0.000261  1.277344  0.000306  1.459465  0.000417  1.130150  0.003460  1.446159  0.000381  1.113518  0.000250  1.138466  0.004919  1.160919  0.000739  1.149277  0.000641
 6  1.160087  0.001268  1.086075  0.000261  1.302292  0.000459  1.629112  0.000624  1.150108  0.003610  1.472771  0.000381  1.140129  0.000250  1.160088  0.005068  1.225784  0.000739  1.177551  0.000898
 7  1.209152  0.001268  1.117676  0.000392  1.328072  0.000459  1.658218  0.000624  1.171730  0.003911  1.514351  0.000571  1.209984  0.000250  1.212479  0.005217  1.293144  0.000739  1.247406  0.000898
 8  1.259048  0.001268  1.151772  0.000392  1.370483  0.000612  1.748863  0.000624  1.249069  0.005114  1.555100  0.000571  1.278175  0.000250  1.270691  0.005217  1.372978  0.000739  1.310608  0.000898
 9  1.283165  0.001268  1.180878  0.000392  1.399590  0.000612  1.920174  0.000624  1.290649  0.005415  1.575890  0.000571  1.297302  0.000375  1.379631  0.005217  1.420380  0.000986  1.334724  0.000898
10  1.362167  0.001268  1.227448  0.000392  1.426201  0.000612  2.064041  0.000833  1.333893  0.005716  1.602501  0.000761  1.351357  0.000500  1.466949  0.005217  1.592522  0.001232  1.507698  0.001283
11  1.446991  0.001449  1.278175  0.000392  1.475266  0.000612  2.252815  0.000833  1.434517  0.006919  1.635765  0.000761  1.385452  0.000500  1.636597  0.005664  1.757179  0.001232  1.666534  0.001796
12  1.473602  0.001630  1.297302  0.000522  1.541794  0.000765  2.432442  0.000833  1.603333  0.010077  1.698967  0.000761  1.518509  0.000625  1.802917  0.005664  1.778801  0.001726  1.698967  0.001796
13  1.667366  0.001630  1.316429  0.000522  1.639923  0.000765  2.614563  0.000833  1.626617  0.010077  1.790444  0.000761  1.693977  0.000750  1.840340  0.005664  1.800423  0.002218  1.870277  0.002181
14  1.837845  0.001630  1.344704  0.000652  1.712273  0.000919  2.812485  0.000833  1.809570  0.010679  1.828697  0.000761  1.715599  0.000750  1.972565  0.006111  1.988365  0.002958  2.044083  0.002181
15  2.042419  0.001630  1.412063  0.000783  1.861130  0.000919  2.984627  0.000833  1.831192  0.010679  1.856972  0.000761  1.876098  0.000750  2.142212  0.006410  2.167160  0.002958  2.083168  0.002438
16  2.222878  0.001630  1.476929  0.000783  2.029114  0.001531  3.014565  0.001041  2.003334  0.011732  1.964249  0.000951  2.058220  0.001000  2.173813  0.006410  2.209572  0.003204  2.250320  0.002566
17  2.256142  0.001630  1.497719  0.000913  2.052398  0.001531  3.169243  0.001041  2.026619  0.011882  2.134727  0.000951  2.265290  0.001125  2.325165  0.006708  2.385040  0.003451  2.417473  0.002695
18  2.422463  0.001630  1.672356  0.001305  2.163834  0.001531  3.354691  0.001041  2.198761  0.013687  2.299385  0.001142  2.439095  0.001125  2.495644  0.007005  2.556351  0.003697  2.449905  0.002695
The following code does the job:
S1_P = df['s1_p']
S1_V = df['s1_nv']
# (similarly for the other samples)
plt.plot(S1_P, S1_V, color='r', label='S1')
plt.plot(S9_P, S9_V, color='g', label='S9')
plt.plot(S21_P, S21_V, color='g', label='S21')
The problem is that I have to extract every column as a separate Series and then repeat the plot call for each sample.
I also tried df.plot(x=["s1_p", 's9_p', 's21_p'] y=["s1_v", 's9_v', 's21_v']), but it raised an error.
I want to automate the process so that I don't have to reference each column individually.
Any suggestions for plotting the data in a single plot using seaborn or matplotlib?
Answer
Starting from the dataframe you provided, the simplest way I know to draw the plot you want is to reshape the dataframe from its wide layout (one pair of columns per sample) into a long layout (one row per measurement) and then plot it.
Dataframe re-shaping
You need to reshape your data into a dataframe with 3 columns: sample, pressure and volume. To do so, I save the data in a new dataframe DF:
# Unique sample numbers, taken from column names such as 's1_p' and 's1_nv'
samples = list(set([col.replace('s', '').replace('_p', '').replace('_nv', '') for col in df.columns]))

frames = []
for sample in samples:
    df_tmp = pd.DataFrame()
    for col in df.columns:
        if f's{sample}_' in col:
            df_tmp['sample'] = len(df[col])*[sample]
            if col.endswith('p'):
                df_tmp['pressure'] = df[col]
            else:
                df_tmp['volume'] = df[col]
    frames.append(df_tmp)

# DataFrame.append was removed in pandas 2.0, so the pieces are joined with pd.concat
DF = pd.concat(frames)
# Sort numerically by sample, then convert back to str so seaborn treats it as a category
DF['sample'] = DF['sample'].astype(int)
DF = DF.sort_values(by = 'sample', ignore_index = True)
DF['sample'] = DF['sample'].astype(str)
   sample  pressure    volume
0       1  1.127655  0.001268
1       1  0.977966  0.000544
2       1  1.022041  0.001087
3       1  1.050316  0.001268
4       1  1.082748  0.001268
5       1  1.109360  0.001268
6       1  1.160087  0.001268
7       1  1.209152  0.001268
8       1  1.283165  0.001268
9       1  1.259048  0.001268
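The reshaping step can also be written more compactly. The following is only a sketch, assuming every sample has exactly one s<N>_p column and one matching s<N>_nv column. The helper name to_long is hypothetical, and the small dataframe holds just the first three rows of three samples from the question so that the example runs on its own.

```python
import re

import pandas as pd

# Illustrative data: the first three rows of three samples from the question.
df = pd.DataFrame({
    "s1_p":   [0.977966, 1.022041, 1.050316],
    "s1_nv":  [0.000544, 0.001087, 0.001268],
    "s9_p":   [0.928902, 0.953850, 0.984619],
    "s9_nv":  [0.000000, 0.000000, 0.000000],
    "s21_p":  [1.140129, 1.175056, 1.204163],
    "s21_nv": [0.000000, 0.000153, 0.000153],
})


def to_long(wide: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: stack each s<N>_p / s<N>_nv pair into long rows."""
    # Collect sample numbers in order of first appearance of a pressure column.
    numbers = []
    for col in wide.columns:
        match = re.fullmatch(r"s(\d+)_p", col)
        if match:
            numbers.append(match.group(1))

    # One small frame per sample; the scalar sample label is broadcast to all rows.
    frames = [
        pd.DataFrame({
            "sample": n,
            "pressure": wide[f"s{n}_p"],
            "volume": wide[f"s{n}_nv"],
        })
        for n in numbers
    ]
    return pd.concat(frames, ignore_index=True)


DF = to_long(df)
print(DF)
```

Because the samples are collected in column order, no sorting step is needed here, and the rows of each sample keep their original order.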
Complete Code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv(r'data/data.csv')

# Unique sample numbers, taken from column names such as 's1_p' and 's1_nv'
samples = list(set([col.replace('s', '').replace('_p', '').replace('_nv', '') for col in df.columns]))

frames = []
for sample in samples:
    df_tmp = pd.DataFrame()
    for col in df.columns:
        if f's{sample}_' in col:
            df_tmp['sample'] = len(df[col])*[sample]
            if col.endswith('p'):
                df_tmp['pressure'] = df[col]
            else:
                df_tmp['volume'] = df[col]
    frames.append(df_tmp)

# DataFrame.append was removed in pandas 2.0, so the pieces are joined with pd.concat
DF = pd.concat(frames)
DF['sample'] = DF['sample'].astype(int)
DF = DF.sort_values(by = 'sample', ignore_index = True)
DF['sample'] = DF['sample'].astype(str)

fig, ax = plt.subplots()
sns.scatterplot(ax = ax, data = DF, x = 'volume', y = 'pressure', hue = 'sample')
plt.show()
Plot
Now you can plot your data, for example with seaborn.scatterplot:
fig, ax = plt.subplots()
sns.scatterplot(ax = ax, data = DF, x = 'volume', y = 'pressure', hue = 'sample')
plt.show()
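If you would rather stay with plain matplotlib and skip the reshaping, a loop over the pressure columns may be enough. This is a minimal sketch, assuming each s<N>_p column has a matching s<N>_nv column. The example dataframe is illustrative and contains only the first three rows of two samples from the question.

```python
import re

import matplotlib.pyplot as plt
import pandas as pd

# Illustrative data: the first three rows of two samples from the question.
df = pd.DataFrame({
    "s1_p":   [0.977966, 1.022041, 1.050316],
    "s1_nv":  [0.000544, 0.001087, 0.001268],
    "s32_p":  [0.958008, 0.980461, 1.012062],
    "s32_nv": [0.000000, 0.001955, 0.002557],
})

fig, ax = plt.subplots()
for col in df.columns:
    match = re.fullmatch(r"s(\d+)_p", col)
    if match is None:
        continue  # not a pressure column; volume columns are looked up by name below
    number = match.group(1)
    # Volume on the x-axis, pressure on the y-axis, one legend entry per sample.
    ax.plot(df[f"s{number}_nv"], df[col], marker="o", label=f"S{number}")

ax.set_xlabel("volume")
ax.set_ylabel("pressure")
ax.legend(title="sample")
```

Call plt.show() afterwards to display the figure. Matplotlib assigns a different color to each line from its default color cycle, so no explicit color argument is needed.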