Automatic mean of multiple columns in Python

Question

I have a dataset with multiple variables. I am trying to group these variables according to the end of each variable's name and calculate the mean of each group. Here is an example of my dataset: What I am trying to do is group the variables that end with the same number, e.g.: [AST_0-01, AST_1-01, AST_2-01], [AST_0-02, AST_1-02,

Accepted Answer

First, "transpose" your dataframe so that you can group by the string names:

    In [3]: df = df.T.reset_index()

    In [4]: df
    Out[4]:
          index  0  1  2
    0  AST_0-01  1  2  3
    1  AST_0-02  4  5  6
    2  AST_1-01  7  8  9
    3  AST_1-02  1  2  3
    4  AST_2-01  4  5  6
    5  AST_2-02  7  8  9

    In [5]: df.groupby(df["index"].str[-2:]).mean()
    Out[5]:
             0    1    2
    index
    01     4.0  5.0  6.0
    02     4.0  5.0  6.0

This mean is broken out by the three separate rows of the original dataframe (columns 0, 1 and 2 here). If you want the "total" for each group instead, sum across those columns:

    In [6]: df.groupby(df["index"].str[-2:]).mean().sum(axis=1)
    Out[6]:
    index
    01    15.0
    02    15.0
    dtype: float64

Note that this adds up the three per-row means rather than averaging them.
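For reference, the same grouping can be sketched as a self-contained script. The sample data below is an assumption, reconstructed from the output shown in the answer, since the original dataset is not included in the question. This variant groups the transposed frame directly by an array of name suffixes instead of calling reset_index(), so no string column is left in the frame when .mean() runs; recent pandas versions may raise a TypeError when asked to average a string column.

```python
import pandas as pd

# Assumed sample data: three rows, six columns named AST_<k>-<suffix>,
# chosen to match the values printed in the answer.
df = pd.DataFrame(
    {
        "AST_0-01": [1, 2, 3],
        "AST_0-02": [4, 5, 6],
        "AST_1-01": [7, 8, 9],
        "AST_1-02": [1, 2, 3],
        "AST_2-01": [4, 5, 6],
        "AST_2-02": [7, 8, 9],
    }
)

# Last two characters of each column name ("01" or "02").
suffix = df.columns.str[-2:].to_numpy()

# Transpose so the column names become row labels, then group those rows
# by suffix. Result: one row per suffix, one column per original row.
per_row_means = df.T.groupby(suffix).mean()

# Sum of the per-row means, as in the answer's last step.
summed = per_row_means.sum(axis=1)

# Average of the per-row means. With no missing values this equals the
# mean of all values in each group.
overall = per_row_means.mean(axis=1)

print(per_row_means)
print(summed)
print(overall)
```

Transposing per_row_means back (per_row_means.T) gives one column per suffix and one row per original row, which may be the more convenient shape if the group means are to be joined back onto the original data.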

Answer