
I’m trying to get a concatenated Pandas dataframe that is the result of the calculated mean of several columns

As mentioned above, I’m trying to get the mean of several columns and then concatenate the results into a new dataframe. I’m getting the following warning:

FutureWarning: Dropping of nuisance columns in DataFrame reductions (with 'numeric_only=None') is deprecated; in a future version this will raise TypeError. Select only valid columns before calling the reduction.
  summaryData['aver_51'] = summaryData[["5.1.2 Hello World Quiz",

Here is the code:

import pandas as pd

dataIn = pd.read_excel('IDT/IDT_A.xlsx')

dataExtract = dataIn.filter([
    "3.2.6 Programming with Karel Quiz",
    "5.1.2 Hello World Quiz",
    "5.1.4 Your Name and Hobby",
    "5.2.2 Variables Quiz",
    "5.2.4 Daily Activities",
    "5.3.2 User Input Quiz",
    "5.3.4 Dinner Plans",
    "5.4.2 Basic Math in JavaScript Quiz",
    "5.4.6 T-Shirt Shop",
    "5.4.7 Running Speed",
    "5.5.2 JavaScript Graphics Quiz",
    "5.5.8 Flag of the Netherlands",
    "5.5.9 Snowman",
    "5.6.2 Using RGB to Create Colors",
    "5.6.4 Exploring RGB",
    "5.6.5 Making Yellow",
    "5.6.6 Rainbow",
    "5.6.7 Create a Color Image!",
    "6.1.1 Ghost",
    "6.1.2 Fried Egg",
    "6.1.3 Draw Something",
    "6.1.4 JavaScript and Graphics Quiz"

], axis=1)

summaryData = dataExtract.copy()

summaryData['aver_51'] = summaryData[["5.1.2 Hello World Quiz",
                                      "5.1.4 Your Name and Hobby"]].mean(axis=1)

summaryData['aver_52'] = summaryData[["5.2.2 Variables Quiz",
                                      "5.2.4 Daily Activities"]].mean(axis=1)

summaryData['aver_53'] = summaryData[["5.3.2 User Input Quiz",
                                      "5.3.4 Dinner Plans"]].mean(axis=1)

summaryData['aver_54'] = summaryData[["5.4.2 Basic Math in JavaScript Quiz",
                                      "5.4.6 T-Shirt Shop",
                                      "5.4.7 Running Speed"]].mean(axis=1)

summaryData['aver_55'] = summaryData[["5.5.2 JavaScript Graphics Quiz",
                                      "5.5.8 Flag of the Netherlands",
                                      "5.5.9 Snowman"]].mean(axis=1)

summaryData['aver_56'] = summaryData[["5.6.2 Using RGB to Create Colors",
                                      "5.6.4 Exploring RGB",
                                      "5.6.5 Making Yellow",
                                      "5.6.6 Rainbow",
                                      "5.6.7 Create a Color Image!"]].mean(axis=1)

summaryData['aver_61'] = summaryData[["6.1.1 Ghost",
                                      "6.1.2 Fried Egg",
                                      "6.1.3 Draw Something",
                                      "6.1.4 JavaScript and Graphics Quiz"]].mean(axis=1)

finalData = pd.concat([summaryData['aver_53'], summaryData['aver_54'], summaryData['aver_55'],
                       summaryData['aver_56'], summaryData['aver_61']], axis=1)

finalData.to_excel('output/gradesOut.xlsx')


Answer

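“Nuisance columns” are simply columns that pandas can’t process in the current operation (here, mean), for example columns that contain strings. Until now pandas silently dropped such columns from the reduction; as the warning says, a future version will raise a TypeError instead. Before computing the mean, make sure every column you select is numeric by dropping or converting the columns/cells that contain strings.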
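One possible way to do that is sketched below. It is only an illustration: the small dataframe and the "excused" entry are made-up stand-ins for the spreadsheet, on the assumption that at least one grade column was read in as text. The columns are converted with pd.to_numeric(errors="coerce"), which turns anything that can't be parsed into NaN, and mean then skips those NaN values by default.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: the second column was read as
# text because one cell holds a non-numeric entry ("excused" is invented).
df = pd.DataFrame({
    "5.1.2 Hello World Quiz": [80, 90, 100],
    "5.1.4 Your Name and Hobby": ["70", "excused", "100"],
})

cols = ["5.1.2 Hello World Quiz", "5.1.4 Your Name and Hobby"]

# Show which columns are not numeric; object dtype usually means strings.
print(df[cols].dtypes)

# Convert each selected column to numbers; unparseable cells become NaN.
numeric = df[cols].apply(pd.to_numeric, errors="coerce")

# mean() ignores NaN by default, so a row with one bad cell still gets
# the average of its remaining valid scores.
df["aver_51"] = numeric.mean(axis=1)
print(df["aver_51"].tolist())  # [75.0, 90.0, 100.0]
```

Whether a non-numeric cell should count as missing, as zero, or exclude the row entirely depends on how the grades are meant to be scored, so adjust the conversion step accordingly (for example, add .fillna(0) after the conversion to count such cells as zero).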

User contributions licensed under: CC BY-SA