I have coded a Pearson correlation heatmap, but it's showing columns from my dataframe that I don't need.
Is there a way to specify which columns I'd like to include?
Thanks in advance.
sb.heatmap(df['POPDEN', 'RoadsArea', 'MedianIncome', 'MedianPrice', 'PropertyCount', 'AvPTAI2015', 'PTAL'].corr(), annot=True, fmt='.2f')

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-54-832fc3c86e3e> in <module>
----> 1 sb.heatmap(df['POPDEN', 'RoadsArea', 'MedianIncome', 'MedianPrice', 'PropertyCount', 'AvPTAI2015', 'PTAL'].corr(), annot=True, fmt='.2f')

TypeError: list indices must be integers or slices, not tuple

df.cov().round(3)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-79-34a86e96b161> in <module>
----> 1 df.cov().round(3)

TypeError: cov() missing 1 required positional argument: 'self'
Answer
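You can filter the dataframe down to the columns you want before calculating the correlation. Note the double square brackets: the inner pair is a list of column names, whereas single brackets with several names pass one tuple as the key.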
sb.heatmap(df[['POPDEN', 'RoadsArea', 'MedianIncome', 'MedianPrice', 'PropertyCount', 'AvPTAI2015', 'PTAL']].corr(), annot=True, fmt='.2f')
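
To see the column selection on its own, here is a minimal sketch. It assumes df is a pandas DataFrame; the sample values, the Unwanted column and the plot_heatmap helper are made up for illustration, and only the first three column names come from the question.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question, values are invented.
df = pd.DataFrame({
    "POPDEN": [10.0, 20.0, 30.0, 40.0],
    "RoadsArea": [1.0, 2.5, 2.0, 4.0],
    "MedianIncome": [30.0, 28.0, 35.0, 40.0],
    "Unwanted": [5.0, 3.0, 8.0, 1.0],  # a column we want to leave out
})

wanted = ["POPDEN", "RoadsArea", "MedianIncome"]

# Indexing with a list of names returns a DataFrame holding only those columns.
# DataFrame.corr() uses the Pearson method by default.
corr = df[wanted].corr()


def plot_heatmap(corr_matrix):
    """Hypothetical helper: draw the heatmap for an already filtered matrix."""
    import seaborn as sb  # imported here so the selection step runs without seaborn
    return sb.heatmap(corr_matrix, annot=True, fmt=".2f")
```

Calling plot_heatmap(corr) should then draw only the three selected columns. Keeping the names in a list such as wanted makes it easy to add or drop columns later.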