I have this function:

def highest_correlation(dataframe):
    corr_table = dataframe.corr().unstack()
    df_corrvalues = corr_table.sort_values(ascending=False)
    return df_corrvalues

correlation = highest_correlation(heart)
correlation
This is the output:

age      age        1.000000
sex      sex        1.000000
thall    thall      1.000000
caa      caa        1.000000
slp      slp        1.000000
                      ...
output   oldpeak   -0.429146
         exng      -0.435601
exng     output    -0.435601
slp      oldpeak   -0.576314
oldpeak  slp       -0.576314
Length: 196, dtype: float64
How can I return the highest correlation values that are lower than 1?
That is, I want to remove the 1s that appear at the top when I use sort_values(ascending=False).
Answer
Here is a MultiIndex Series, taken from the pandas User Guide:
import pandas as pd
from numpy.random import default_rng

rng = default_rng()
arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
s = pd.Series(rng.standard_normal(8), index=index)
print(s)
Filter for values less than one.
print(s[s<1])
first  second
bar    one       1.602675
       two      -0.197277
baz    one      -0.746729
       two       1.384208
foo    one       1.587294
       two      -1.616769
qux    one      -0.872030
       two      -0.721226
dtype: float64

first  second
bar    two      -0.197277
baz    one      -0.746729
foo    two      -1.616769
qux    one      -0.872030
       two      -0.721226
dtype: float64
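Applied to the function in the question, the same boolean filter might look like the sketch below. The `heart` DataFrame here is a hypothetical stand-in built from random numbers, since the original data is not shown. One caveat, offered as an assumption rather than something observed in the question's data: a self-correlation can come out a hair away from exactly 1.0 in floating point, so the sketch also shows an alternative that drops the pairs whose two index labels are the same column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `heart` DataFrame (random data).
rng = np.random.default_rng(0)
heart = pd.DataFrame(
    rng.standard_normal((50, 4)),
    columns=["age", "sex", "oldpeak", "slp"],
)

def highest_correlation(dataframe):
    corr_table = dataframe.corr().unstack()
    return corr_table.sort_values(ascending=False)

correlation = highest_correlation(heart)

# Option 1: the boolean filter from the answer, applied to the sorted Series.
below_one = correlation[correlation < 1]

# Option 2: drop self-correlations by comparing the two index levels,
# which does not depend on the diagonal being exactly 1.0.
first = correlation.index.get_level_values(0)
second = correlation.index.get_level_values(1)
off_diagonal = correlation[first != second]

print(off_diagonal.head())
```

Both results stay sorted in descending order because the filter is applied after `sort_values`. Each pair still appears twice, once as (a, b) and once as (b, a), just as in the original output.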