I have a correlation matrix (as a DataFrame) from which I extract a Series holding the top n correlated pairs of columns and their correlation values:
HCT    HGB            0.928873
ALT    AST            0.920744
MCH    MCV            0.861742
bpsys  bpdia          0.846069
HCT    RBC            0.769507
HGB    RBC            0.697879
       gender_Male    0.690716
CL     SODIUM         0.688227
LYM    WBC            0.672971
RBC    gender_Male    0.663275
HCT    gender_Male    0.660515
MCH    MCHC           0.571524
age    HGB            0.512578
HGB    MCHC           0.506935
age    gender_Male    0.493219
dtype: float64
See this for an example of what I mean. I take the resulting Series and convert it to a dictionary like so:
top_corrs = top_corrs.to_dict()
The resulting keys of this dictionary are tuples of the top n correlated variables, which I found by:
top_corrs.keys()
Resulting in 15 keys:
dict_keys([('HCT', 'HGB'), ('ALT', 'AST'), ('MCH', 'MCV'), ('bpsys', 'bpdia'), ('HCT', 'RBC'), ('HGB', 'RBC'), ('HGB', 'gender_Male'), ('CL', 'SODIUM'), ('LYM', 'WBC'), ('RBC', 'gender_Male'), ('HCT', 'gender_Male'), ('MCH', 'MCHC'), ('age', 'HGB'), ('HGB', 'MCHC'), ('age', 'gender_Male')])
Now, what I would like to do is go back to the original DataFrame from which I calculated the correlations and plot these pairs of columns against one another, looping through the dictionary keys.
Kind of like this:
key1 = ('HCT', 'HGB')
sns.lmplot(y='HCT', x='HGB', data=originaldata, hue=huevar, col=colvar, palette='Set1')
key2 = ('ALT', 'AST')
sns.lmplot(y='ALT', x='AST', data=originaldata, hue=huevar, col=colvar, palette='Set1')
In a sense, I want to “unpack” (I don’t know if I am using that word right) these tuples and plot their elements against each other.
Is this possible, or am I just trippin’?
Thanks in advance
Answer
Iterating over a dictionary in a for loop yields its keys, so you can loop over top_corrs directly and index into each tuple to get the two column names:
for key in top_corrs:
    sns.lmplot(y=key[0], x=key[1], data=originaldata, palette='Set1')
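And yes, "unpack" is the right word. As a minimal sketch, the loop can also unpack each tuple right in the for statement instead of indexing with key[0] and key[1]. The dictionary below holds only the first three entries from the question, and the names y_col, x_col, corr, and pairs are just illustrative:

```python
# First three entries of the dictionary from the question
top_corrs = {
    ('HCT', 'HGB'): 0.928873,
    ('ALT', 'AST'): 0.920744,
    ('MCH', 'MCV'): 0.861742,
}

pairs = []
# .items() yields (key, value); the key tuple is unpacked in the loop header
for (y_col, x_col), corr in top_corrs.items():
    pairs.append((y_col, x_col))
    print(f"plot {y_col} against {x_col} (r = {corr:.2f})")
```

Inside that loop, the plotting call from the question would then presumably become sns.lmplot(y=y_col, x=x_col, data=originaldata, hue=huevar, col=colvar, palette='Set1'), assuming originaldata, huevar, and colvar are defined as in the question.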