I am having trouble sorting my features by their value. I would like the bars in my plot to get progressively shorter along the y-axis. Unfortunately, my barplot looks like this, with the features sorted alphabetically instead:
Right now I am running the following code:
    unsorted_list = [(importance, feature) for feature, importance in zip(features, importances)]
    sorted_list = sorted(unsorted_list)

    features_sorted = []
    importance_sorted = []

    for i in sorted_list:
        features_sorted += [i[1]]
        importance_sorted += [i[0]]

    plt.title("Feature importance", fontsize=15)
    plt.xlabel("Importance", fontsize=13)

    plt.barh(features_sorted, importance_sorted, color="green", edgecolor='green')

    # plt.savefig('importance_barh.png', dpi=100)
Here is the data going through it:
    unsorted_list = [('HR', 0.28804817462980353),
                     ('BR', 0.04062328177704225),
                     ('Posture', 0.09011618483921582),
                     ('Activity', 0.0017821837085763366),
                     ('PeakAccel', 0.002649111136700579),
                     ('HRV', 0.13598729040097057),
                     ('ROGState', 0.014534726412631642),
                     ('ROGTime', 0.22986192060475388),
                     ('VerticalMin', 0.016099772399198357),
                     ('VerticalPeak', 0.012697214182994502),
                     ('LateralMin', 0.029479112475744584),
                     ('LateralPeak', 0.022745210003295983),
                     ('SagittalMin', 0.08653071485979484),
                     ('SagittalPeak', 0.028845102569277088)]

    sorted_list = [(0.0017821837085763366, 'Activity'),
                   (0.002649111136700579, 'PeakAccel'),
                   (0.012697214182994502, 'VerticalPeak'),
                   (0.014534726412631642, 'ROGState'),
                   (0.016099772399198357, 'VerticalMin'),
                   (0.022745210003295983, 'LateralPeak'),
                   (0.028845102569277088, 'SagittalPeak'),
                   (0.029479112475744584, 'LateralMin'),
                   (0.04062328177704225, 'BR'),
                   (0.08653071485979484, 'SagittalMin'),
                   (0.09011618483921582, 'Posture'),
                   (0.13598729040097057, 'HRV'),
                   (0.22986192060475388, 'ROGTime'),
                   (0.28804817462980353, 'HR')]
I recently upgraded to matplotlib 3.0.2.
Answer
EDIT (based on the comments)
Your code works fine on matplotlib 2.2.2, and the issue seems to come from your list naming and some confusion between the lists. It will work as expected on 3.0.2. Nevertheless, you might be interested in a workaround: pass numeric positions to barh and set the feature names as the y tick labels.
    features_sorted = []
    importance_sorted = []

    for i in sorted_list:
        features_sorted += [i[1]]
        importance_sorted += [i[0]]

    plt.title("Feature importance", fontsize=15)
    plt.xlabel("Importance", fontsize=13)

    plt.barh(range(len(importance_sorted)), importance_sorted, color="green", edgecolor='green')
    plt.yticks(range(len(importance_sorted)), features_sorted);
Alternative suggested by @tmdavison, which sets the labels through the tick_label argument instead of a separate plt.yticks call:
    plt.barh(range(len(importance_sorted)), importance_sorted,
             color="green", edgecolor='green', tick_label=features_sorted)
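For anyone who wants to try this without the rest of the question's setup, here is a minimal self-contained sketch of the same idea. It makes a few assumptions that are not in the original post: it uses only five of the question's features to keep it short, it selects the non-interactive Agg backend so it runs without a display, it sorts with a key function instead of swapping the tuple order, and it uses the object-oriented `fig, ax` interface rather than `plt.*` calls.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# A subset of the (feature, importance) pairs from the question.
pairs = [
    ("HR", 0.28804817462980353),
    ("BR", 0.04062328177704225),
    ("Posture", 0.09011618483921582),
    ("HRV", 0.13598729040097057),
    ("ROGTime", 0.22986192060475388),
]

# Sort ascending by importance. barh draws position 0 at the bottom,
# so the longest bar ends up at the top; pass reverse=True to flip that.
pairs_sorted = sorted(pairs, key=lambda pair: pair[1])
features_sorted = [feature for feature, _ in pairs_sorted]
importance_sorted = [importance for _, importance in pairs_sorted]

# Numeric y positions fix the bar order regardless of how the installed
# matplotlib version orders string categories.
positions = range(len(importance_sorted))

fig, ax = plt.subplots()
ax.barh(positions, importance_sorted, color="green", edgecolor="green",
        tick_label=features_sorted)
ax.set_title("Feature importance", fontsize=15)
ax.set_xlabel("Importance", fontsize=13)
```

Because the bar positions are plain integers, the order on the axis is decided entirely by the sort, and the feature names are attached only as labels.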