Sklearn PCA explained variance and explained variance ratio difference

Question

I'm trying to get the variances from the eigenvectors. What is the difference between explained_variance_ratio_ and explained_variance_ in PCA?

Accepted Answer

The percentage of the variance explained by each component is: explained_variance_ratio_

The variance itself, i.e. the eigenvalues of the covariance matrix, is: explained_variance_

Formula:

    explained_variance_ratio_ = explained_variance_ / np.sum(explained_variance_)

This holds when all components are kept, as in the example below. With fewer components, scikit-learn divides by the total variance of the data rather than by the sum of the retained eigenvalues.

Example:

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
    pca = PCA(n_components=2)
    pca.fit(X)

    pca.explained_variance_  # the actual eigenvalues (variance)
    # array([7.93954312, 0.06045688])

    pca.explained_variance_ratio_  # the percentage of the variance
    # array([0.99244289, 0.00755711])

Also based on the above formula:

    7.93954312 / (7.93954312 + 0.06045688) = 0.99244289

From the documentation:

    explained_variance_ : array, shape (n_components,)
        The amount of variance explained by each of the selected components.
        Equal to n_components largest eigenvalues of the covariance matrix of X.
        New in version 0.18.

    explained_variance_ratio_ : array, shape (n_components,)
        Percentage of variance explained by each of the selected components.
        If n_components is not set then all components are stored and the sum of the ratios is equal to 1.0.
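The relationship between the two attributes can also be checked numerically. The following is a minimal sketch, not part of the original answer: it reuses the same X, compares explained_variance_ against the eigenvalues of the sample covariance matrix from np.cov, and then refits with a single component to show that the ratio is taken against the total variance of the data. The variable names (eigvals, ratio_from_formula, pca1, total_var, ratio_one) are illustrative only.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])

# Keep all components (X has two features).
pca = PCA(n_components=2).fit(X)

# Eigenvalues of the sample covariance matrix, sorted largest first.
# np.cov normalizes by n - 1 by default, matching explained_variance_.
eigvals = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]

# With all components kept, the ratio is each eigenvalue over their sum.
ratio_from_formula = pca.explained_variance_ / np.sum(pca.explained_variance_)

# Keep only the first component: the ratio is still measured against the
# total variance of the data, so it no longer sums to 1.
pca1 = PCA(n_components=1).fit(X)
total_var = np.var(X, axis=0, ddof=1).sum()
ratio_one = pca1.explained_variance_ / total_var

print(np.allclose(eigvals, pca.explained_variance_))
print(np.allclose(ratio_from_formula, pca.explained_variance_ratio_))
print(np.allclose(ratio_one, pca1.explained_variance_ratio_))
```

If all three checks print True, explained_variance_ holds the eigenvalues and explained_variance_ratio_ is each eigenvalue divided by the total variance, which equals the sum of explained_variance_ only when no components are dropped.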

Answer