Skip to content
Advertisement

Principal Component Analysis (PCA) in Python

I have a (26424 x 144) array and I want to perform PCA over it using Python. However, there is no particular place on the web that explains about how to achieve this task (There are some sites which just do PCA according to their own – there is no generalized way of doing so that I can find). Anybody with any sort of help will do great.

Advertisement

Answer

You can find a PCA function in the matplotlib module:

import numpy as np
from matplotlib.mlab import PCA

data = np.array(np.random.randint(10,size=(10,3)))
results = PCA(data)

results will store the various parameters of the PCA. It is from the mlab part of matplotlib, which is the compatibility layer with the MATLAB syntax

EDIT: on the blog nextgenetics I found a wonderful demonstration of how to perform and display a PCA with the matplotlib mlab module, have fun and check that blog!

Advertisement