
Represent a video as a 2D Array where each column represents a frame – OpenCV and Python

My goal is to transform a video into a 2D matrix X in which each column vector represents one frame. The matrix therefore has the dimensions X.shape -> (number of features per frame, total number of frames).

I need this form because I want to apply different ML algorithms to X. To get X, I proceed as follows:

  1. Load the video in Python with the OpenCV library and save all frames.

  2. Loop{

JavaScript

For step 2 b) I use

JavaScript

This operation takes about 0.5 s for a 640×320 frame. For a short 3-minute video (8000 frames), computing X takes almost 150 minutes. Is there a way to make it faster?

Code for the first part:

JavaScript

And the second part, which is too slow:

JavaScript


Answer

Do not repeatedly append single frames to your accumulated data. That costs O(n^2) overall, i.e. the program runs ever slower the more it has to read. NumPy can't enlarge arrays in place; it has to create a copy every time, so the copying effort increases with every additional frame.
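To make the copying cost concrete, here is a minimal sketch. It assumes the accumulation is done with something like `np.append` (the question's own code is not shown here, so this is an illustration, not a reproduction of it). It shows that `np.append` returns a new array instead of growing the old one, and counts how many columns get written in total for 8000 frames.

```python
import numpy as np

# np.append never grows an array in place: it allocates a new one
# and copies the old contents plus the new column into it.
a = np.zeros((4, 1), dtype=np.uint8)
b = np.append(a, np.ones((4, 1), dtype=np.uint8), axis=1)

print(b.shape)                 # (4, 2)
print(np.shares_memory(a, b))  # False: b is a separate copy
print(a.shape)                 # (4, 1), the original is unchanged

# If the k-th append produces an array with k columns, then building
# n columns one by one writes 1 + 2 + ... + n = n * (n + 1) / 2 columns.
n = 8000
columns_written = n * (n + 1) // 2
print(columns_written)         # 32004000, versus 8000 for a single pass
```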

Instead, append each frame to a Python list. When you’re done reading the video, convert the whole list into a NumPy array once.
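A minimal sketch of that approach follows. The helper names `read_frames` and `frames_to_matrix` are hypothetical, and the flattening of each frame into one column is an assumption based on the shape described in the question. The demo at the bottom uses synthetic frames so it runs without a video file; with a real video, the frames would come from `read_frames`.

```python
import numpy as np


def read_frames(path):
    """Yield frames from a video file one by one (needs opencv-python)."""
    import cv2  # imported lazily so the demo below runs without OpenCV

    cap = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of stream or read error
                break
            yield frame
    finally:
        cap.release()


def frames_to_matrix(frames):
    """Return a 2D array whose columns are the flattened frames."""
    columns = []
    for frame in frames:
        # Appending to a Python list is cheap; no existing data is copied.
        columns.append(np.asarray(frame).reshape(-1))
    # Single conversion at the end: shape (n_features, n_frames).
    return np.stack(columns, axis=1)


# Synthetic stand-in for a video: 5 grayscale frames of 4x6 pixels,
# where frame i is filled with the value i.
fake_frames = [np.full((4, 6), i, dtype=np.uint8) for i in range(5)]
X = frames_to_matrix(fake_frames)
print(X.shape)  # (24, 5)
```

For a real file, the call would look like `X = frames_to_matrix(read_frames("video.mp4"))`, where the file name is only a placeholder. If the frame count and size are known in advance, preallocating X and assigning into its columns would avoid the intermediate list, at the cost of slightly more bookkeeping.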
