My goal is to transform a video into a 2D matrix X in which each column vector represents one frame, so X.shape is (number of features per frame, total number of frames).
I need this form because I want to apply different ML algorithms to X. To get X, I proceed as follows:
1. Load the video in Python with the OpenCV library and save all frames.
2. Loop over the saved frames:
   a) Convert the frame (a 3D array with dimensions height, width, depth = 3 for RGB) into a 1D vector x.
   b) Append the vector x to the matrix X.
For step 2 b) I use
video_matrix = np.column_stack((video_matrix, frame_vector))
This operation takes about 0.5 s for a 640×320 frame. For a short 3-minute video (8000 frames), computing X takes almost 150 minutes. Is there a way to make it faster?
Code for the first part:
import os
import cv2

video = cv2.VideoCapture('path/video.mp4')
if not os.path.exists('data'):
    os.makedirs('data')

counter = 0
while True:
    # read the next frame
    ret, frame = video.read()
    if ret:
        # frames are still left: keep creating images
        name = './data/frame' + str(counter) + '.jpg'
        # print('Creating...' + name)
        # write the extracted image
        cv2.imwrite(name, frame)
        # count how many frames have been created
        counter += 1
    else:
        break

# release all space and windows once done
video.release()
cv2.destroyAllWindows()
And the second part, which is too slow:
import numpy as np
from PIL import Image

# initialize a 1D array that will become the 2D array; this first column is deleted at the end
video_matrix = np.zeros(width * height * 3)
for i in range(counter):  # loop over all frames
    current_frame = np.asarray(Image.open('./data/frame' + str(i) + '.jpg'))  # 3D array = current frame
    frame_vector = image_to_vector(current_frame)  # convert the frame into a 1D array
    video_matrix = np.column_stack((video_matrix, frame_vector))  # append frame x to the matrix X that represents the video
video_matrix = np.delete(video_matrix, 0, 1)  # delete the initial zero column
Answer
Do not repeatedly append single frames to your accumulated data. That costs O(n^2) in total, i.e. the program runs ever slower the more it has to read. NumPy can't enlarge arrays in place; it has to create a copy every time, and the copying effort increases with every additional frame.
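To put a rough number on the quadratic cost described above, here is a small back-of-the-envelope sketch. It counts how many frame-sized columns get written in total. The function names are made up for illustration, and the count ignores the placeholder zero column from the question's code.

```python
def columns_copied_by_repeated_append(n_frames):
    """Appending frame k builds a new array with k columns (k - 1 old + 1 new)."""
    return sum(range(1, n_frames + 1))


def columns_copied_by_single_conversion(n_frames):
    """Converting a list of frames once writes each column exactly one time."""
    return n_frames


n = 8000  # frame count from the question
repeated = columns_copied_by_repeated_append(n)
single = columns_copied_by_single_conversion(n)
print(repeated)  # 32004000
print(single)    # 8000
```

Under these assumptions the repeated append writes about 4000 times as many columns as a single conversion for the 8000-frame video.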
Instead, append each frame to a Python list. When you're done reading the video, convert the whole list into a NumPy array once.
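A minimal sketch of the list-based approach, assuming all frames have the same shape. The names `frames_to_matrix` and `read_frames` are hypothetical helpers, not part of the question's code, and the flattening via `reshape(-1)` stands in for the asker's `image_to_vector`, whose implementation is not shown. The demo uses small synthetic frames so it runs without a video file.

```python
import numpy as np


def frames_to_matrix(frames):
    """Return X with shape (n_features, n_frames) from equally sized frames."""
    # Cheap list appends: one flattened 1D vector per frame.
    columns = [np.asarray(frame).reshape(-1) for frame in frames]
    # One allocation and one copy for the whole video.
    return np.stack(columns, axis=1)


def read_frames(path):
    """Yield frames straight from a video file (assumes OpenCV is installed)."""
    import cv2  # imported here so the demo below runs without OpenCV

    video = cv2.VideoCapture(path)
    try:
        while True:
            ret, frame = video.read()
            if not ret:  # no frames left
                break
            yield frame
    finally:
        video.release()


# Demo with synthetic 4x6 RGB frames instead of a real video.
rng = np.random.default_rng(0)
demo_frames = [
    rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8) for _ in range(5)
]
X = frames_to_matrix(demo_frames)
print(X.shape)  # (72, 5)
```

With a real file, something like `X = frames_to_matrix(read_frames('path/video.mp4'))` would also skip the intermediate JPEG files. Note that OpenCV returns channels in BGR order, and that 8000 frames of 640×320×3 `uint8` values need about 4.9 GB, so the final matrix has to fit in memory.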