
How can I extract numbers from video frames using Tesseract OCR?

I want to extract numbers from a set of standardized videos (always HD resolution, 1920×1080 at 30 FPS). The numbers always appear in fixed sections of the screen and are never missing.

My approach would be to:

  1. Save the video frame by frame as PNGs
  2. Load a single PNG frame
  3. Select the areas of interest (there are four sections I want to
    extract numbers from; each section might need its own image manipulation, but each is always in the exact same pixel range)
  4. Extract the numbers using Python and Tesseract OCR
  5. Store the values in a data frame
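The steps above can be sketched roughly as follows. This is a minimal sketch under several assumptions, not a tested solution for these videos: the region names and pixel boxes in `REGIONS` are hypothetical placeholders, the Tesseract options in `TESS_CONFIG` are one plausible choice for a single line of digits, and frames are read directly with OpenCV's `cv2.VideoCapture` instead of being saved as PNGs first. It assumes the `opencv-python`, `pytesseract`, and `pandas` packages are installed.

```python
import re

FPS = 30  # frame rate stated in the question

# Hypothetical (x, y, width, height) boxes; replace with the real pixel ranges.
REGIONS = {
    "section_1": (100, 100, 300, 120),
    "section_2": (1500, 100, 300, 120),
}

# Assumed Tesseract options: treat the crop as one text line, digits only.
TESS_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789"


def frame_to_seconds(index, fps=FPS):
    """Convert a zero-based frame index to a timestamp in seconds."""
    return index / fps


def parse_number(ocr_text):
    """Return the first run of digits in the OCR output as an int, or None."""
    match = re.search(r"\d+", ocr_text)
    return int(match.group()) if match else None


def extract_numbers(video_path, step=1):
    """OCR every `step`-th frame and return one data frame row per frame."""
    # Imported here so the helpers above work without these packages.
    import cv2
    import pandas as pd
    import pytesseract

    capture = cv2.VideoCapture(video_path)
    rows = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of video or read error
            break
        if index % step == 0:
            row = {"frame": index, "seconds": frame_to_seconds(index)}
            for name, (x, y, w, h) in REGIONS.items():
                roi = frame[y:y + h, x:x + w]  # rows are y, columns are x
                gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
                text = pytesseract.image_to_string(gray, config=TESS_CONFIG)
                row[name] = parse_number(text)
            rows.append(row)
        index += 1
    capture.release()
    return pd.DataFrame(rows)
```

For example, `extract_numbers("video.mp4", step=30)` (with a hypothetical file name) would sample one frame per second. Each section may still need its own preprocessing before OCR, as noted in step 3.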

Examples of two of the sections are:

enter image description here

enter image description here

I have installed Python (I’m an R user) and Tesseract, and I can run the Tesseract examples without problems, so I have confirmed that my setup works.

However, when I run the following commands on the top image [247], Tesseract is not able to extract the number, even though the text is very clear and should be easy to read.

JavaScript

The output is:

JavaScript


Answer

You can adapt the following Python code:

JavaScript

The getText() function takes the path of a PNG image file. It converts the image to the HSV color space, takes the value channel as v, and applies a Gaussian blur before thresholding. You may need to vary the kernel size passed to the dilate function to suit your images. Both example images were given as input to the code above; the output is shown below.
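A minimal sketch of the steps described above, not the answer's exact code: the function name `get_text_sketch`, the helper `make_odd`, the default kernel sizes, the use of Otsu's method for the threshold, and the `--psm 7` option are all assumptions chosen for illustration. It assumes `opencv-python`, `numpy`, and `pytesseract` are installed.

```python
def make_odd(size):
    """Return `size` rounded up to an odd integer >= 1 (GaussianBlur needs odd sizes)."""
    size = max(1, int(size))
    return size if size % 2 else size + 1


def get_text_sketch(path, blur_size=5, dilate_size=3):
    """Read a PNG, threshold its HSV value channel, dilate, and run Tesseract."""
    # Imported here so make_odd works without these packages.
    import cv2
    import numpy as np
    import pytesseract

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(path)

    # Convert to HSV and keep only the value (brightness) channel.
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    _, _, v = cv2.split(hsv)

    # Smooth before thresholding to suppress compression noise.
    k = make_odd(blur_size)
    blurred = cv2.GaussianBlur(v, (k, k), 0)

    # Assumption: Otsu picks the threshold; swap THRESH_BINARY for
    # THRESH_BINARY_INV if the digits come out white on black.
    _, thresh = cv2.threshold(
        blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
    )

    # Dilation grows the white regions; tune dilate_size per section.
    kernel = np.ones((dilate_size, dilate_size), np.uint8)
    dilated = cv2.dilate(thresh, kernel, iterations=1)

    # Assumption: each crop holds a single line of text.
    return pytesseract.image_to_string(dilated, config="--psm 7").strip()
```

Writing the intermediate `thresh` and `dilated` images to disk with `cv2.imwrite` is a quick way to check whether the digits survive preprocessing before blaming Tesseract.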

Output

JavaScript

Thresholding results

WYOtF.png

enter image description here enter image description here

0Oqfr.png

enter image description here enter image description here

User contributions licensed under: CC BY-SA