I am using Python to do some basic image processing and want to extend it to process a video frame by frame.
I get the video as a blob from a server, .webm encoded, and have it in Python as a byte string (b'\x1aE\xdf\xa3\xa3B\x86\x81\x01B\xf7\x81\x01B\xf2\x81\x04B\xf3\x81\x08B\x82\x88matroskaB\x87\x81\x04B\x85\x81\x02\x18S\x80g\x01\xff\xff\xff\xff\xff\xff\xff\x15I\xa9f\x99*\xd7\xb1\x83\x0fB@M\x80\x86ChromeWA\x86Chrome\x16T\xaek\xad\xae\xab\xd7\x81\x01s\xc5\x87\x04\xe8\xfc\x16t^\x8c\x83\x81\x01\x86\x8fV_MPEG4/ISO/AVC\xe0\x88\xb0\x82\x02\x80\xba\x82\x01\xe0\x1fC\xb6u\x01\xff\xff\xff\xff\xff\xff ...).
I know that there is cv.VideoCapture, which can do almost what I need. The problem is that I would have to first write the file to disk and then load it again. It seems much cleaner to wrap the string in, e.g., an IOStream and feed it to some function that does the decoding.
Is there a clean way to do this in Python, or is writing to disk and loading it again the way to go?
Answer
Two years after Rotem wrote his answer there is now a cleaner / easier way to do this using ImageIO.
Note: Assuming ffmpeg is in your path, you can generate a test video to try this example using:

    ffmpeg -f lavfi -i testsrc=duration=10:size=1280x720:rate=30 testsrc.webm

    import imageio.v3 as iio
    from pathlib import Path

    webm_bytes = Path("testsrc.webm").read_bytes()

    # read all frames from the bytes string
    frames = iio.imread(webm_bytes, index=None, format_hint=".webm")
    frames.shape
    # Output:
    # (300, 720, 1280, 3)

    # iterate over frames one by one
    for frame in iio.imiter(webm_bytes, format_hint=".webm"):
        print(frame.shape)

    # Output:
    # (720, 1280, 3)
    # (720, 1280, 3)
    # (720, 1280, 3)
    # ...

To use this you’ll need the ffmpeg backend (which implements a solution similar to what Rotem proposed):

    pip install imageio[ffmpeg]

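Since a byte string carries no file name, it can be worth checking that the blob really is a Matroska/WebM container before handing it to a decoder. The following is a minimal sketch using only the standard library; the helper name `ebml_doctype` is hypothetical, and it assumes the DocType string appears within the first 64 bytes, which holds for the header shown in the question but is not guaranteed for every file.

```python
# First four bytes of every EBML (Matroska/WebM) file.
EBML_MAGIC = b"\x1a\x45\xdf\xa3"


def ebml_doctype(data: bytes):
    """Return 'webm' or 'matroska' if data looks like an EBML container, else None.

    Hypothetical helper: it only scans the first 64 bytes for the DocType
    string (an assumption) and does not parse the EBML structure.
    """
    if not data.startswith(EBML_MAGIC):
        return None
    header = data[:64]
    if b"webm" in header:
        return "webm"
    if b"matroska" in header:
        return "matroska"
    return None


# The start of the byte string from the question reports 'matroska',
# even though the blob is described as .webm.
sample = (
    b"\x1aE\xdf\xa3\xa3B\x86\x81\x01B\xf7\x81\x01B\xf2\x81\x04"
    b"B\xf3\x81\x08B\x82\x88matroska"
)
print(ebml_doctype(sample))
```

A `None` result would suggest the server returned something other than a video container (for example an error page), which is easier to diagnose here than inside the decoder.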
In response to Rotem’s comment a bit of explanation:
The above snippet uses imageio==2.16.0. The v3 API is an upcoming user-facing API that streamlines reading and writing. The API has been available since imageio==2.10.0; however, on versions older than 2.16.0 you will have to use import imageio as iio and call iio.v3.imiter and iio.v3.imread.
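To make the version thresholds concrete, here is a small sketch that maps an ImageIO version string to the import style described above. The function names `parse_version` and `v3_usage` are hypothetical, and the thresholds are simply the ones stated in this answer (2.10.0 and 2.16.0), not something queried from the library.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.16.0' into (2, 16, 0), padding short versions with zeros."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def v3_usage(version: str) -> str:
    """Return the import style this answer recommends for a given version."""
    v = parse_version(version)
    if v >= (2, 16, 0):
        return "import imageio.v3 as iio"
    if v >= (2, 10, 0):
        return "import imageio as iio  # then call iio.v3.imread / iio.v3.imiter"
    return "v2 API only"


for candidate in ("2.9.0", "2.10.0", "2.16.0"):
    print(candidate, "->", v3_usage(candidate))
```

The installed version can be looked up with `importlib.metadata.version("imageio")` and passed to `v3_usage`.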
The ability to read video bytes has existed forever (>5 years and counting) but has (as I am just now realizing) never been documented directly … so I will add a PR for that soon™ :)
On older versions (tested on v2.9.0) of ImageIO (v2 API) you can still read video byte strings; however, this is slightly more verbose:

    import imageio as iio
    import numpy as np
    from pathlib import Path

    webm_bytes = Path("testsrc.webm").read_bytes()

    # read all frames from the bytes string
    frames = np.stack(iio.mimread(webm_bytes, format="FFMPEG", memtest=False))

    # iterate over frames one by one
    reader = iio.get_reader(webm_bytes, format="FFMPEG")
    for frame in reader:
        print(frame.shape)
    reader.close()
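
If you do want to stay with cv.VideoCapture, the write-to-disk route from the question can at least be kept tidy with a temporary file that cleans up after itself. This is only a sketch using the standard library; the helper name `bytes_as_tempfile` is hypothetical, and the OpenCV part is shown in comments because it assumes opencv-python is installed.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def bytes_as_tempfile(data: bytes, suffix: str = ".webm"):
    """Write data to a temporary file, yield its path, and delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path
    finally:
        os.remove(path)


# Hypothetical usage with OpenCV (assumes opencv-python is installed):
#
# import cv2
# with bytes_as_tempfile(webm_bytes) as path:
#     capture = cv2.VideoCapture(path)
#     ok, frame = capture.read()
#     while ok:
#         ...  # process the frame here
#         ok, frame = capture.read()
#     capture.release()
```

Releasing the capture before leaving the `with` block matters, since the file is removed on exit.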