How to generate accurate masks for an image from Mask R-CNN prediction in PyTorch?

I have trained a Mask R-CNN network for instance segmentation of apples. I am able to load the weights and generate predictions for my test images. The masks being generated seem to be in the correct locations, but the masks themselves have no real form; each one just looks like a scattered cluster of pixels.

Training was done on the dataset from this paper, and here is the GitHub link to the code used to train the network and generate the weights.

The code for prediction is as follows (I have omitted the parts where I create the path variables and assign the paths):

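A minimal sketch of this kind of inference code, assuming a torchvision maskrcnn_resnet50_fpn model with two classes (background and apple) and placeholder file names for the weights and the test image:

import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.transforms import functional as F
from PIL import Image

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Two classes: background + apple (assumed)
model = maskrcnn_resnet50_fpn(num_classes=2)
model.load_state_dict(torch.load('maskrcnn_apples.pth', map_location=device))  # placeholder weight file
model.to(device)
model.eval()

img = Image.open('test_apple.png').convert('RGB')  # placeholder test image
img_tensor = F.to_tensor(img).to(device)

with torch.no_grad():
    prediction = model([img_tensor])  # list with one dict per input image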

Here is an Imgur link to the original image; below is the predicted mask for one of the instances.

Mask output for one instance

Also, could you please help me understand the data structure of the generated prediction shown below? How do I access the masks so that I can generate a single image with all of the masks displayed?

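The structure of the prediction can be inspected like this (continuing from the snippet above, with N the number of detected instances):

pred = prediction[0]            # dict for the first (and only) input image
for key, value in pred.items():
    print(key, tuple(value.shape), value.dtype)
# boxes  -> (N, 4)        float32
# labels -> (N,)          int64
# scores -> (N,)          float32
# masks  -> (N, 1, H, W)  soft masks with values in the 0-1 range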


Answer

The prediction from the Mask R-CNN has the following structure:

During inference, the model requires only the input tensors, and returns the post-processed predictions as a List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows:

- boxes (FloatTensor[N, 4]): the predicted boxes in [x1, y1, x2, y2] format
- labels (Int64Tensor[N]): the predicted labels for each instance
- scores (Tensor[N]): the scores for each prediction
- masks (UInt8Tensor[N, 1, H, W]): the predicted masks for each instance, in 0-1 range. To obtain the final segmentation masks, the soft masks can be thresholded, generally with a value of 0.5 (mask >= 0.5)

You can use OpenCV’s findContours and drawContours functions to draw masks as follows:

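A sketch of that approach, reusing the prediction and image from the question's snippet above; the 0.5 score threshold and the green contour colour are arbitrary choices:

import cv2
import numpy as np

image = cv2.imread('test_apple.png')          # same placeholder image as above (BGR)
pred = prediction[0]

result = image.copy()
for i in range(len(pred['masks'])):
    if pred['scores'][i] < 0.5:               # skip low-confidence detections
        continue
    # soft mask of shape [1, H, W] -> binary uint8 mask of shape [H, W]
    mask = pred['masks'][i, 0].cpu().numpy()
    mask = (mask >= 0.5).astype(np.uint8) * 255

    # OpenCV >= 4 returns (contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cv2.drawContours(result, contours, -1, (0, 255, 0), 2)  # draw all contours in green

cv2.imwrite('all_masks.png', result)          # single image with every mask outlined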

Sample output:

sample output
