## PyTorch-mask-x-rcnn

PyTorch implementation of the Mask-X-RCNN network proposed in the 'Learning to Segment Every Thing' paper by Facebook AI Research.

The paper addresses instance segmentation when a large dataset has only bounding-box annotations and a small dataset has both bounding-box and segmentation ground truths. It follows the semi-supervised learning paradigm (the paper calls this setting partially supervised). The base architecture is the same as that of Mask-RCNN.

## Model Architecture

- The pipeline is as shown in the Figure. For a little more explanation, check out this blog post (last section).
- Backpropagating both losses will induce a discrepancy in the weights of `w_seg`: for classes common to COCO and Visual Genome (VG) there are two losses (bbox and mask), while for the remaining classes there is only one (bbox). There is a fix for this.
  - Fix: When backpropagating the mask loss, compute the gradient of the predicted mask weights (`w_seg`) with respect to the **weight transfer function** parameters $\theta$ but not the bounding box weights $w_{det}^c$.
  - $w_{seg}^c = \tau(\text{stop\_grad}(w_{det}^c); \theta)$, where `tau` is the transfer function.
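The stop-gradient fix described above can be sketched in PyTorch with `Tensor.detach()`. This is only a minimal illustration under stated assumptions, not the code in this repository: the `WeightTransfer` class, the two-layer MLP with LeakyReLU, the layer sizes, and the stand-in loss are all hypothetical.

```python
import torch
import torch.nn as nn


class WeightTransfer(nn.Module):
    """Hypothetical transfer function tau: maps per-class w_det to w_seg."""

    def __init__(self, det_dim, seg_dim, hidden_dim=64, stop_grad=True):
        super().__init__()
        self.stop_grad = stop_grad
        # Assumed architecture: a small MLP whose parameters play the role of theta.
        self.mlp = nn.Sequential(
            nn.Linear(det_dim, hidden_dim),
            nn.LeakyReLU(),
            nn.Linear(hidden_dim, seg_dim),
        )

    def forward(self, w_det):
        # w_det: (num_classes, det_dim)
        if self.stop_grad:
            # Cut the graph here so the mask loss cannot reach w_det.
            w_det = w_det.detach()
        return self.mlp(w_det)


torch.manual_seed(0)
num_classes, det_dim, seg_dim = 5, 16, 8
w_det = nn.Parameter(torch.randn(num_classes, det_dim))
tau = WeightTransfer(det_dim, seg_dim)

w_seg = tau(w_det)               # (num_classes, seg_dim)
mask_loss = w_seg.pow(2).mean()  # stand-in for the real mask loss
mask_loss.backward()

# theta receives gradients from the mask loss; w_det does not.
theta_has_grad = all(p.grad is not None for p in tau.parameters())
w_det_has_grad = w_det.grad is not None
```

With `stop_grad=True`, only the transfer function parameters are updated by the mask loss, so the detection weights are trained by the bbox loss alone for every class.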

## Implementation Details

- The model is based on the Mask-RCNN implementation from here. Thanks to the author and to the original Keras version on which it is based! Integrate it with the pipeline from the repo to train the network!
- Modules added
  - `transfer_function` in `fpn_classifier_graph`
  - `cls`, `box`, `cls+box` choices for the detection weights in `fpn_classifier_graph`
  - `class-agnostic` (baseline) and `transfer` (above diagram) modes for the Mask branch as explained in the paper.
  - Optional `MLP fusion` (class agnostic MLP) as explained in Section 3.4 of the paper.
  - `stop_grad` for backpropagating the mask loss (keeping `w_det` out of the gradient calculation)
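As a rough illustration of the options listed above, the sketch below shows one way the `cls`, `box`, and `cls+box` detection weights could be assembled and how the `transfer` and `class-agnostic` mask modes differ in output shape. The helper names `detection_weights` and `mask_logits`, the layer sizes, and the single linear layer standing in for the transfer function are assumptions made for this example. It also assumes that the box regression layer stores its four rows per class contiguously.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def detection_weights(cls_layer, box_layer, num_classes, mode="cls+box"):
    """Return per-class detection weights of shape (num_classes, D)."""
    w_cls = cls_layer.weight                         # (num_classes, feat_dim)
    # Assumes rows are grouped per class: 4 box rows for class 0, then class 1, ...
    w_box = box_layer.weight.view(num_classes, -1)   # (num_classes, 4 * feat_dim)
    if mode == "cls":
        return w_cls
    if mode == "box":
        return w_box
    if mode == "cls+box":
        return torch.cat([w_cls, w_box], dim=1)
    raise ValueError(f"unknown detection weight mode: {mode!r}")


def mask_logits(features, mode, w_seg=None, agnostic_conv=None):
    """features: (N, C, M, M) -> (N, K, M, M) for transfer, (N, 1, M, M) otherwise."""
    if mode == "transfer":
        # Each row of w_seg (K, C) is used as a 1x1 conv kernel for one class.
        return F.conv2d(features, w_seg[:, :, None, None])
    if mode == "class-agnostic":
        return agnostic_conv(features)
    raise ValueError(f"unknown mask mode: {mode!r}")


torch.manual_seed(0)
feat_dim, mask_channels, num_classes = 32, 8, 6
cls_layer = nn.Linear(feat_dim, num_classes)
box_layer = nn.Linear(feat_dim, num_classes * 4)

shapes = {
    mode: tuple(detection_weights(cls_layer, box_layer, num_classes, mode).shape)
    for mode in ("cls", "box", "cls+box")
}

w_det = detection_weights(cls_layer, box_layer, num_classes, "cls+box")
tau = nn.Linear(w_det.shape[1], mask_channels)  # stand-in transfer function
features = torch.randn(2, mask_channels, 28, 28)

# stop_grad: detach w_det before predicting the mask weights.
transfer_out = mask_logits(features, "transfer", w_seg=tau(w_det.detach()))
agnostic_out = mask_logits(
    features,
    "class-agnostic",
    agnostic_conv=nn.Conv2d(mask_channels, 1, kernel_size=1),
)
```

In this sketch the `transfer` mode yields one mask per class, while the `class-agnostic` baseline yields a single mask shared by all classes.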

## Results

- I'm planning to run it on VOC+COCO soon. Will update once it's done.
- Note - The official Detectron (Caffe2) models and code are up here

## References

```
Hu, Ronghang, Piotr Dollár, Kaiming He, Trevor Darrell and Ross B. Girshick. “Learning to Segment Every Thing.” CoRR abs/1711.10370 (2017): n. pag.
```