Boundary-preserving Mask R-CNN

Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu

Abstract


Tremendous efforts have been made to improve mask localization accuracy in instance segmentation. Modern instance segmentation methods relying on fully convolutional networks perform pixel-wise classification, which ignores object boundaries and shapes, leading to coarse and indistinct mask predictions and imprecise localization. To remedy this, we propose a conceptually simple yet effective Boundary-preserving Mask R-CNN (BMask R-CNN) that leverages object boundary information to improve mask localization accuracy. BMask R-CNN contains a boundary-preserving mask head in which object boundary and mask are mutually learned via feature fusion blocks. As a result, the predicted masks are better aligned with object boundaries. Without bells and whistles, BMask R-CNN outperforms Mask R-CNN by a considerable margin on the COCO dataset; on the Cityscapes dataset, where more accurate boundary ground truths are available, BMask R-CNN obtains remarkable improvements over Mask R-CNN. Moreover, it is not surprising to observe that BMask R-CNN obtains more obvious improvements when the evaluation criterion requires better localization (e.g., AP75), as shown in Fig. 1. Code and models are available at \url{https://github.com/hustvl/BMaskR-CNN}.
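To make the "feature fusion block" idea concrete, here is a minimal NumPy sketch of one plausible fusion step: boundary-branch features are concatenated with mask-branch features along the channel axis and mixed by a 1x1 convolution. All names, shapes, and weights below are illustrative assumptions for exposition, not the paper's actual implementation (which lives in the linked repository).

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution as a channel-mixing matrix multiply.
    x: (C_in, H, W) feature map, w: (C_out, C_in) weights -> (C_out, H, W)."""
    c_in, h, wd = x.shape
    return (w @ x.reshape(c_in, -1)).reshape(w.shape[0], h, wd)

def fusion_block(mask_feat, boundary_feat, w_fuse):
    """Hypothetical fusion: concatenate mask and boundary features on the
    channel axis, then mix with a 1x1 conv and a ReLU nonlinearity."""
    fused = np.concatenate([mask_feat, boundary_feat], axis=0)
    return np.maximum(conv1x1(fused, w_fuse), 0.0)  # ReLU

# Toy 14x14 RoI feature maps with 8 channels per branch (illustrative sizes).
rng = np.random.default_rng(0)
mask_feat = rng.standard_normal((8, 14, 14))
boundary_feat = rng.standard_normal((8, 14, 14))
w_fuse = rng.standard_normal((8, 16)) * 0.1  # maps 16 fused channels back to 8

out = fusion_block(mask_feat, boundary_feat, w_fuse)
print(out.shape)  # (8, 14, 14)
```

Because the fused output keeps the mask branch's channel count, such a block can be dropped between successive mask-head convolutions, letting boundary cues repeatedly sharpen the mask features.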
