Mirror and Glass Detection/Segmentation

In this project, we are developing techniques for mirror and glass detection/segmentation. A mirror is a reflective surface that reflects the scene in front of it, while glass is a transparent surface that transmits the scene behind it and often also reflects the scene in front of it. In general, neither mirrors nor glass have visual appearances of their own; they only reflect or transmit the appearances of their surroundings, which makes it difficult to develop automatic algorithms to detect and segment them. However, as mirrors and glass appear everywhere in our daily life, failing to detect them reliably can be problematic. For example, a vision-based depth sensor may falsely estimate the depth of a mirror or glass surface as the depth of the objects reflected in or seen through it, a robot may not be aware of the presence of a mirror or glass wall, and a drone may collide with a high-rise (note that most high-rises are covered by glass these days). To the best of our knowledge, my team is the first to develop computational models for the automatic detection and segmentation of mirror and transparent glass surfaces. Although there have been some works that investigate the detection of transparent glass objects, these methods mainly focus on wine glasses and other small glass objects, which have special visual properties that can be exploited for detection. Unlike these works, we are more interested in detecting general glass surfaces that may not possess any special properties of their own. We are also interested in exploring the application of our mirror/glass detection methods to autonomous navigation.

Exploiting Semantic Relations for Glass Surface Detection [paper] [model] [dataset]
Jiaying Lin*, Yuen Hei Yeung*, and Rynson Lau (* joint first authors)
Proc. NeurIPS, Nov. 2022

Input-Output: Given an input image, our network outputs a binary mask that indicates where glass surfaces are.

Abstract. Glass surfaces are omnipresent in our daily lives and often go unnoticed by most of us. While humans can generally infer their locations and thus avoid collisions, current object detection systems find them difficult to handle due to the transparent nature of glass. Previous methods approached the problem by extracting global context information to obtain priors such as boundaries and reflections, but their performance cannot be guaranteed when these critical features are not available. We observe that humans often reason through the semantic context of the environment, which offers insights into the categories of, and proximity between, entities that are expected to appear in the surroundings. For example, glass windows are more likely to co-occur with walls and curtains than with objects such as cars and trees, which have relatively little semantic relevance. Based on this observation, we propose a model that integrates the contextual relationships of the scene for glass surface detection with two novel modules: (1) a Scene Aware Activation (SAA) Module to adaptively filter critical channels with respect to spatial and semantic features, and (2) a Context Correlation Attention (CCA) Module to progressively learn the contextual correlations among objects both spatially and semantically. In addition, we propose a large-scale glass surface detection dataset, named GSD-S, which contains 4,519 real-world RGB glass surface images from diverse scenes with detailed annotations. Experimental results show that our model outperforms contemporary works, with a 48.8% improvement in MAE on our proposed GSD-S dataset.
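To make the channel-filtering idea behind the SAA module concrete, here is a minimal PyTorch sketch (not the released model): visual features are re-weighted channel by channel using gates predicted from globally pooled spatial and semantic features. The class name, layer sizes, and the simple gating design are illustrative assumptions, not the paper's exact architecture.

    # Minimal sketch (not the authors' code) of scene-aware channel filtering:
    # gates predicted from pooled visual + semantic context re-weight the channels
    # of the visual features. All sizes and the gating design are assumptions.
    import torch
    import torch.nn as nn

    class SceneAwareActivationSketch(nn.Module):
        def __init__(self, vis_channels: int, sem_channels: int):
            super().__init__()
            # Predict one gate per visual channel from the concatenated global context.
            self.gate = nn.Sequential(
                nn.Linear(vis_channels + sem_channels, vis_channels),
                nn.Sigmoid(),
            )

        def forward(self, vis_feat: torch.Tensor, sem_feat: torch.Tensor) -> torch.Tensor:
            # vis_feat: (B, Cv, H, W) visual features; sem_feat: (B, Cs, H, W) semantic features.
            ctx = torch.cat([vis_feat.mean(dim=(2, 3)), sem_feat.mean(dim=(2, 3))], dim=1)
            weights = self.gate(ctx)                      # (B, Cv) channel gates in [0, 1]
            return vis_feat * weights[:, :, None, None]   # suppress channels irrelevant to the scene

    # Example: gate 256 visual channels using 128 semantic channels.
    saa = SceneAwareActivationSketch(vis_channels=256, sem_channels=128)
    out = saa(torch.randn(1, 256, 64, 64), torch.randn(1, 128, 64, 64))
    print(out.shape)  # torch.Size([1, 256, 64, 64])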

Mirror Detection with the Visual Chirality Cue [paper] [code]
Xin Tan, Jiaying Lin, Ke Xu, Pan Chen, Lizhuang Ma, and Rynson Lau
IEEE Trans. on Pattern Analysis and Machine Intelligence (accepted)

Input-Output: Given an input image, our network outputs a binary mask that indicates where mirrors are.

Abstract. Mirror detection is challenging because the visual appearance of a mirror changes with that of its surroundings. As existing mirror detection methods are mainly based on extracting contextual contrast and relational similarity between mirror and non-mirror regions, they may fail to identify a mirror region when these assumptions are violated. Inspired by a recent study that applies a CNN to distinguish whether an image is flipped or not based on the visual chirality property, in this paper we rethink this image-level visual chirality property and reformulate it as a learnable pixel-level cue for mirror detection. Specifically, we first propose a novel flipping-convolution-flipping (FCF) transformation to model visual chirality as a learnable commutative residual. We then propose a novel visual chirality embedding (VCE) module to exploit this commutative residual in multi-scale feature maps and embed the visual chirality features into our mirror detection model. In addition, we propose a visual chirality-guided edge detection (CED) module to integrate the visual chirality features with contextual features for detection refinement. Extensive experiments show that the proposed method outperforms state-of-the-art methods on three benchmark datasets.
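The FCF transformation can be illustrated with a short PyTorch sketch, assuming one plausible reading of the abstract: the commutative residual is the difference between applying a learnable convolution directly and applying it between two horizontal flips. The class name and layer sizes below are hypothetical; the released code defines the actual module.

    # Minimal sketch (an illustration, not the authors' code) of the flipping-
    # convolution-flipping idea: visual chirality is expressed as the residual
    # between conv(x) and flip(conv(flip(x))), i.e. how far horizontal flipping
    # and a learnable convolution are from commuting at each pixel.
    import torch
    import torch.nn as nn

    class FCFResidualSketch(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            straight = self.conv(x)                                              # conv applied directly
            flipped = torch.flip(self.conv(torch.flip(x, dims=[3])), dims=[3])   # flip -> conv -> flip back
            return flipped - straight                                            # commutative residual (chirality cue)

    fcf = FCFResidualSketch(channels=64)
    residual = fcf(torch.randn(1, 64, 32, 32))
    print(residual.shape)  # torch.Size([1, 64, 32, 32])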

Large-Field Contextual Feature Learning for Glass Detection [paper] [code]
Haiyang Mei, Xin Yang, Letian Yu, Qiang Zhang, Xiaopeng Wei, and Rynson Lau
IEEE Trans. on Pattern Analysis and Machine Intelligence (accepted)

Input-Output: Given an input image, our network outputs a binary mask that indicates where glass surfaces are.

Abstract. Glass is very common in our daily life, yet existing computer vision systems largely neglect it, which may lead to severe consequences, e.g., a robot may crash into a glass wall. However, sensing the presence of glass is not straightforward. The key challenge is that arbitrary objects/scenes can appear behind the glass. In this paper, we address the important problem of detecting glass surfaces from a single RGB image. To this end, we construct the first large-scale glass detection dataset (GDD) and propose a novel glass detection network, called GDNet-B, which explores abundant contextual cues over a large field of view via a novel large-field contextual feature integration (LCFI) module and integrates both high-level and low-level boundary features with a boundary feature enhancement (BFE) module. Extensive experiments demonstrate that GDNet-B achieves satisfactory glass detection results on images both within and beyond the GDD testing set. We further validate the effectiveness and generalization capability of GDNet-B by applying it to other vision tasks, including mirror segmentation and salient object detection. Finally, we show potential applications of glass detection and discuss possible future research directions.
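As a rough illustration of gathering contextual cues over a large field of view, the sketch below fuses parallel dilated convolutions with increasing dilation rates, which is one standard way to enlarge the receptive field. The actual LCFI module in GDNet-B may be designed quite differently; the class name, rates, and channel sizes here are assumptions for illustration only.

    # Illustration only (not GDNet-B itself): parallel dilated convolutions with
    # increasing rates are fused so each position sees a progressively larger
    # neighbourhood of the same input.
    import torch
    import torch.nn as nn

    class LargeFieldContextSketch(nn.Module):
        def __init__(self, channels: int, rates=(1, 2, 4, 8)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r) for r in rates
            )
            self.fuse = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Concatenate the multi-dilation responses, then fuse back to the input width.
            return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))

    module = LargeFieldContextSketch(channels=64)
    print(module(torch.randn(1, 64, 96, 96)).shape)  # torch.Size([1, 64, 96, 96])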

Learning Semantic Associations for Mirror Detection [paper] [suppl] [code] [dataset]
Huankang Guan, Jiaying Lin, and Rynson Lau
Proc. IEEE CVPR, June 2022

Input-Output: Given an input image, our network outputs a binary mask that indicates where mirrors are.

Abstract. Mirrors generally lack a consistent visual appearance, making mirror detection very challenging. Although recent works based on exploiting contextual contrasts and corresponding relations have achieved good results, relying heavily on these cues to discover mirrors tends to fail in complex real-world scenes, where many objects, e.g., doorways, may have features similar to those of mirrors. We observe that humans tend to place mirrors in relation to certain objects for specific functional purposes, e.g., a mirror above the sink. Inspired by this observation, we propose a model that exploits the semantic associations between a mirror and its surrounding objects for reliable mirror localization. Our model first acquires class-specific knowledge of the surrounding objects via a semantic side-path. It then uses two novel modules to exploit semantic associations: (1) an Associations Exploration (AE) Module to extract the associations of the scene objects based on fully connected graph models, and (2) a Quadruple-Graph (QG) Module to facilitate the diffusion and aggregation of semantic association knowledge using graph convolutions. Extensive experiments show that our method outperforms existing methods and sets a new state of the art on both the PMD dataset (f-measure: 0.844) and the MSD dataset (f-measure: 0.889).
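A minimal sketch of the graph-based reasoning described above, under simplifying assumptions: per-object (or per-region) features form the nodes of a fully connected graph whose edge weights come from pairwise feature similarity, and a single graph-convolution-style update propagates association cues. The class name and the similarity-based adjacency are illustrative, not the AE/QG modules themselves.

    # Minimal sketch (not the authors' implementation) of propagating semantic-
    # association cues with a graph convolution over a fully connected graph of
    # scene-object features.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AssociationGraphSketch(nn.Module):
        def __init__(self, dim: int):
            super().__init__()
            self.update = nn.Linear(dim, dim)

        def forward(self, nodes: torch.Tensor) -> torch.Tensor:
            # nodes: (B, N, D), one feature vector per scene object / region.
            adjacency = F.softmax(torch.bmm(nodes, nodes.transpose(1, 2)), dim=-1)  # (B, N, N) similarity weights
            messages = torch.bmm(adjacency, nodes)         # aggregate features from related objects
            return F.relu(self.update(messages)) + nodes   # residual update of node features

    graph = AssociationGraphSketch(dim=128)
    print(graph(torch.randn(2, 10, 128)).shape)  # torch.Size([2, 10, 128])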

Rich Context Aggregation with Reflection Prior for Glass Surface Detection [paper] [suppl] [code] [dataset]
Jiaying Lin, Zebang He, and Rynson Lau
Proc. IEEE CVPR, June 2021

Input-Output: Given an input image, our network outputs a binary mask that indicates where glass surfaces are.

Abstract. Glass surfaces appear everywhere, and their existence can pose serious problems for computer vision tasks. Recently, a method was proposed to detect glass surfaces by learning multi-scale contextual information. However, as it relies only on a general context integration operation and does not consider any specific properties of glass surfaces, it gets confused when images contain objects that look similar to glass surfaces, and it degenerates in challenging scenes with insufficient context. We observe that humans often rely on identifying reflections to sense the existence of glass and on locating boundaries to determine its extent. Hence, we propose a model for glass surface detection that consists of two novel modules: (1) a rich context aggregation module (RCAM) to extract multi-scale boundary features from rich context features for locating glass surface boundaries of different sizes and shapes, and (2) a reflection-based refinement module (RRM) to detect reflections and incorporate them to differentiate glass regions from non-glass regions. In addition, we propose a challenging dataset consisting of 4,012 glass images with annotations for glass surface detection. Our experiments demonstrate that the proposed model outperforms state-of-the-art methods from relevant fields.
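The reflection prior can be sketched as follows, purely as an illustration of the idea rather than the paper's RRM: a reflection map is predicted from shared features and then concatenated with the coarse glass logits when producing the refined prediction. The class name, channel sizes, and the concatenation-based fusion are assumptions.

    # Minimal sketch (illustrative assumptions throughout, not the paper's RRM) of
    # reflection-guided refinement of a coarse glass prediction.
    import torch
    import torch.nn as nn

    class ReflectionRefineSketch(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            self.reflection_head = nn.Conv2d(channels, 1, kernel_size=3, padding=1)
            self.refine = nn.Conv2d(channels + 2, 1, kernel_size=3, padding=1)

        def forward(self, feat: torch.Tensor, coarse_glass_logits: torch.Tensor):
            reflection = torch.sigmoid(self.reflection_head(feat))   # map of where reflections appear
            refined = self.refine(torch.cat([feat, reflection, coarse_glass_logits], dim=1))
            return refined, reflection                               # refined glass logits + reflection map

    rrm = ReflectionRefineSketch(channels=64)
    refined, reflection = rrm(torch.randn(1, 64, 80, 80), torch.randn(1, 1, 80, 80))
    print(refined.shape, reflection.shape)  # torch.Size([1, 1, 80, 80]) torch.Size([1, 1, 80, 80])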

Progressive Mirror Detection [paper] [suppl] [code] [dataset]
Jiaying Lin, Guodong Wang, and Rynson Lau
Proc. IEEE CVPR, June 2020

Input-Output: Given an input image, our network outputs a binary mask that indicates where mirrors are.

Abstract. The mirror detection problem is important as mirrors can affect the performance of many vision tasks. It is a difficult problem since it requires an understanding of global scene semantics. Recently, a method was proposed to detect mirrors by learning multi-level contextual contrasts between the inside and outside of mirrors, which helps locate mirror edges implicitly. We observe that the content of a mirror reflects the content of its surroundings, separated by the edge of the mirror. Hence, we propose a model that progressively learns the content similarity between the inside and outside of the mirror while explicitly detecting the mirror edges. Our work has two main contributions. First, we propose a new relational contextual contrasted local (RCCL) module to extract mirror features and compare them with their corresponding context features, and an edge detection and fusion (EDF) module to learn the features of mirror edges in complex scenes via explicit supervision. Second, we construct a challenging benchmark dataset of 6,461 mirror images. Unlike the existing MSD dataset, which has limited diversity, our dataset covers a variety of scenes and is much larger in scale. Experimental results show that our model outperforms relevant state-of-the-art methods.
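The contextual-contrast idea can be sketched with two convolutions of different receptive fields, assuming a simple subtraction-based comparison between a region and its wider surround; the actual RCCL module is defined in the released code, and the kernel sizes and class name below are illustrative only.

    # Minimal sketch (not the authors' RCCL module) of a contextual-contrast cue:
    # local features are compared against features gathered from a wider, dilated
    # neighbourhood, so regions whose content mirrors their surroundings respond
    # differently from ordinary regions.
    import torch
    import torch.nn as nn

    class ContextContrastSketch(nn.Module):
        def __init__(self, channels: int, dilation: int = 4):
            super().__init__()
            self.local = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.context = nn.Conv2d(channels, channels, kernel_size=3,
                                     padding=dilation, dilation=dilation)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.local(x) - self.context(x)  # contrast between a region and its surround

    rccl = ContextContrastSketch(channels=64)
    print(rccl(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
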
Last updated in September 2022.