Place perception from the fusion of different image representation
journal contribution
posted on 2020-10-14, 00:00 authored by P Li, X Li, H Pan, Mohammad Khyam, M Noor-A-Rahim, SS Ge

Inspired by the human way of place understanding, we present a novel indoor place perception network to overcome two limitations of existing methods: (1) they rely only on the image features of object regions to recognize the indoor place, and (2) they give insufficient consideration to the semantic information about object attributes and states. By utilizing multi-modal information containing images and natural language, the proposed method can comprehensively express the attributes, states, and relationships of objects, which are beneficial for indoor place understanding and recognition. Specifically, we first present a natural language generation framework based on a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) to imitate the process of place understanding. Next, a Convolutional Auto-Encoder (CAE) and a mixed CNN-LSTM are proposed to extract image features and semantic features, respectively. Then, two different fusion strategies, namely feature-level fusion and object-level fusion, are designed to integrate different types of features and features from different objects. The category of the indoor place is finally recognized based on the fused information. Comprehensive experiments are conducted on public datasets, and the results verify the effectiveness of the proposed place perception method based on linguistic cues.
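To make the described pipeline concrete, the sketch below outlines the feature-level fusion branch only: a CAE-style convolutional encoder for image features, a mixed CNN-LSTM encoder for the semantic (text) features, concatenation of the two, and a place classifier. This is a minimal illustration assuming PyTorch; the layer sizes, vocabulary size, number of place categories, and the omitted caption-generation and object-level fusion stages are assumptions, since the abstract does not specify implementation details.

```python
# Minimal sketch of the feature-level fusion branch described in the abstract.
# All dimensions and hyperparameters are assumptions, not the authors' values.
import torch
import torch.nn as nn


class CAEEncoder(nn.Module):
    """Encoder half of a Convolutional Auto-Encoder (CAE) for image features."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, feat_dim)

    def forward(self, x):                        # x: (B, 3, H, W)
        return self.fc(self.conv(x).flatten(1))  # (B, feat_dim)


class SemanticEncoder(nn.Module):
    """Mixed CNN-LSTM encoder over generated natural-language descriptions."""
    def __init__(self, vocab_size=10000, emb_dim=128, feat_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.conv1d = nn.Conv1d(emb_dim, emb_dim, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(emb_dim, feat_dim, batch_first=True)

    def forward(self, tokens):                      # tokens: (B, T) word ids
        e = self.embed(tokens).transpose(1, 2)      # (B, emb_dim, T)
        e = torch.relu(self.conv1d(e)).transpose(1, 2)
        _, (h, _) = self.lstm(e)
        return h[-1]                                # (B, feat_dim)


class FeatureLevelFusion(nn.Module):
    """Concatenate image and semantic features, then classify the indoor place."""
    def __init__(self, feat_dim=256, num_places=7):
        super().__init__()
        self.image_enc = CAEEncoder(feat_dim)
        self.text_enc = SemanticEncoder(feat_dim=feat_dim)
        self.classifier = nn.Linear(2 * feat_dim, num_places)

    def forward(self, image, tokens):
        fused = torch.cat([self.image_enc(image), self.text_enc(tokens)], dim=1)
        return self.classifier(fused)


# Usage with dummy inputs: a 64x64 crop and a 12-token description.
model = FeatureLevelFusion()
logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # torch.Size([2, 7])
```

Object-level fusion, by contrast, would combine the fused representations of several detected objects before classification; it is left out here because the abstract gives no further detail on how that aggregation is performed.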
History
Volume: 110
Start Page: 1
End Page: 11
Number of Pages: 11
ISSN: 0031-3203
Publisher: Elsevier
Publisher DOI:
Language: en
Peer Reviewed: Yes
Open Access: No
Acceptance Date: 2020-09-23
External Author Affiliations: Southeast University, China; University College Cork, Ireland; National University of Singapore
Era Eligible: Yes