
Place perception from the fusion of different image representation

journal contribution
posted on 14.10.2020, 00:00 authored by P Li, X Li, H Pan, Mohammad KhyamMohammad Khyam, M Noor-A-Rahim, SS Ge
Inspired by the human way of understanding places, we present a novel indoor place perception network that overcomes (1) the simplicity of existing methods, which use only the image features of object regions to recognize an indoor place, and (2) the insufficient consideration of semantic information about object attributes and states. By utilizing multi-modal information comprising images and natural language, the proposed method can comprehensively express the attributes, states, and relationships of objects, which benefits indoor place understanding and recognition. Specifically, we first present a natural language generation framework based on a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) to imitate the process of place understanding. Next, a Convolutional Auto-Encoder (CAE) and a mixed CNN-LSTM are proposed to extract image features and semantic features, respectively. Then, two fusion strategies, namely feature-level fusion and object-level fusion, are designed to integrate different types of features and features from different objects. The category of the indoor place is finally recognized from the fused information. Comprehensive experiments on public datasets verify the effectiveness of the proposed place perception method based on linguistic cues.
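The two fusion strategies mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature dimensionalities, the mean-pooling aggregator for object-level fusion, and all function names are assumptions introduced here for clarity.

```python
import numpy as np

# Hypothetical feature sizes; the paper does not disclose these values.
IMG_DIM, SEM_DIM = 128, 64  # CAE image feature / CNN-LSTM semantic feature


def feature_level_fusion(img_feat, sem_feat):
    """Feature-level fusion: join the image-modality and language-modality
    features of one object into a single descriptor (concatenation here)."""
    return np.concatenate([img_feat, sem_feat])


def object_level_fusion(object_descriptors):
    """Object-level fusion: aggregate the fused per-object descriptors into
    one place-level descriptor (mean pooling here; the paper's exact
    aggregation operator may differ)."""
    return np.mean(np.stack(object_descriptors), axis=0)


# Toy scene with two detected objects, each carrying one feature per modality.
objects = [
    (np.ones(IMG_DIM), np.zeros(SEM_DIM)),
    (np.zeros(IMG_DIM), np.ones(SEM_DIM)),
]
fused = [feature_level_fusion(img, sem) for img, sem in objects]
place_descriptor = object_level_fusion(fused)
print(place_descriptor.shape)  # one vector summarizing the whole place
```

The place-level descriptor would then feed a classifier that outputs the indoor place category.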

History

Volume

110

Start Page

1

End Page

11

Number of Pages

11

ISSN

0031-3203

Publisher

Elsevier

Language

en

Peer Reviewed

Yes

Open Access

No

Acceptance Date

23/09/2020

External Author Affiliations

Southeast University, China; University College Cork, Ireland; National University of Singapore

Era Eligible

Yes

Journal

Pattern Recognition