CQUniversity

Mobile malware detection with imbalanced data using a novel synthetic oversampling strategy and deep learning

conference contribution
posted on 2021-06-25, 00:45 authored by Mahbub E Khoda, Joarder Kamruzzaman, Iqbal Gondal, Tasadduq Imam, Ashfaqur Rahman
Mobile malware detection is inherently an imbalanced data problem since the number of benign applications in the market is far greater than the number of malicious applications. Existing methods to handle imbalanced data, such as synthetic minority over-sampling, do not translate well into this domain since mobile malware detection generally deals with binary features and these methods are designed for continuous features. Also, methods adapted for categorical features cannot be applied here since random modifications of features can result in invalid sample generation. In this work, we propose a novel technique for generating synthetic samples for mobile malware detection with imbalanced data. Our proposed method adds new data points to the sample space by generating synthetic malware samples that also preserve the original functionality of the malicious apps. Experiments show that the proposed approach outperforms existing techniques in terms of precision, recall, F1 score, and AUC. This study will be useful in building deep neural network-based systems to handle imbalanced data for mobile malware detection.

Volume

2020-October

Start Page

1

End Page

6

Start Date

2020-10-12

Finish Date

2020-10-14

eISSN

2161-9654

ISSN

2161-9646

ISBN-13

9781728197227

Location

Thessaloniki, Greece

Publisher

IEEE

Place of Publication

Online

Peer Reviewed

  • Yes

Open Access

  • No

Name of Conference

16th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob 2020)

Parent Title

2020 16th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob)