File(s) not publicly available
Mobile malware detection with imbalanced data using a novel synthetic oversampling strategy and deep learning
conference contribution
posted on 2021-06-25, 00:45 authored by Mahbub E Khoda, Joarder Kamruzzaman, Iqbal Gondal, Tasadduq Imam, Ashfaqur Rahman
Mobile malware detection is inherently an imbalanced data problem, since the number of benign applications in the market is far greater than the number of malicious applications. Existing methods for handling imbalanced data, such as synthetic minority over-sampling, do not translate well to this domain: mobile malware detection generally deals with binary features, whereas these methods are designed for continuous features. Methods adapted for categorical features cannot be applied here either, since random modification of features can generate invalid samples. In this work, we propose a novel technique for generating synthetic samples for mobile malware detection with imbalanced data. Our proposed method adds new data points to the sample space by generating synthetic malware samples that preserve the original functionality of the malicious apps. Experiments show that the proposed approach outperforms existing techniques in terms of precision, recall, F1 score, and AUC. This study will be useful in building deep neural network-based systems to handle imbalanced data for mobile malware detection.
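The abstract's point about continuous-feature oversampling can be illustrated with a minimal sketch. It is not the authors' method, which this record does not describe in detail. It only shows that the standard synthetic minority over-sampling interpolation step, applied to two binary vectors, produces fractional values wherever the vectors differ. The feature vectors and the helper name `smote_interpolate` are hypothetical and chosen for illustration.

```python
import random


def smote_interpolate(x, neighbor, rng):
    """SMOTE-style synthetic point: x + gap * (neighbor - x), with gap in [0, 1)."""
    gap = rng.random()
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


# Hypothetical binary feature vectors for two malware samples
# (1 = feature present, 0 = feature absent); illustrative values only.
malware_a = [1, 0, 1, 1, 0]
malware_b = [1, 1, 0, 1, 0]

rng = random.Random(0)  # fixed seed so the run is repeatable
synthetic = smote_interpolate(malware_a, malware_b, rng)

# Positions where the two parents agree stay at 0 or 1; positions where
# they differ land strictly between 0 and 1, which is not a valid binary feature.
is_binary = all(v in (0, 1) for v in synthetic)
print(synthetic)
print("valid binary sample:", is_binary)
```

Rounding such values back to 0 or 1 amounts to randomly flipping features, which is the kind of modification the abstract warns can yield invalid samples.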
History
Volume
2020-October
Start Page
1
End Page
6
Start Date
2020-10-12
Finish Date
2020-10-14
eISSN
2161-9654
ISSN
2161-9646
ISBN-13
9781728197227
Location
Thessaloniki, Greece
Publisher
IEEE
Place of Publication
Online
Publisher DOI
Peer Reviewed
- Yes
Open Access
- No