Stealthy Malware Detection using RNN-Based Automated Localized Feature Extraction and Classifier
Sanket Shukla, Gaurav Kolhe, Sai Manoj Pudukotai Dinakarrao, S. Rafatirad
Abstract
Malware analysis, detection, and classification have attracted considerable research attention in the past few years. Numerous methods based on machine learning (ML), computer vision, and deep learning have been applied to this task and have produced practical results. A basic assumption of these works is that malware is spawned as a separate thread and that its distinguishing features can be extracted in a "clean" manner irrespective of the obfuscation deployed. However, this assumption does not hold for advanced malware obfuscation techniques such as code relocation, mutation, and polymorphism. Stealthy malware is created by embedding malicious code in a benign application through advanced obfuscation strategies in order to thwart detection. To detect traditional and stealthy malware alike, we propose a two-pronged approach. First, we extract microarchitectural traces obtained while executing the application and feed them to traditional ML classifiers to detect malware spawned as a separate thread. In parallel, to detect stealthy malware efficiently, we introduce an automated localized feature extraction technique whose output is processed by recurrent neural networks (RNNs) for classification. To this end, we translate the application binaries into images, convert the images into sequences, and extract local features from those sequences. The proposed two-pronged approach achieves an accuracy of 94% in detecting normal malware and nearly 90% in detecting stealthy malware created through the code relocation obfuscation technique. Furthermore, it achieves up to 11% higher detection accuracy than CNN-based sequence classification and hidden Markov model (HMM) based approaches in detecting stealthy malware.
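As a rough illustration of the binary-to-image-to-sequence step described in the abstract, the following minimal Python sketch lays the bytes of a binary out as a grayscale image and then scans it into a sequence of flattened local patches, the kind of input an RNN could consume one step at a time. The abstract does not state the image width, patch size, padding rule, or scan order, so the function names `bytes_to_image` and `image_to_patch_sequence`, the default sizes, the zero padding, and the non-overlapping row-major scan are all assumptions made for illustration and not the authors' implementation.

```python
from typing import List


def bytes_to_image(data: bytes, width: int = 16) -> List[List[int]]:
    """Lay raw bytes out row by row as a grayscale image (one byte per pixel).

    Assumption: the final row is zero-padded to the full width.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width          # bytes needed to fill the last row
    padded = data + bytes(padding)          # bytes(n) yields n zero bytes
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def image_to_patch_sequence(image: List[List[int]], patch: int = 4) -> List[List[int]]:
    """Scan non-overlapping patch x patch windows in row-major order.

    Each window is flattened into one vector, so the image becomes a sequence
    of local feature vectors. Rows or columns that do not fill a complete
    window are dropped (an assumption of this sketch).
    """
    if patch <= 0:
        raise ValueError("patch must be positive")
    rows = len(image)
    cols = len(image[0]) if rows else 0
    sequence = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            window = [image[r + dr][c + dc]
                      for dr in range(patch)
                      for dc in range(patch)]
            sequence.append(window)
    return sequence


if __name__ == "__main__":
    # Hypothetical stand-in for the contents of an application binary.
    sample = bytes(range(64))
    image = bytes_to_image(sample, width=8)
    steps = image_to_patch_sequence(image, patch=4)
    print(len(image), "rows;", len(steps), "sequence steps of length", len(steps[0]))
```

Because each sequence step covers only a small region of the image, code embedded in one part of a benign binary influences only a few steps. This is one plausible reading of "localized" feature extraction, though the abstract does not spell out the mechanism.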
