Title: SUBROUTINE ENTRY POINT RECOGNITION USING DATA MINING
Issue Number: Vol. 5, No. 4
Year of Publication: Oct 2015
Page Numbers: 233-242
Authors: Brian Knudson
Journal Name: International Journal of Digital Information and Wireless Communications (IJDIWC), Hong Kong
DOI: http://dx.doi.org/10.17781/P001743
Abstract:
This paper introduces a novel approach to subroutine entry point recognition using data mining. The proposed method applies a Naïve Bayes classifier to two kinds of features: sequences of normalized disassembled instructions and sequences of preceding bytes. Together, these features capture two compiler behaviors: the code that compilers introduce at the start of subroutines and the padding bytes they place before the start of subroutines. Experiments were conducted on a dataset of Windows PE32 x86 binaries generated from a collection of small open-source Windows applications built with several compiler settings. Ten-fold cross-validation was used to train and test the classifier. The proposed method achieves an average true positive rate of 98% with a false positive rate of 0.7% for certain features.
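The classification idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class `NaiveBayes`, the helper `make_features`, the normalization tokens (`reg`, `imm`, `mem`), the add-one smoothing, and the six training samples are all hypothetical, chosen only to show how a Naïve Bayes classifier might combine a preceding-byte feature with a normalized-instruction feature for a candidate offset.

```python
import math
from collections import Counter, defaultdict


def make_features(preceding_bytes, instructions):
    """Build a two-part feature tuple for one candidate offset (assumed layout).

    preceding_bytes: raw bytes located just before the candidate offset.
    instructions: normalized instructions starting at the offset, where
    operands are already replaced by placeholder tokens such as 'reg' or 'imm'.
    """
    return (preceding_bytes.hex(), "; ".join(instructions))


class NaiveBayes:
    """Categorical Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.class_counts = Counter()
        # counts[label][feature_index][value] -> number of occurrences
        self.counts = defaultdict(lambda: defaultdict(Counter))
        # distinct values observed for each feature index
        self.values = defaultdict(set)

    def fit(self, samples, labels):
        for features, label in zip(samples, labels):
            self.class_counts[label] += 1
            for i, value in enumerate(features):
                self.counts[label][i][value] += 1
                self.values[i].add(value)
        return self

    def log_score(self, features, label):
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total)
        for i, value in enumerate(features):
            seen = self.counts[label][i][value]
            # One extra vocabulary slot reserves probability for unseen values.
            vocab = len(self.values[i]) + 1
            score += math.log((seen + 1) / (self.class_counts[label] + vocab))
        return score

    def predict(self, features):
        return max(self.class_counts, key=lambda c: self.log_score(features, c))


# Hypothetical training data: True marks a subroutine entry point.
training = [
    (make_features(b"\xcc\xcc", ["push reg", "mov reg, reg"]), True),
    (make_features(b"\x90\x90", ["push reg", "mov reg, reg"]), True),
    (make_features(b"\xcc\xcc", ["sub reg, imm", "push reg"]), True),
    (make_features(b"\x8b\xc1", ["add reg, imm", "mov reg, mem"]), False),
    (make_features(b"\x89\x45", ["mov reg, mem", "cmp reg, imm"]), False),
    (make_features(b"\x75\xf4", ["push reg", "call imm"]), False),
]

model = NaiveBayes().fit([f for f, _ in training], [y for _, y in training])

candidate = make_features(b"\xcc\xcc", ["push reg", "mov reg, reg"])
print("entry point?", model.predict(candidate))
```

In this toy setup the padding bytes (`0xCC`, `0x90`) and the prologue-like instruction sequence each contribute independent evidence, which is the independence assumption that Naïve Bayes makes. A realistic pipeline would also need a disassembler to produce the instruction sequences and a cross-validation loop to estimate true and false positive rates, both omitted here.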