Title: SUBROUTINE ENTRY POINT RECOGNITION USING DATA MINING

Issue Number: Vol. 5, No. 4
Year of Publication: October 2015
Page Numbers: 233-242
Authors: Brian Knudson
Journal Name: International Journal of Digital Information and Wireless Communications (IJDIWC) - Hong Kong
DOI: http://dx.doi.org/10.17781/P001743

Abstract:


This paper introduces a novel approach to subroutine entry point recognition using data mining. The proposed method applies a Naïve Bayes classifier to two kinds of features: sequences of normalized disassembled instructions and sequences of preceding bytes. Combined, these features account for two properties of compilers: the code they introduce at the start of subroutines and the padding bytes they place before the start of subroutines. Experiments were conducted on a dataset of Windows PE32 x86 binaries generated from a collection of small open-source Windows applications built with several compiler settings. Ten-fold cross-validation was used to train and test the classifier. The proposed method achieves an average true positive rate of 98% with a false positive rate of 0.7% for certain features.
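
The classification idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the feature layout (`extract_features`, using the last two preceding bytes and the first two instruction mnemonics), the Laplace smoothing, and the toy training samples are all assumptions made for illustration, and the disassembly, instruction normalization, and ten-fold cross-validation steps are omitted.

```python
import math
from collections import Counter, defaultdict


def extract_features(preceding_bytes, mnemonics, n_bytes=2, n_insns=2):
    """Build categorical features for one candidate address.

    Hypothetical scheme: the last n_bytes before the address and the first
    n_insns instruction mnemonics starting at the address.
    """
    return {
        "prev_bytes": bytes(preceding_bytes[-n_bytes:]).hex(),
        "insn_seq": " ".join(mnemonics[:n_insns]),
    }


class CategoricalNaiveBayes:
    """Naive Bayes over categorical features with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed, not from the paper)

    def fit(self, samples, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.vocab = defaultdict(set)             # feature -> values seen in training
        for feats, label in zip(samples, labels):
            for name, value in feats.items():
                self.value_counts[(label, name)][value] += 1
                self.vocab[name].add(value)
        return self

    def log_scores(self, feats):
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in feats.items():
                seen = self.value_counts[(label, name)][value]
                # The extra +1 reserves probability mass for unseen values.
                denom = count + self.alpha * (len(self.vocab[name]) + 1)
                score += math.log((seen + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, feats):
        scores = self.log_scores(feats)
        return max(scores, key=scores.get)


# Toy, made-up samples: True marks a subroutine entry point.
train = [
    (extract_features([0xCC, 0xCC], ["push", "mov", "sub"]), True),
    (extract_features([0x90, 0x90], ["push", "mov", "push"]), True),
    (extract_features([0xC3, 0xCC], ["push", "mov", "and"]), True),
    (extract_features([0x8B, 0x45], ["add", "mov", "cmp"]), False),
    (extract_features([0x00, 0x00], ["mov", "call", "test"]), False),
    (extract_features([0x45, 0x08], ["cmp", "jne", "mov"]), False),
]
model = CategoricalNaiveBayes().fit([f for f, _ in train], [y for _, y in train])

# A candidate preceded by padding bytes and starting with "push mov".
candidate = extract_features([0xCC, 0xCC], ["push", "mov", "call"])
is_entry = model.predict(candidate)
```

In this toy setting, padding bytes before the address and a familiar opening instruction sequence both raise the score for the entry-point class, which mirrors the intuition behind combining the two feature types.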