Title: 'Fuzzy' vs 'Non-Fuzzy' Classification in Big Data

Year of Publication: Dec - 2015
Page Numbers: 23-32
Authors: Malak EL-Bakry, Soha Safwat, Osman Hegazy
Conference Name: The Second International Conference on Digital Information Processing, Data Mining, and Wireless Communications (DIPDMWC2015), United Arab Emirates

Abstract:


As data volumes grow, efficient analysis with traditional techniques becomes difficult. Big data poses many challenges because of its characteristics: volume, velocity, variety, variability, value, and complexity. Today there is a need not only for efficient data mining techniques that can process large volumes of data, but also for a means of meeting the computational requirements of doing so. The objective of this research is to compare fuzzy and non-fuzzy algorithms for the classification of big data, and to compare the results of this study with those of the methods reviewed in the literature. In this paper, we implemented the Fuzzy K-Nearest Neighbor method as the fuzzy technique and the Support Vector Machine as the non-fuzzy technique, using the MapReduce paradigm to process big data. Results on different data sets show that the proposed Fuzzy K-Nearest Neighbor method outperforms both the Support Vector Machine and the method reviewed in the literature.
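The abstract does not describe how the Fuzzy K-Nearest Neighbor method was expressed in MapReduce, so the following is only an illustrative sketch of one common arrangement, not the authors' implementation. It assumes the training data is split into partitions, that each map task returns its local k nearest neighbors of a query point, and that a reduce step merges them into the global k nearest. Class memberships then follow the standard inverse-distance weighting of fuzzy k-NN with crisp training labels and fuzzifier m. All function names, the Euclidean distance, and the toy data are assumptions made for illustration, and the map and reduce steps are simulated in a single process.

```python
import heapq
import math
from collections import defaultdict
from functools import reduce


def map_local_neighbors(partition, query, k):
    """Map step (assumed design): k nearest (distance, label) pairs in one partition."""
    scored = ((math.dist(features, query), label) for features, label in partition)
    return heapq.nsmallest(k, scored)


def reduce_neighbors(left, right, k):
    """Reduce step (assumed design): merge two candidate lists, keep the k nearest."""
    return heapq.nsmallest(k, left + right)


def fuzzy_memberships(neighbors, m=2.0, eps=1e-9):
    """Inverse-distance-weighted class memberships, assuming crisp training labels."""
    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    for distance, label in neighbors:
        # eps guards against division by zero when the query equals a training point.
        weights[label] += 1.0 / (max(distance, eps) ** exponent)
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


def fuzzy_knn(partitions, query, k=3, m=2.0):
    """Simulate the map and reduce steps locally and return class memberships."""
    candidates = [map_local_neighbors(p, query, k) for p in partitions]
    nearest = reduce(lambda a, b: reduce_neighbors(a, b, k), candidates)
    return fuzzy_memberships(nearest, m)


# Hypothetical toy data: two partitions of (features, label) pairs.
partitions = [
    [((0.0, 0.0), "a"), ((0.0, 1.0), "a")],
    [((5.0, 5.0), "b"), ((1.0, 0.0), "a"), ((6.0, 5.0), "b")],
]
memberships = fuzzy_knn(partitions, (1.0, 1.0), k=4)
predicted = max(memberships, key=memberships.get)
```

Unlike a crisp k-NN vote, the result is a membership degree per class, so a distant neighbor of another class still contributes a small, nonzero membership instead of being discarded.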