Title: ON ENHANCING DATA UTILITY IN K-ANONYMIZATION FOR DATA WITHOUT HIERARCHICAL TAXONOMIES

Issue Number: Vol. 2, No. 2
Year of Publication: 2013
Page Numbers: 12-22
Authors: Mohammad Rasool Sarrafi Aghdam, Noboru Sonehara
Journal Name: International Journal of Cyber-Security and Digital Forensics (IJCSDF)
- Hong Kong

Abstract:


K-anonymity is a widely used model for protecting the privacy of individuals when publishing microdata. It can be viewed as clustering under the constraint that each group contains at least k tuples. K-anonymity reduces the confidence of linking sensitive information to a specific individual to 1/k. However, the accuracy of the data in a k-anonymous dataset decreases because of information loss. Moreover, most current approaches handle only numerical attributes or, for categorical attributes, require extra information such as hierarchical attribute taxonomies, which often do not exist. In this paper we propose a new clustering-based model that defines the distance between tuples containing both numerical and categorical attributes without requiring extra information, and we present the SpatialDistance (SD) heuristic algorithm. Experimental comparisons on real datasets between SD and existing well-known algorithms show that SD performs best, offering much higher data utility and significantly reducing information loss.
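
To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' SD algorithm. The function `is_k_anonymous` checks the "at least k tuples per group" constraint over a chosen set of quasi-identifier attributes. The function `mixed_distance` is a hypothetical, generic distance for tuples with numerical and categorical attributes that needs no taxonomy: it assumes a range-normalized absolute difference for numerical attributes and a simple 0/1 mismatch for categorical ones. The abstract does not state the paper's actual distance definition, so both function names and the distance formula are illustrative assumptions.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    is shared by at least k rows (each row is a dict)."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def mixed_distance(a, b, numeric_ranges):
    """Illustrative distance between two tuples (dicts with the same keys).

    ASSUMPTION, not the paper's definition:
    - numerical attributes (keys of numeric_ranges, mapped to (min, max))
      contribute |a - b| divided by the attribute's range;
    - all other attributes are treated as categorical and contribute
      0 when equal and 1 when different, so no taxonomy is needed.
    """
    total = 0.0
    for attr in a:
        if attr in numeric_ranges:
            low, high = numeric_ranges[attr]
            if high > low:
                total += abs(a[attr] - b[attr]) / (high - low)
        else:
            total += 0.0 if a[attr] == b[attr] else 1.0
    return total
```

A clustering-based anonymizer would typically use such a distance to group nearby tuples into clusters of at least k members before generalizing each cluster; how SD forms its clusters is not described in the abstract.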