Title: STRUCTURING HETEROGENEOUS BIG DATA FOR SCALABILITY AND ACCURACY

Issue Number: Vol. 4, No. 1
Year of Publication: 2014
Page Numbers: 10-23
Authors: Ashraf Gaffar, Eman Monir Darwish, Abdessamad Tridane
Journal Name: International Journal of Digital Information and Wireless Communications (IJDIWC)
- Hong Kong
DOI:  http://dx.doi.org/10.17781/P001079

Abstract:


Structured data has inherently high automation value. It lends itself readily to software tools that help store, organize, and search it effectively. With our growing dependence on data, we face many new problems. While software applications replace one another and older software rapidly loses value, data has the opposite nature, which we call the “cumulative effect”. Unlike software, we see new data as an addition to the old, so we tend to accumulate data continuously without deleting anything. Even older data is often archived for its value, for legal reasons, or simply because we can never be sure whether we will need it again. This would not be a problem if the data were structured, since we could automate the storage and retrieval process. However, most of the valuable information lies inside unstructured data, which is extremely difficult to store and retrieve at large scale. Our work presents a new concept for adding structure to highly unstructured, heterogeneous data, which will greatly improve the overall process of storing and effectively retrieving it. We use Human-Computer Interaction (HCI) patterns as a case study to present our proof of concept.