Issue Number: Vol. 4, No. 1
Year of Publication: 2014
Page Numbers: 1-16
Authors: Ashraf Gaffar
Journal Name: International Journal of New Computer Architectures and their Applications (IJNCAA)
- Hong Kong
DOI: http://dx.doi.org/10.17781/p001


In the domain of Human Computer Interaction (HCI), communication between humans and software is an ongoing challenge. On the one hand, humans have growing needs and expectations; on the other hand, software offers increasingly advanced features and functionality. This dual growth in complexity makes interaction difficult. The user interface is the battleground where sophisticated cognitive demands must be resolved between the user and the software for effective communication. When interaction involves big data retrieval, communication often takes the form of the software displaying data and the user comprehending its meaning, then requesting more data or requesting that some processing be applied to the displayed data. We explore two main issues of big data user interaction: encoding and structure. First, data encoding is typically application-dependent and has little value without its owner software. Second, data structuring makes all data items follow a well-defined structure (e.g., tables) that preserves their meaning for software tools to manipulate. Structure is preferred because it allows easy and accurate data processing, but the majority of valuable data consists of unstructured, heterogeneous sets of mixed content (text, links, tables, graphics, audio, video, etc.). While great for human consumption, such data is not scalable and does not lend itself readily to tools. We propose a new approach to “stand-alone” big data that uses application-agnostic XML encoding and is also well structured for easy processing. We show a comprehensive process for creating and disseminating large amounts of data effectively, with both syntactic and semantic content built in and preserved, independent of any application used. We use text-based patterns as a case study to demonstrate the problem with big, heterogeneous, rich data, and we build a system to support its dissemination and assimilation.
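To make the idea of application-agnostic, self-describing encoding concrete, the following is a minimal sketch (not from the paper) of how a record might be encoded in plain XML so that both its structure and its meaning survive outside the software that produced it. The tag names (`pattern`, `name`, `intent`, `media`) and the example content are hypothetical, chosen only to illustrate the principle; Python's standard-library XML module stands in for any generic XML-aware tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical "stand-alone" record: the tag names themselves carry the
# semantics, so any XML-aware consumer can recover both structure and
# meaning without the application that created the record.
record = ET.Element("pattern")
ET.SubElement(record, "name").text = "Observer"
ET.SubElement(record, "intent").text = "Notify dependents of state changes"
media = ET.SubElement(record, "media", type="text")
media.text = "One-to-many dependency between objects"

# Serialize: the on-the-wire form is plain, application-agnostic XML.
xml_text = ET.tostring(record, encoding="unicode")

# Any other tool can re-parse it and query by semantic tag, with no
# knowledge of the producing software.
parsed = ET.fromstring(xml_text)
print(parsed.findtext("name"))  # Observer
```

Because the encoding is plain XML rather than a proprietary format, the record remains usable by generic tools (parsers, XPath queries, transformations) even if the original application is unavailable, which is the "stand-alone" property the abstract describes.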