Title: Text Classification Using Time Windows Applied to Stock Exchange

Issue Number: Vol. 7, No. 2
Year of Publication: June 2017
Page Numbers: 62-67
Authors: Pavel Netolicky, Jonas Petrovsky, Frantisek Darena, Jan Zizka
Journal Name: International Journal of New Computer Architectures and their Applications (IJNCAA)
- Hong Kong


Large amounts of text data are generated every day. This data comes from various sources and may contain valuable information. In this article, we use text classification to determine whether there is a connection between textual documents (specifically Facebook posts) and changes in the S&P 500 stock index. The index values and documents were divided into time windows according to the direction of the index value changes. In the first experiment, we used a batch processing approach, putting the documents from all windows into a single data set, and achieved a classification accuracy of 62%. In the second experiment, we used a data stream approach, dividing the documents into twelve data sets, each created from two neighboring windows, and achieved an accuracy of 68%. This indicates that the posts companies write on their Facebook pages are partially related to the performance of the stock index. Taking concept change into account also enables better quantification of this relationship.
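
The windowing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `split_into_windows` and `neighboring_pairs`, the rule that a window ends when the sign of the day-to-day change flips, and the sample values are all assumptions made for illustration. The abstract does not state how window boundaries were actually determined.

```python
def split_into_windows(values):
    """Group consecutive index values into windows of constant direction.

    Assumption (not from the article): a window ends when the sign of the
    step-to-step change flips. Flat steps stay in the current window.
    Returns a list of (direction, start_index, end_index) tuples.
    """
    windows = []
    start = 0
    current = None  # direction of the window being built
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        if diff == 0:
            continue  # no change: keep extending the current window
        direction = "up" if diff > 0 else "down"
        if current is None:
            current = direction
        elif direction != current:
            # The turning point (i - 1) closes one window and opens the next.
            windows.append((current, start, i - 1))
            start = i - 1
            current = direction
    if current is not None:
        windows.append((current, start, len(values) - 1))
    return windows


def neighboring_pairs(windows):
    """Pair each window with its successor, as in a data stream setting."""
    return list(zip(windows, windows[1:]))


# Hypothetical index values, used only to exercise the functions.
sample = [10, 11, 12, 11, 10, 13]
sample_windows = split_into_windows(sample)
sample_pairs = neighboring_pairs(sample_windows)
```

In this sketch, each pair of neighboring windows would supply the documents for one data set, so n windows yield n - 1 data sets. Documents would then be assigned to a window by their timestamp and labeled with that window's direction, a labeling scheme that is likewise assumed here rather than taken from the article.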