Title: On the Automatic Categorization of Arabic Articles Based on Their Political Orientation

Year of Publication: Sep - 2014
Page Numbers: 302-309
Authors: Raddad Abooraig, Ahmed Alwajeeh, Mahmoud Al-Ayyoub and Ismail Hmeidi
Conference Name: The Third International Conference on Informatics Engineering and Information Science (ICIEIS2014) - Poland

Abstract:


The prevalence of dynamic web pages (such as social networks, forums, and personal blogs) covering all fields (such as social, economic, and political events) allows Internet users to interact with their content by writing comments and articles. On politics and political events, users post comments and articles that reflect their beliefs and ideologies. The ability to automatically determine the political orientation of an article can be of great benefit in many areas, from academia to security. This work addresses this important yet largely understudied problem for Arabic texts as a supervised learning problem. In addition to collecting and manually labeling a dataset of articles from different political orientations in the Arab world, it studies the two most popular feature extraction approaches for such a problem (the TC approach and the stylometric features approach). Moreover, four classifiers are considered to study the effects of different kinds of feature reduction techniques, such as stemming and feature selection, on their effectiveness. Although the experimental results show the superiority of the TC approach over the stylometric features approach, they also show that the latter approach can be significantly improved by adding new and more discriminating features.
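
To make the contrast between the two feature extraction approaches concrete, the following is a minimal sketch, not the authors' implementation. It assumes that content-based features amount to term frequencies and that stylometric features are simple style statistics; the function names and the specific statistics chosen here are hypothetical illustrations, and the abstract does not list the features actually used. Real Arabic text would also need a proper tokenizer and stemmer, which this sketch omits.

```python
import re
from collections import Counter

# Punctuation counted as a style signal; includes the Arabic question
# mark and comma. This set is an assumption made for illustration.
PUNCTUATION = ".,;:!?؟،"


def bag_of_words(text):
    """Content-based features: map each lowercased token to its frequency."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


def stylometric_features(text):
    """Style-based features: statistics that ignore which words are used."""
    tokens = re.findall(r"\w+", text)
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    n_tokens = len(tokens) or 1        # avoid division by zero on empty text
    n_sentences = len(sentences) or 1
    return {
        "avg_word_length": sum(len(t) for t in tokens) / n_tokens,
        "avg_sentence_length": len(tokens) / n_sentences,
        "type_token_ratio": len(set(tokens)) / n_tokens,
        "punctuation_count": sum(text.count(c) for c in PUNCTUATION),
    }


if __name__ == "__main__":
    sample = "The vote failed. The vote was close!"
    print(bag_of_words(sample))
    print(stylometric_features(sample))
```

Either feature representation could then be passed to a supervised classifier. Feature reduction steps such as stemming or feature selection would shrink the term-frequency vocabulary before training, while the stylometric vector is already small and could be extended with further style statistics.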