Comparative Analysis of Link-based and Content-based Methods for Opinion Mining in Persian language

Document Type: Original Article

Authors

1 Islamic Azad University, Science and Research Branch

2 Iran telecom research center

Abstract

Twitter provides a convenient platform for expressing feelings and opinions in different areas. Opinion mining on Twitter can be viewed as determining the overall sentiment of a tweet. Sentiment analysis methods for the Persian language fall into two general categories: link-based methods and content-based methods. In this study, we implement a new link-based method to improve opinion classification in the Persian language.
For comparison, we also implement a content-based method using the Naïve Bayes classifier with two different weighting methods: TF-IDF and Chi-Square. The TF-IDF method has produced good results in previous Persian-language studies. The Chi-Square method has not previously been used in Persian-language research, but its accuracy is fairly good in English.
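As an illustration of the content-based baseline, the following is a minimal sketch of Naïve Bayes with the two term-weighting schemes. It is not the authors' implementation: the abstract does not say how the weights enter the classifier, so replacing raw term counts with weighted counts is an assumption here, as are the function names and the toy corpus (English placeholder tokens standing in for preprocessed Persian tweets).

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (tokens, label). Not data from the study.
DOCS = [
    (["good", "great", "phone"], "pos"),
    (["great", "service", "good"], "pos"),
    (["bad", "phone", "poor"], "neg"),
    (["poor", "service", "bad"], "neg"),
]

def idf_weights(docs):
    """IDF per term; summing it over occurrences gives a TF-IDF mass."""
    n = len(docs)
    df = Counter(t for tokens, _ in docs for t in set(tokens))
    return {t: math.log(n / d) for t, d in df.items()}

def chi_square_weights(docs, target):
    """Chi-square statistic of each term against the class `target`."""
    n = len(docs)
    vocab = {t for tokens, _ in docs for t in tokens}
    weights = {}
    for t in vocab:
        a = sum(1 for toks, y in docs if t in toks and y == target)
        b = sum(1 for toks, y in docs if t in toks and y != target)
        c = sum(1 for toks, y in docs if t not in toks and y == target)
        d = n - a - b - c
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        weights[t] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return weights

def train(docs, weights):
    """Accumulate weighted term mass per class (assumed combination rule)."""
    priors = Counter(y for _, y in docs)
    mass = defaultdict(lambda: defaultdict(float))
    for tokens, y in docs:
        for t in tokens:
            mass[y][t] += weights[t]
    return priors, mass, len(weights)

def predict(model, tokens):
    """Return the class with the highest smoothed log-probability."""
    priors, mass, vocab_size = model
    total_docs = sum(priors.values())
    best, best_score = None, -math.inf
    for y in priors:
        class_total = sum(mass[y].values())
        score = math.log(priors[y] / total_docs)
        for t in tokens:
            # Add-one smoothing over the weighted term mass.
            score += math.log((mass[y].get(t, 0.0) + 1.0) / (class_total + vocab_size))
        if score > best_score:
            best, best_score = y, score
    return best

tfidf_model = train(DOCS, idf_weights(DOCS))
chi2_model = train(DOCS, chi_square_weights(DOCS, "pos"))
print(predict(tfidf_model, ["good", "phone"]))
print(predict(chi2_model, ["poor", "service"]))
```

In this toy corpus the Chi-Square weight of a term that appears equally often in both classes is zero, so such terms stop influencing the decision, whereas TF-IDF only discounts them by document frequency.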
The results show that the language-independent method yields a remarkable improvement. In this research, the precision of the proposed algorithm for positive and negative comments was 98.87% and 97.87%, and the recall for positive and negative comments was 99.24% and 96.84%, respectively. The results also show that, because of the complexities of Persian syntax and the lack of proper natural language processing tools for Persian, content-based algorithms perform poorly in Persian compared to English.
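For readers unfamiliar with the per-class metrics reported above, the sketch below shows how precision and recall are computed for one class from gold and predicted labels. The label lists are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data, and the function name is an assumption.

```python
def precision_recall(gold, pred, label):
    """Precision and recall of `label`, given parallel gold/predicted lists."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labels for five comments (not results from the paper).
gold = ["pos", "pos", "pos", "neg", "neg"]
pred = ["pos", "pos", "neg", "neg", "pos"]

for label in ("pos", "neg"):
    p, r = precision_recall(gold, pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Precision is the share of comments predicted as a class that truly belong to it, and recall is the share of comments truly in the class that the classifier found.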

Keywords

Main Subjects