Improving The SFA Algorithm by Employing Multi-Source Data

Document Type: Original Article

Authors

1 Yazd University

2 Computer Department, Yazd University

3 Science and Art University

Abstract

In recent years, discriminative learning methods have been widely used in various areas of Natural Language Processing (NLP). These methods perform best when the training and testing samples are drawn from the same distribution. However, in many NLP applications, the lack of labeled datasets for some domains is a serious challenge. In such conditions, a model must be developed on domains with abundant labeled instances and then applied to a domain with no labeled instances. This research proposes a method, based on multi-source transfer learning, for classifying the sentiment of opinions into positive and negative groups that represent the users' feelings. The proposed method employs the Spectral Feature Alignment (SFA) algorithm to adapt different domains. Furthermore, following a majority voting scheme, an accuracy value is assigned to each classifier trained on a different domain based on its majority voting error. Ultimately, the decision for each classifier is made based on the calculated error. The Amazon datasets for four categories, each containing 1000 positive and 1000 negative samples, are used to train the proposed model. Each category also includes unlabeled samples, which are used to select pivot features. The accuracy values of 85.5%, 86.4%, 83.5% and 90.1% obtained for the Electronics, DVD, Books and Kitchen domains, respectively, show the effectiveness of the proposed method compared with similar methods.
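The combination step described in the abstract can be illustrated with a minimal sketch of error-weighted voting over classifiers trained on different source domains. This is not the authors' implementation: the function name `weighted_vote`, the weighting rule (weight = 1 - error), the +1/-1 label encoding, and all sample values are assumptions chosen only for illustration, since the abstract does not give the exact formula.

```python
from typing import List, Sequence


def weighted_vote(predictions: Sequence[Sequence[int]],
                  errors: Sequence[float]) -> List[int]:
    """Combine +1/-1 predictions from several source-domain classifiers.

    predictions[k][i] is the label that classifier k assigns to sample i.
    errors[k] is the estimated error rate of classifier k.
    Assumption: each classifier's vote is weighted by (1 - error).
    """
    if len(predictions) != len(errors):
        raise ValueError("one error value is required per classifier")

    weights = [1.0 - e for e in errors]
    n_samples = len(predictions[0])

    combined = []
    for i in range(n_samples):
        # Weighted sum of the votes cast for sample i.
        score = sum(w * preds[i] for w, preds in zip(weights, predictions))
        combined.append(1 if score >= 0 else -1)
    return combined


# Hypothetical outputs of three classifiers, each trained on one source domain.
source_predictions = [
    [1, -1, 1, 1],
    [1, 1, -1, 1],
    [-1, -1, -1, 1],
]
# Hypothetical error rates for the three classifiers.
source_errors = [0.10, 0.30, 0.20]

labels = weighted_vote(source_predictions, source_errors)
print(labels)
```

In this sketch a classifier with a lower error has more influence on the final label, so a single reliable source can be outvoted only when the other sources agree and their combined weight is larger.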
