International Journal of Web Research

Authentic and Fake Reviews Recognition on E-Commerce Websites through Sentiment Analysis and Machine Learning Techniques

Document Type : Original Article

Authors
1 Department of Computer Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
2 Computer Engineering Department, Istinye University, Istanbul, Turkey
Abstract
The growth of e-commerce has produced an overwhelming volume of customer reviews, which poses challenges for consumers seeking reliable product evaluations and for businesses concerned with the integrity of their online reputation. This study addresses the problem of detecting fake reviews with a framework that integrates Natural Language Processing (NLP) and machine learning techniques. The methodology centers on sentiment analysis to determine the emotional valence of reviews, coupled with Part-of-Speech (PoS) tagging to analyze linguistic patterns that may signal deception. We extract a set of textual and statistical features as input to the predictive models, and we employ both traditional machine learning algorithms and ensemble techniques to improve classification performance. In our experiments, the approach achieved an F1-Score of 82.9% and an accuracy of 82.6% in detecting fraudulent reviews, which indicates its potential to protect consumers from misleading information and businesses from unfair practices.
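The stages named in the abstract (feature extraction, ensemble classification, and evaluation by accuracy and F1-Score) can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation: the word lists, the feature names, the pronoun ratio used as a crude stand-in for PoS-derived features, and the hard-voting rule are all hypothetical choices made for illustration only.

```python
import re

# Toy lexicons: illustrative assumptions, not the resources used in the study.
POSITIVE = {"great", "love", "excellent", "perfect", "amazing"}
NEGATIVE = {"bad", "poor", "broken", "terrible", "refund"}
FIRST_PERSON = {"i", "me", "my", "we", "our"}  # crude stand-in for PoS features


def extract_features(review: str) -> dict:
    """Return a few textual and statistical features for one review."""
    tokens = re.findall(r"[a-z']+", review.lower())
    n = max(len(tokens), 1)  # avoid division by zero on empty reviews
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return {
        "length": len(tokens),
        "sentiment": (pos - neg) / n,  # lexicon-based valence in [-1, 1]
        "first_person_ratio": sum(t in FIRST_PERSON for t in tokens) / n,
        "exclamations": review.count("!"),
    }


def majority_vote(votes: list) -> int:
    """Hard-voting ensemble: predict 1 (fake) if more than half the votes are 1."""
    return int(sum(votes) * 2 > len(votes))


def accuracy_and_f1(y_true: list, y_pred: list) -> tuple:
    """Compute accuracy and F1 for the positive (fake = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1
```

In practice, the feature dictionaries would feed trained classifiers whose individual predictions are combined by `majority_vote`, and `accuracy_and_f1` would be applied to held-out labels to obtain figures comparable to those reported above.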