Persian SMS Spam Detection using Machine Learning and Deep Learning Techniques

Document Type: Original Article


1 Department of Information Technology, Science and Research Branch, Islamic Azad University, Tehran, Iran

2 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran

3 IT Research Faculty, ICT Research Institute (Iran Telecom Research Center), Tehran, Iran


Spam messages are unsolicited texts sent by unknown individuals, and they cause a range of problems for smartphone users. Their downsides include inconvenience to users, wasted network traffic, increased computational cost, consumption of storage space on the mobile phone, and the abuse and defrauding of recipients, to name only a few. Consequently, the automated identification of suspicious and spam messages is vitally important. Moreover, cleverly composed text messages can be difficult to recognize, and existing methods in this area are hindered by the absence of adequate Persian datasets. A large body of research and experiments has shown that techniques based on deep and combined learning are superior at identifying unwanted text messages. This work sought to develop an effective strategy for identifying SMS spam by combining machine learning classification algorithms with deep learning models. After preprocessing our gathered dataset, the proposed deep learning approach extracts features from the data using two convolutional neural network layers together with an LSTM layer and a fully connected layer. On the machine learning side, a support vector machine (SVM) uses the extracted features to determine the final classification. Results indicate that the proposed model performs better than existing techniques, achieving an accuracy of 97.7%.
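
The final stage described in the abstract, in which an SVM classifies messages from features produced by the deep model, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional feature vectors are toy stand-ins for the deep model's output, the function names are hypothetical, and the linear kernel with hinge-loss subgradient training is one simple choice among many. The abstract does not state which kernel or training procedure was actually used.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    epochs: int = 500,
    lr: float = 0.01,
    reg: float = 0.01,
) -> Tuple[List[float], float]:
    """Train a linear SVM by subgradient descent on the regularised hinge loss.

    Labels must be +1 (spam) or -1 (ham). Hyperparameters are illustrative.
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularisation: shrink the weights a little on every step.
            w = [wi - lr * reg * wi for wi in w]
            # Hinge loss: only points inside the margin move the boundary.
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 (spam) or -1 (ham) for one feature vector."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy feature vectors (hypothetical): in the described pipeline these would
# come from the deep feature extractor, not be written by hand.
spam_features = [[2.0, 0.5], [1.8, 0.3], [2.2, 0.7]]
ham_features = [[0.2, 1.9], [0.4, 2.1], [0.1, 1.7]]

train_x = spam_features + ham_features
train_y = [1] * len(spam_features) + [-1] * len(ham_features)

weights, bias = train_linear_svm(train_x, train_y)
train_predictions = [predict(weights, bias, x) for x in train_x]
```

Splitting the work this way keeps the representation learning in the neural network and leaves only a margin-based decision rule to the SVM. In practice a library implementation would normally replace this hand-written trainer.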

