Title
Short Messages Spam Filtering Using Sentiment Analysis
Author (from another institution)
Version
Postprint
Rights
© Springer International Publishing Switzerland 2016
Access
Open access
Publisher's version
http://dx.doi.org/10.1007/978-3-319-45510-5_17
Published in
Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Brno, Czech Republic, September 12-16, 2016, Proceedings, Vol. 9924. Lecture Notes in Computer Science, pp. 142-153, 2016
Publisher
Springer International Publishing
Keywords
SMS
spam
polarity
sentiment analysis
security
Abstract
As short instant messages become more widely used, spam and illegitimate campaigns over this type of communication system are growing as well. Besides being an illegal online activity, these campaigns are a direct threat to users' privacy. Previous spam filtering techniques for short messages focus on automatic text classification and do not take message polarity into account. Focusing on phone SMS messages, this work demonstrates that sentiment analysis techniques can improve spam filtering in short message services. Using a publicly available labelled (spam/legitimate) SMS dataset, we calculate the polarity of each message and add the polarity score to the original dataset, creating new datasets. We compare the results of the best classifiers and filters over the different datasets (with and without polarity) to demonstrate the influence of polarity. Experiments show that the polarity score improves SMS spam classification, reaching 98.91% accuracy on the one hand and, on the other, obtaining 0 false positives with 98.67% accuracy.
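The dataset-augmentation step described in the abstract (score each message's polarity, then attach that score to the labelled record) can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' method: the word lists, the sample messages, and the function names `polarity` and `add_polarity` are all hypothetical, and a toy lexicon count stands in for whatever sentiment analyser the paper actually used.

```python
# Illustrative sketch only: a toy lexicon-based polarity score appended to
# each labelled SMS record. The word lists and sample messages are invented
# for this example and are not taken from the paper or its dataset.

POSITIVE = {"free", "win", "winner", "great", "love", "congratulations"}
NEGATIVE = {"sorry", "late", "bad", "miss", "problem"}


def polarity(text):
    """Return a score in [-1, 1]: (positive hits - negative hits) / tokens."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def add_polarity(dataset):
    """Build a new dataset where each (label, text) row gains a polarity column."""
    return [(label, text, polarity(text)) for label, text in dataset]


# Hypothetical labelled records in (label, text) form.
sample = [
    ("spam", "Congratulations! You win a free prize"),
    ("legitimate", "Sorry, I will be late"),
]
augmented = add_polarity(sample)
for row in augmented:
    print(row)
```

The augmented rows could then be passed to any text classifier, once with and once without the extra column, which is the kind of comparison the abstract reports.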