A Three-Dimensional Tensor Approach for Classifying and Detecting Fake News: A Case Study of Persian News on the Coronavirus

Article type: Research article

Authors

1 PhD candidate, Department of Information Technology Management, Qeshm Branch, Islamic Azad University, Qeshm, Iran.

2 Assistant Professor, Department of Computer Science, Kashan Branch, Islamic Azad University, Kashan, Iran.

3 Assistant Professor, Department of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran.

4 Associate Professor, Department of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran.

Abstract

Purpose: This study aims to assign free-text news items to one of two classes, fake or real. Convolutional neural networks, among the most important deep learning models, have achieved high accuracy on such problems. The research focuses on sentence-level text analysis and on improving the performance of convolutional neural networks for fake news detection. In these networks, words are fed to the model as a bag of words, and each word is mapped through a vector space so that the input takes the form of two-dimensional matrices. One limitation of convolutional networks is that they operate at the word level and cannot take the relationships and distances between sentences into account, so sentence-level analysis is the central problem addressed here. The study proposes a baseline model based on convolutional networks in which documents are fed to the network as three-dimensional tensors to overcome this limitation. Using three-dimensional tensors lets the model learn the position of words within a sentence and yields more accurate fake news detection.
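The input representation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the three tensor axes are sentence, word position, and embedding dimension, and the names `toy_embedding` and `document_to_tensor`, the padding sizes, and the sentence splitter are all hypothetical choices that the abstract does not specify.

```python
import re

def toy_embedding(word, dim=4):
    """Hypothetical stand-in for a trained word-embedding lookup."""
    # Deterministic pseudo-vector built from character codes (illustration only).
    total = sum(ord(ch) for ch in word)
    return [((total * (i + 1)) % 97) / 97.0 for i in range(dim)]

def document_to_tensor(text, max_sentences=3, max_words=5, dim=4):
    """Return a nested list of shape (max_sentences, max_words, dim)."""
    # Split on common Latin and Persian sentence terminators (assumed rule).
    sentences = [s.strip() for s in re.split(r"[.!?؟]+", text) if s.strip()]
    tensor = []
    for s_idx in range(max_sentences):
        words = sentences[s_idx].split() if s_idx < len(sentences) else []
        row = []
        for w_idx in range(max_words):
            if w_idx < len(words):
                row.append(toy_embedding(words[w_idx], dim))
            else:
                row.append([0.0] * dim)  # zero padding keeps the shape fixed
        tensor.append(row)
    return tensor

doc = "Officials deny the claim. The vaccine is safe! Is it true?"
tensor = document_to_tensor(doc)
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
print(shape)  # (3, 5, 4)
```

A flat word-level input would concatenate all words into a single (words, dim) matrix; keeping a separate sentence axis is what preserves each word's position within its sentence.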
Methodology: This is an applied study in which about 42,000 Persian news items from different cities of Iran were collected from Twitter. Preprocessing removed redundant and unhelpful data, the cleaned texts were labeled, and the news text was then processed for the proposed approach using Python.
Findings: Some machine learning algorithms showed greater strength on classification problems, but after changes to the structure of the convolutional network algorithm, it produced better results than the machine learning algorithms and other similar algorithms.
Conclusion: Using three-dimensional tensors lets the model learn the position of words within a sentence, and the proposed model achieves notable accuracy compared with approaches proposed in the literature. Without adding overhead in the number of features or the depth of the network, and by changing the input, the proposed model obtains better results than other approaches in the literature and reaches precision and accuracy above 94 percent.
 



Article Title [English]

Providing a Three-Dimensional Tensor Approach For Classifying and Detecting Fake News - A Case Study of Persian News in The Field of COVID-19

Authors [English]

  • Vahid Mottaghi 1
  • Mahdi Esmaeili 2
  • Ghasem Ali Bazaee 3
  • Mohammad Ali Afshar Kazemi 4
1 PhD. Candidate in IT Management, Department of IT Management, Qeshm Branch, Islamic Azad University, Qeshm, Iran.
2 Assistant Professor, Department of Computer Science, Kashan Branch, Islamic Azad University, Kashan, Iran
3 Assistant Professor, Department of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran.
4 Associate Professor, Department of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran
Abstract [English]

Purpose: This study aims to assign free texts to one of two classes, fake or real. Convolutional neural networks, as one of the most important deep learning models, have achieved high accuracy on such problems. The study focuses on sentence-level text analysis and on improving the performance of convolutional neural networks for detecting fake news. In these networks, words are fed to the model as a bag of words, with each word mapped through a vector space into two-dimensional matrices. One limitation of convolutional networks is that they work at the word level and cannot consider the relationship and distance between sentences, so sentence-level analysis is the main problem addressed in this research.
This research proposes a baseline model based on convolutional networks in which documents are fed to the network as 3D tensors to address this problem. Using 3D tensors allows the model to learn the position of words in a sentence and to achieve more accurate results in detecting fake news.
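To illustrate why a 3D input matters, the sketch below slides one convolution filter over the sentence and word axes of a small tensor, applies ReLU, and takes a global max. It is an assumption-laden toy, not the published architecture: the abstract does not state filter sizes, activation, or pooling, and the name `conv2d_max` is hypothetical.

```python
def conv2d_max(tensor, kernel):
    """Slide `kernel` over the (sentence, word) grid and return the max activation.

    tensor: nested list with shape (S, W, D)
    kernel: nested list with shape (kS, kW, D), where kS <= S and kW <= W
    """
    S, W = len(tensor), len(tensor[0])
    kS, kW, D = len(kernel), len(kernel[0]), len(kernel[0][0])
    best = float("-inf")
    for s in range(S - kS + 1):
        for w in range(W - kW + 1):
            act = 0.0
            for i in range(kS):
                for j in range(kW):
                    for d in range(D):
                        act += tensor[s + i][w + j][d] * kernel[i][j][d]
            best = max(best, max(act, 0.0))  # ReLU, then global max pooling
    return best

# Two sentences, three word slots each, embedding dimension 2 (made-up values).
tensor = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]],
]
word_level = [[[1.0, 1.0], [1.0, 1.0]]]                 # spans 1 sentence, 2 words
cross_sentence = [[[1.0, 1.0], [1.0, 1.0]],
                  [[1.0, 1.0], [1.0, 1.0]]]             # spans 2 sentences, 2 words

print(conv2d_max(tensor, word_level))      # 3.0
print(conv2d_max(tensor, cross_sentence))  # 6.0
```

A filter one sentence high behaves like a conventional word-level text filter, while a taller filter responds to words at the same positions in adjacent sentences, which is only possible when the sentence axis is kept.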
Methodology: This is an applied study in which about 42,000 Persian news items from different cities of Iran were collected from Twitter. Preprocessing removed redundant and unhelpful data; after the cleaned texts were labeled, the news text was processed for the proposed approach using Python and related libraries.
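The abstract does not list the preprocessing steps, so the following is only a plausible sketch of tweet cleaning in Python. Every rule here is an assumption: removing URLs and mentions, unifying Arabic and Persian letter variants, stripping punctuation while keeping the zero-width non-joiner, and dropping exact duplicates. The names `clean_tweet` and `deduplicate` are hypothetical.

```python
import re

# Map Arabic yeh and kaf to their Persian forms (assumed normalisation step).
ARABIC_TO_PERSIAN = str.maketrans({"\u064a": "\u06cc", "\u0643": "\u06a9"})

def clean_tweet(text):
    """Return a normalised copy of one tweet."""
    text = re.sub(r"https?://\S+", " ", text)    # drop URLs
    text = re.sub(r"@\w+", " ", text)            # drop user mentions
    text = text.translate(ARABIC_TO_PERSIAN)     # unify letter variants
    text = re.sub(r"[^\w\s\u200c]", " ", text)   # drop punctuation, keep ZWNJ
    return re.sub(r"\s+", " ", text).strip()     # collapse whitespace

def deduplicate(texts):
    """Remove exact duplicates while preserving first-seen order."""
    return list(dict.fromkeys(texts))

raw = [
    "RT @user: check https://t.co/abc now!!! #covid",
    "RT @other: check now #covid",
]
cleaned = deduplicate([clean_tweet(t) for t in raw])
print(cleaned)  # ['RT check now covid']
```

Labeling the cleaned texts as fake or real would follow this step; how the labels were assigned is not described in the abstract.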
Findings: In testing, some machine learning algorithms performed strongly on classification problems, but after changes to the structure of the convolutional network algorithm, it produced better results than the machine learning algorithms and other similar algorithms.
Conclusion: Using 3D tensors allows the model to learn the position of words in a sentence, and the proposed model achieves considerable accuracy compared with approaches proposed in the literature. Without adding overhead in the number of features or network depth, and by changing the input, the proposed model achieves better results than other approaches in the literature and reaches an accuracy of more than 94%.
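For reference, the reported figure can be read against the standard definitions of accuracy and precision for a binary fake/real classifier. The sketch below uses invented labels purely to show the arithmetic; it does not reproduce the study's data or results, and treating "fake" as the positive class is an assumption.

```python
def accuracy_and_precision(y_true, y_pred, positive="fake"):
    """Return (accuracy, precision) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision

# Invented toy labels, not data from the study.
y_true = ["fake", "real", "fake", "real", "fake"]
y_pred = ["fake", "real", "real", "real", "fake"]
print(accuracy_and_precision(y_true, y_pred))  # (0.8, 1.0)
```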
 

Keywords [English]

  • Natural Language Processing
  • Text Classification
  • Convolutional Neural Networks
  • Fake News Detection