Guarding the Truth: Enhancing Fake Headline Detection using Transformer-Based Encoding and Deep Learning Methods

Mohammed Alghobiri

Abstract


Identifying fake news headlines is important for combating misinformation and remains an active research area in Natural Language Processing (NLP). Traditional text encodings such as count vectorization and Term Frequency-Inverse Document Frequency (TF-IDF) capture little context or semantic information, which leads to suboptimal performance on complex NLP tasks. This research introduces an approach that uses sentence transformers to produce sentence embeddings that preserve both the semantic meaning and the contextual information of the text. We aim to identify false news headlines using deep learning models (LSTM, BiLSTM, BERT, DistilBERT, and RoBERTa) in conjunction with several encodings (TF-IDF, GloVe, fastText, and sentence transformers), complemented by machine learning algorithms such as naïve Bayes, decision trees, random forest, and AdaBoost. The experiments, conducted on the Artificial Intelligence (AI) Open news headlines dataset, show that sentence transformers consistently outperform conventional encodings in both accuracy and F1 score. Among the deep learning models, RoBERTa achieved the highest accuracy, reaching 95.48% with GloVe embeddings and 96.17% with sentence transformers. These empirical findings indicate that the proposed approach outperforms existing methods and offer insights for identifying fake headlines in news articles.
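The limitation of TF-IDF noted in the abstract can be illustrated with a minimal sketch. It is not the paper's pipeline: the three headlines are invented for illustration, tfidf_vectors is a hypothetical helper, and the smoothed IDF formula is one common variant chosen here as an assumption. The point is only that a bag-of-words encoding ignores word order, so two headlines with opposite meanings can receive identical vectors.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) with weights tf * (ln((1 + N) / (1 + df)) + 1).

    Hypothetical helper for illustration; the smoothing is an assumption,
    not something specified by the paper.
    """
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(tokenized)
    # Document frequency: number of headlines containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)  # raw term counts; missing terms count as 0
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors


# Invented example headlines (not taken from the dataset used in the paper).
headlines = [
    "dog bites man",
    "man bites dog",
    "senate passes budget",
]
vocab, vectors = tfidf_vectors(headlines)

# The first two headlines contain the same words in a different order,
# so their TF-IDF vectors are identical even though the meaning differs.
order_is_lost = vectors[0] == vectors[1]
print(order_is_lost)
```

Sentence-level embeddings, as described in the abstract, are intended to avoid this by encoding the headline as a whole rather than as an unordered set of term weights.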






ISSN: 2307-8162