SENS-HEAD: A Machine Learning Framework for Sensationalism Detection in News Headlines Using Linguistic and Semantic Features

Authors

  • Po-Hsuan Chang
  • Akshi Kumar
  • Saurabh Raj Sangwan

DOI:

https://doi.org/10.37745/bjmas.2022.04909

Abstract

The proliferation of sensationalized news headlines has raised concerns about media integrity, necessitating automated approaches for detecting sensationalism beyond traditional clickbait classification. This study presents SENS-HEAD, a novel dataset comprising over 30,000 annotated headlines labelled for sensational content and emotional arousal. Using Natural Language Processing (NLP), we extract a diverse set of linguistic and semantic features, including sentiment polarity, syntactic complexity, punctuation distribution, and stop word ratio, to systematically distinguish sensational from non-sensational headlines. We implement ensemble learning models (XGBoost, CatBoost, and Random Forest), achieving a balanced F1-score of 0.66. To enhance interpretability, we integrate SHAP (SHapley Additive exPlanations), which reveals key predictive markers such as stop word frequency, headline length, and sentiment extremity. The findings not only advance explainable AI (XAI) for sensationalism detection but also have practical applications in automated journalism, content moderation, and media ethics regulation. By combining computational linguistics with ethical AI, this research delivers actionable insights for policymakers and promotes trustworthy news dissemination in the digital era.
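To make the surface-level features named in the abstract concrete, the following is a minimal sketch of how stop word ratio, headline length, and punctuation counts might be computed for a single headline. It is not the authors' pipeline: the function name `headline_features`, the short `STOP_WORDS` list, and the whitespace tokenisation are all assumptions made for illustration, and sentiment polarity and syntactic complexity are omitted because they would require an external NLP library.

```python
import string

# Tiny illustrative stop word list (an assumption; the abstract does not
# say which stop word list the authors used).
STOP_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "at",
    "for", "with", "is", "are", "was", "were", "be", "this", "that", "you",
    "it", "will",
})


def headline_features(headline: str) -> dict:
    """Compute a few surface features for one headline (hypothetical helper)."""
    # Whitespace tokenisation with surrounding punctuation stripped.
    tokens = [t.strip(string.punctuation).lower() for t in headline.split()]
    tokens = [t for t in tokens if t]  # drop tokens that were pure punctuation
    n_tokens = len(tokens)
    n_stop = sum(1 for t in tokens if t in STOP_WORDS)
    n_punct = sum(1 for ch in headline if ch in string.punctuation)
    return {
        "length_tokens": n_tokens,
        "length_chars": len(headline),
        "stop_word_ratio": n_stop / n_tokens if n_tokens else 0.0,
        "exclamation_count": headline.count("!"),
        "question_count": headline.count("?"),
        "punctuation_ratio": n_punct / len(headline) if headline else 0.0,
    }


if __name__ == "__main__":
    print(headline_features("You will NOT believe what this senator said!"))
```

Feature dictionaries of this kind could be stacked into a table and passed to a tree-based classifier, whose predictions a SHAP explainer can then attribute to individual features; that downstream step is not shown here.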

 

Published

01-06-2025 — Updated on 01-06-2025

How to Cite

Chang, P.-H., Kumar, A., & Sangwan, S. R. (2025). SENS-HEAD: A Machine Learning Framework for Sensationalism Detection in News Headlines Using Linguistic and Semantic Features. British Journal of Multidisciplinary and Advanced Studies, 6(3), 1–31. https://doi.org/10.37745/bjmas.2022.04909