An Empiric Validation Of Linguistic Features In Machine Learning Models For Fake News Detection

Journal
Data & Knowledge Engineering
Date Issued
2023-08-02
Author(s)
Eduardo Puraivan
René Venegas
Fabián Riquelme
Facultad de Ingeniería  
DOI
10.1016/j.datak.2023.102207
WoS ID
WOS:001059000600001
Abstract
The spread of fake news is a growing problem with a strongly negative social impact. Of the several approaches to detecting fake news, this work focuses on a hybrid one that combines functional linguistic features with machine learning. Several recent works follow this approach, but there are no clear guidelines on which linguistic features are most appropriate or on how to justify their use. Furthermore, many reported classification results are modest compared with recent advances in natural language processing. Our proposal considers 88 features organized into surface information, part of speech, discursive characteristics, and readability indices. On a database of 42 677 news items, we show that the classification results outperform previous work, even outperforming state-of-the-art techniques such as BERT, and reach 99.99% accuracy. A proper selection of linguistic features is crucial for both the interpretability and the performance of the models. In this sense, our proposal contributes to the intentional selection of linguistic features, overcoming current technical issues. We identified 32 features that show differences between the types of news. The results are highly competitive in classification, and the approach is simple to implement and interpret.
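To make the feature families named in the abstract more concrete, the following is a minimal sketch of how a few surface and readability features might be computed from raw text. It is not the authors' implementation: this record does not list the 88 features, so the function name `surface_features`, the chosen features, and the use of the Automated Readability Index (ARI) are assumptions made purely for illustration.

```python
import re


def surface_features(text: str) -> dict:
    """Compute a few illustrative surface and readability features.

    Hypothetical helper: the paper's actual feature set is not given
    in this record, so these features are examples only.
    """
    # Naive sentence split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive tokenization: runs of ASCII letters and digits count as words.
    words = re.findall(r"[A-Za-z0-9]+", text)

    n_words = len(words)
    n_sentences = len(sentences)
    n_chars = sum(len(w) for w in words)

    if n_words == 0 or n_sentences == 0:
        # Avoid division by zero on empty or punctuation-only input.
        return {"n_words": 0, "n_sentences": 0, "avg_word_length": 0.0,
                "type_token_ratio": 0.0, "ari": 0.0}

    # Automated Readability Index (standard published formula).
    ari = 4.71 * (n_chars / n_words) + 0.5 * (n_words / n_sentences) - 21.43

    return {
        "n_words": n_words,
        "n_sentences": n_sentences,
        "avg_word_length": n_chars / n_words,
        "type_token_ratio": len({w.lower() for w in words}) / n_words,
        "ari": ari,
    }


if __name__ == "__main__":
    print(surface_features("The cat sat. The dog ran fast!"))
```

In a hybrid pipeline of the kind the abstract describes, a dictionary like this would be computed per news item and the resulting numeric vectors passed to a conventional classifier. Part-of-speech and discursive features would need a linguistic tagger and are left out of this sketch.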
Subjects

Computer Science, Art...

Computer Science, Inf...

Information Systems A...

OCDE Subjects

Social Sciences::Medi...

Quartile (Date Issued)
Q3
License
Restricted access
