Identifying Deliberate Misinformation with Text Analytics

This SSHRC-funded research (2015-2019) enabled the creation of the News Verification Browser, a suite of software applications that automatically identify deliberately deceptive or misleading information in online news. The resulting deception detection methodology predicts, for each previously unseen news piece, whether it is more likely to belong to the truthful or the deceptive category.

BACKGROUND

Digital deception is a deliberate effort to create false beliefs or conclusions in technology-mediated environments. Our research project focuses on deliberate misinformation in text-based online news published by mainstream media and citizen journalist websites, news archives, and aggregators. Various deception types and degrees are examined, categorized, and modeled: fake or fabricated news, exaggerated claims, material fact omissions, indirect responses, question-dodging, and subject-changing.

Mistaking deceptive news for authentic reports can have costly negative consequences, such as sudden stock fluctuations or reputation loss. Everyday decision-making, behavior, and mood are influenced by the news we receive. When professional analysts sift through the news, their forecasts and their discovery of facts and patterns depend on the veracity of that news, particularly in “big data” knowledge management and curation areas (specifically, in business intelligence, financial and stock market analysis, or national security and law enforcement). In both lay and professional contexts of news consumption, it is critical to distinguish truthful reports from deceptive ones. However, few news verification mechanisms currently exist, and the sheer volume of information requires novel automated approaches.

IMPORTANCE

News verification methods and tools are timely and beneficial to both lay and professional text-based news consumers. The research significance is four-fold:
1) Automatic analytical methods complement and enhance the notoriously poor human ability to discern information from misinformation.
2) Credibility assessment of digital news sources is improved.
3) The mere awareness of potential digital deception constitutes part of new media literacy and can prevent undesirable consequences.
4) The proposed veracity/deception criterion is also seen as a metric for information quality assessment.

OUTCOMES

A system based on this methodology alerts users to news on a website that is most likely deceptive or misleading (e.g., falsifications, satirical fakes, and clickbait) and prompts them to fact-check further.
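To make the idea of predicting a category for an unseen news piece and raising an alert more concrete, here is a minimal sketch in Python. It is not the project's actual methodology or software: it assumes a toy bag-of-words naive Bayes classifier, a handful of invented training headlines, and hypothetical names (NaiveBayesVeracity, check_article, TRAIN) and an arbitrary alert threshold of 0.5, all chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesVeracity:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def prob_deceptive(self, text):
        """Estimate the probability that the text is in the deceptive class."""
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]
        log_scores = {}
        for c in self.classes:
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[c][tok] + 1) / denom)
            log_scores[c] = score
        # Normalize the log scores into probabilities in a numerically safe way.
        top = max(log_scores.values())
        weights = {c: math.exp(s - top) for c, s in log_scores.items()}
        return weights.get("deceptive", 0.0) / sum(weights.values())


def check_article(model, text, threshold=0.5):
    """Flag the text for further fact-checking if it looks deceptive."""
    p = model.prob_deceptive(text)
    return {"p_deceptive": p, "alert": p >= threshold}


# Invented toy examples, NOT the project's training data.
TRAIN = [
    ("you won't believe this shocking miracle cure doctors hate", "deceptive"),
    ("shocking secret trick melts fat overnight", "deceptive"),
    ("city council approves budget for road repairs", "truthful"),
    ("central bank holds interest rates steady this quarter", "truthful"),
]

model = NaiveBayesVeracity().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])

if __name__ == "__main__":
    for headline in ["shocking miracle trick", "council approves budget"]:
        print(headline, "->", check_article(model, headline))
```

With four training examples the output means little beyond showing the flow: a labeled corpus is used to fit a model, an unseen text receives a probability, and a threshold decides whether to prompt the reader to fact-check. A realistic system would need far more data and richer linguistic features.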

This project resulted in sharable outcomes: “Tools-to-Go” is the software we share via GitHub, and “Data-to-Go” comprises databases of training and testing data available to the scientific research community and the broader interested public. These resources are meant for experimentation, validation of our results, and general perusal. You will find our publications in recent scientific conference proceedings, in academic journals, and in the media, as we disseminate what we have learned in the process.