NEWS VERIFICATION PROJECT
Identifying deliberate misinformation with text analytics.

Our research (2015-2018, funded by SSHRC) enables the identification of deliberately deceptive misinformation in institutional mainstream and non-institutional text-based online news. The resulting deception detection methodology makes a prediction about each previously unseen news piece: is it more likely to belong to the truthful or the deceptive category? A system based on this methodology will alert users to the most likely deceptive items in their incoming stream of news and prompt them to fact-check further.

Digital deception is a deliberate effort to create false beliefs or conclusions in technology-mediated environments. Our research project focuses on deliberate misinformation in text-based online news, provided via mainstream media and citizen journalist websites, news archives and aggregators. Various deception types and degrees will be examined, categorized, and modeled: fake or fabricated news, exaggerated claims, material fact omissions, indirect responses, question-dodging, and subject-changing.

Mistaking deceptive news for authentic reports can create costly negative consequences such as sudden stock fluctuations or reputation loss. Everyday decision-making, behavior, and mood are influenced by the news we receive. When professional analysts sift through the news, their forecasts, fact-finding, and pattern discovery depend on the veracity of the news in "big data" knowledge management and curation areas (specifically, in business intelligence, financial and stock market analysis, or national security and law enforcement). In both lay and professional contexts of news consumption, it is critical to distinguish truthful reports from deceptive ones. However, few news verification mechanisms currently exist, and the sheer volume of information requires novel automated approaches.

News verification methods and tools are timely and beneficial to both lay and professional text-based news consumers. The research significance is four-fold:
1) Automatic analytical methods complement and enhance the notoriously poor human ability to discern information from misinformation.
2) Credibility assessment of digital news sources is improved.
3) The mere awareness of potential digital deception constitutes part of new media literacy and can prevent undesirable consequences.
4) The proposed veracity/deception criterion is also seen as a metric for information quality assessment.

Further Readings:

For more formal accounts of our recent work on this topic, please see these publications:

Rubin, V. L. (in press, projected late 2016 - early 2017). Deception Detection and Rumor Debunking for Social Media. [Abstract available]. In L. Sloan & A. Quan-Haase (Eds.), The SAGE Handbook of Social Media Research Methods (Chapter 21, pp. 342-364). Sage.

Rubin, V. L., Conroy, N. J., Chen, Y., & Cornwell, S. (2016). Fake News or Truth? Using Satirical Cues to Detect Potentially Misleading News. In The Proceedings of the Workshop on Computational Approaches to Deception Detection at the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-CADD2016), June 17-18, San Diego, California.

[The associated data set "Access S-n-L News DB2015-2016" was released to the general public, primarily for academic use.]

Chen, Y., Conroy, N. J., & Rubin, V. L. (2015). News in an Online World: The Need for an “Automatic Crap Detector”. In The Proceedings of the Association for Information Science and Technology Annual Meeting (ASIST2015), Nov. 6-10, St. Louis.

Chen, Y., Conroy, N. J., & Rubin, V. L. (2015). Misleading Online Content: Recognizing Clickbait as "False News". ACM Workshop on Multimodal Deception Detection (WMDD 2015), joint with the International Conference on Multimodal Interaction (ICMI2015), November 9, 2015, Seattle, Washington, USA. http://dl.acm.org/citation.cfm?id=2823467

Conroy, N. J., Chen, Y., & Rubin, V. L. (2015). Automatic Deception Detection: Methods for Finding Fake News. In The Proceedings of the Association for Information Science and Technology Annual Meeting (ASIST2015), Nov. 6-10, St. Louis.

Rubin, V.L. (2014) Pragmatic and Cultural Considerations for Deception Detection in Asian Languages. TALIP Perspectives, Guest Editorial Commentary, 13 (2).

Rubin, V.L. & Conroy, N. (2012). Discerning truth from deception: Human judgments & automation efforts. First Monday 17 (3-5).  dx.doi.org/10.5210/fm.v17i3.3933

Rubin, V.L., Conroy, N., & Chen, Y. (2015). Towards News Verification: Deception Detection Methods for News Discourse. The Rapid Screening Technologies, Deception Detection and Credibility Assessment Symposium, Hawaii International Conference on System Sciences (HICSS48), January 2015.

Rubin, V. L., Chen, Y., and Conroy, N. (2015). Deception Detection for News: Three Types of Fakes. In The Proceedings of the Association for Information Science and Technology Annual Meeting (ASIST2015), Nov. 6-10, St. Louis.

DECEPTION DETECTION (2010-2014)
Victoria Rubin and her graduate students developed methods to distinguish truth from deception in textual data. We used rhetorical structure theory (RST) as the analytic framework to identify systematic differences between deceptive and truthful stories in terms of their coherence and structure. A vector space model (VSM) assesses each story's position in multidimensional RST space with respect to its distance from the truthful and deceptive centers, using those distances as measures of the story's level of deception and truthfulness. The RST-VSM approach demonstrates that discourse structure analysis is a significant method for automated deception detection and an effective complement to lexico-semantic analysis. The potential lies in developing novel discourse-based tools to alert information users to potential deception in computer-mediated texts and social media.
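The RST-VSM idea described above can be illustrated with a minimal sketch: represent each story as a normalized frequency vector over RST relation labels, compute centroids for the truthful and deceptive training sets, and label a new story by its nearer centroid, with the two distances serving as truthfulness/deception scores. The relation labels and all counts below are invented for illustration; this is not the project's actual feature set or implementation.

```python
import math
from collections import Counter

# Hypothetical RST relation labels used as vector dimensions (illustrative only).
RST_RELATIONS = ["evidence", "attribution", "elaboration", "contrast", "condition"]

def story_vector(relation_counts):
    """Map a story's RST relation counts to a fixed-order relative-frequency vector."""
    total = sum(relation_counts.values()) or 1
    return [relation_counts.get(r, 0) / total for r in RST_RELATIONS]

def centroid(vectors):
    """Component-wise mean of a set of story vectors (the class 'center')."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(RST_RELATIONS))]

def distance(a, b):
    """Euclidean distance between two points in the RST vector space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(relation_counts, truthful_center, deceptive_center):
    """Label a story by its nearer centroid; the distances act as scores."""
    v = story_vector(relation_counts)
    d_true = distance(v, truthful_center)
    d_fake = distance(v, deceptive_center)
    label = "truthful" if d_true < d_fake else "deceptive"
    return label, d_true, d_fake

# Toy labeled training data: RST relation counts per story (invented).
truthful_stories = [Counter(evidence=4, attribution=3, elaboration=2),
                    Counter(evidence=5, attribution=2, elaboration=3)]
deceptive_stories = [Counter(elaboration=5, contrast=1),
                     Counter(elaboration=4, condition=2)]

t_center = centroid([story_vector(s) for s in truthful_stories])
d_center = centroid([story_vector(s) for s in deceptive_stories])

# A new, unseen story rich in evidence and attribution relations.
label, d_true, d_fake = classify(Counter(evidence=3, attribution=2),
                                 t_center, d_center)
```

In this toy setup the evidence-heavy test story lands nearer the truthful centroid, so `label` comes out `"truthful"`; a real system would derive the relation counts from an RST parse of each news item.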