Datasets for Share. We have released several datasets that we collected, annotated, and used for the R&D of the three automated detectors in the News Verification Browser.
Satirical Fake and Legitimate News Dataset (2016) is available.
Satirical Fake and Legitimate News Dataset, an extended version of the 2016 dataset, will be released shortly.
Related work: Rubin, V. L., Conroy, N. J., Chen, Y., & Cornwell, S. (2016). Fake News or Truth? Using Satirical Cues to Detect Potentially Misleading News. In The Proceedings of the Workshop on Computational Approaches to Deception Detection at the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-CADD2016), June 17-18, San Diego, California.
Falsified and Legitimate Political News Dataset (2016-2017) is available.
Related work: Asubiaro, Toluwase and Rubin, Victoria L. (2018) “Comparing Features of Fabricated and Legitimate Political News in Digital Environments (2016-2017)”, In the Proceedings for the Annual Meeting of the Association for Information Science and Technology (ASIS&T): Building and Sustaining an Ethical Future with Emerging Technology, November 10-14, 2018, Vancouver, Canada.
Native Ads and Editorials Dataset (2018-2019) will be released shortly.
Related work: Cornwell, Sarah L. and Rubin, Victoria L. (2019) “What Am I Reading?: Article-style Native Advertisements in Canadian Newspapers,” In the Proceedings of The Annual Hawaii International Conference on System Sciences (HICSS-52) [Coming out shortly], 7-11 January, 2019, Maui, Hawaii.
Clickbait Dataset recombines two well-known datasets.
Related work: Brogly, C. & Rubin, V. L. (2019) Detecting Clickbait: Here’s How to Do It / Comment détecter les pièges à clic. Canadian Journal of Information and Library Science, 42(3-4):154-175.