‘I’m unable to’: how generative AI chatbots respond when asked for the latest news

We test how well ChatGPT and Bard (now Gemini) provide the latest news to users who ask for the top five headlines from specific outlets. Based on analysis of 4,500 headline requests (in 900 outputs) in January/February 2024, we find that (i) ChatGPT returned non-news output 52–54% of the time (an ‘...


Bibliographic details
Main Authors:	Fletcher, R; Adami, M; Nielsen, RK
Format:	Report
Language:	English
Published:	Reuters Institute for the Study of Journalism 2024

_version_	1826312902680772608
author	Fletcher, R; Adami, M; Nielsen, RK
author_facet	Fletcher, R; Adami, M; Nielsen, RK
author_sort	Fletcher, R
collection	OXFORD
description	We test how well ChatGPT and Bard (now Gemini) provide the latest news to users who ask for the top five headlines from specific outlets. Based on analysis of 4,500 headline requests (in 900 outputs) in January/February 2024, we find that (i) ChatGPT returned non-news output 52–54% of the time (an ‘I’m unable to’ message), while Bard did this 95% of the time. (ii) For ChatGPT, just 8–10% of requests returned headlines referring to top stories on the outlet’s homepage, and (iii) 30% returned headlines that referred to real, existing stories that were not among the top stories. (iv) 3% of ChatGPT outputs contained headlines that referred to real stories that could only be found on the website of a different outlet and 3% were so vague and ambiguous that they could not be matched to existing stories – both of which could be considered a form of hallucination.
first_indexed	2024-09-25T04:02:27Z
format	Report
id	oxford-uuid:d272920d-a6eb-41ad-867b-24c583bf0a60
institution	University of Oxford
language	English
last_indexed	2024-09-25T04:02:27Z
publishDate	2024
publisher	Reuters Institute for the Study of Journalism
record_format	dspace
spelling	oxford-uuid:d272920d-a6eb-41ad-867b-24c583bf0a60 | 2024-05-01T15:40:41Z | ‘I’m unable to’: how generative AI chatbots respond when asked for the latest news | Report | http://purl.org/coar/resource_type/c_93fc | uuid:d272920d-a6eb-41ad-867b-24c583bf0a60 | English | Symplectic Elements | Reuters Institute for the Study of Journalism | 2024 | Fletcher, R | Adami, M | Nielsen, RK | We test how well ChatGPT and Bard (now Gemini) provide the latest news to users who ask for the top five headlines from specific outlets. Based on analysis of 4,500 headline requests (in 900 outputs) in January/February 2024, we find that (i) ChatGPT returned non-news output 52–54% of the time (an ‘I’m unable to’ message), while Bard did this 95% of the time. (ii) For ChatGPT, just 8–10% of requests returned headlines referring to top stories on the outlet’s homepage, and (iii) 30% returned headlines that referred to real, existing stories that were not among the top stories. (iv) 3% of ChatGPT outputs contained headlines that referred to real stories that could only be found on the website of a different outlet and 3% were so vague and ambiguous that they could not be matched to existing stories – both of which could be considered a form of hallucination.
title	‘I’m unable to’: how generative AI chatbots respond when asked for the latest news

