AI search engines often make up citations and answers: Study

AI search engines and chatbots often provide wrong answers and make up article citations, according to a new study from Columbia Journalism Review.

Why we care. AI search tools have ramped up the scraping of your content so they can serve answers to their users, often resulting in no clicks to your website. Also, click-through rates from AI search and chatbots are much lower than those from Google Search, according to a separate, unrelated study. But hallucinated citations make an already bad situation even worse.

By the numbers. More than half of the responses from Gemini and Grok 3 cited fabricated or broken URLs that led to error pages. Also, according to the study:

  • Overall, chatbots provided incorrect answers to more than 60% of queries:
    • Grok 3 (the highest error rate) answered 94% of the queries incorrectly.
    • Gemini provided a completely correct response only once in 10 attempts.
    • Perplexity, which had the lowest error rate, answered 37% of queries incorrectly.

What they’re saying. The study authors (Klaudia Jaźwińska and Aisvarya Chandrasekar), who also noted that “multiple chatbots seemed to bypass Robot Exclusion Protocol preferences,” summed up this way:

“The findings of this study align closely with those outlined in our previous ChatGPT study, published in November 2024, which revealed consistent patterns across chatbots: confident presentations of incorrect information, misleading attributions to syndicated content, and inconsistent information retrieval practices. Critics of generative search like Chirag Shah and Emily M. Bender have raised substantive concerns about using large language models for search, noting that they ‘take away transparency and user agency, further amplify the problems associated with bias in [information access] systems, and often provide ungrounded and/or toxic answers that may go unchecked by a typical user.’”

About the comparison. This analysis of 1,600 queries compared the ability of generative AI tools (ChatGPT search, Perplexity, Perplexity Pro, DeepSeek search, Microsoft Copilot, xAI’s Grok-2 and Grok-3 search, and Google Gemini) to identify an article’s headline, original publisher, publication date, and URL, based on direct excerpts of 10 articles chosen at random from 20 publishers.

The study. AI Search Has A Citation Problem
