Can we trust AI citations?

Paul Newbury, 6 March 2023

There’s an age-old adage that ‘on the internet, nobody knows you’re a dog.’ It’s a brilliant, if slightly abstract, summation of everything that makes any information on the internet live in this unique space where it can be potentially top-tier professional insight, or absolute nonsense.

In the age of disinformation and ‘Fake News’, how do you trust the source of the information that you are being provided with?

Google’s initial solution to this issue was backlinks: essentially, the more backlinks a website or blog received, the more trustworthy it was deemed. Google has since refined this algorithm to weight backlinks, so that a link from a trusted source like the BBC is worth more than one from a newly created website. It has also gone further, pushing EEAT (Experience, Expertise, Authoritativeness, and Trustworthiness), Helpful Content updates and its search quality rater guidelines, which help fine-tune its algorithms. Google now advises:

“Consider the extent to which the content creator has the necessary first-hand or life experience for the topic. Many types of pages are trustworthy and achieve their purpose well when created by people with a wealth of personal experience. For example, which would you trust: a product review from someone who has personally used the product or a “review” by someone who has not?”

Which raises the question: why should we trust an AI that has no lived experience?

The importance of citations

We’re taught from an early age to always provide the source of our information. This lets others critically analyse how you have come to your conclusion and which sources you’ve used to back up your arguments. It lets us ask: why have you used a study from 1999 to back up a point about the modern web, or why have you chosen to trust a website that has no sources of its own over a peer-reviewed paper that says the opposite?

It also avoids the issue of plagiarism and gives credit to the source of those ideas. If I’ve spent time crafting a study or working on an informational piece, I’d hate for someone to come along, copy my work, change a few words, and call it their original thoughts.

How does this idea currently apply to search engines? Let’s say you want to find the ‘best pizza places in Edinburgh’. A simple Google search will return 12.3M results. That’s a lot of data!

If you trust Google Reviews, then a click onto Google Maps allows you to explore the local area. In most cases, each review is written by a real person who has experienced the restaurant and appraised its range of toppings. Similarly, aggregated review sites have combined the data of millions of reviews to pull together a list of the top pizza places.

Those with a more refined pizza palate may only trust independent critics in trusted sources such as newspapers, or sites such as VisitScotland, which even goes as far as to offer the places recommended by an actual Italian! Following these are independent bloggers who write about the places they’ve visited and why, in their opinion, each comes recommended.

Pizza choice is of course subjective, but each site has either visited these places or cited reviewers who have. Would you trust an AI alternative that has never physically visited Edinburgh and is incapable of eating pizza?

No citations needed?

To compare, we posed the same question to ChatGPT:

Great! We have a list of places that it recommends and a small comment about each. We haven’t needed to wade through a bunch of two-thousand-word articles about the intricacies of judging pizza, and we have a list in seconds. However, how is this data decided upon? We don’t actually know where this AI model got its inspiration, so let’s ask it.

With this response, we can see that we’re being asked to trust ChatGPT’s ‘knowledge of Edinburgh’s dining scene’, which is essentially a judgement on whether the AI language model has done a good job parsing the millions of data points without direct sources and without the ‘lived experience.’

Trusting ChatGPT and ending up with a dodgy pizza won’t cause untold harm (and, for the record, our Edinburgh-based staff have tried many of those pizzerias and can attest to their quality). However, Google’s AI language model, ‘Bard’, took a similar approach of not providing sources in a livestreamed AI test, and its technically incorrect answer about the James Webb telescope has been reported to have cost Google $100bn in market value.

Attention-grabbing headlines aside, if they had provided the source of this information, would it have helped them save face?

From an SEO point of view, we accept that instant answers and featured snippets can occasionally pull in inaccurate information. For example, simply Google ‘how tall is Peppa Pig?’ and enjoy a laugh. However, the search results then provide multiple answers below this, each with the source of this information, so we can look at alternative sources. Google’s Search algorithms have placed more and more emphasis on useful content, EEAT, etc., but are we expected to just trust their AI implicitly without some sort of evidence?

Coming back to the pizza example, if I’m now running the pizza restaurant and I want to rank in these top pizza lists on search, I know that I need reviews from customers posted across certain aggregator sites, I need to invite independent reviewers to review my restaurant and I need to improve my online presence by creating content that shows my expertise in pizza-making.

If Google’s Bard is the future of search, then without sources, how do I know how to end up on these lists? Additionally, as a popular pizza reviewer, what is the point in creating a top ten list if Bard just takes my answer without crediting my thoughts or driving additional traffic to my blog?

Some German publishers are already asking for royalties for answers provided by AI. As many SEO industry experts (such as Danny Goodwin and Barry Schwartz) have pointed out, this just turns Google into a scraper site and potentially creates a recursive problem: with no incentive to create new content, nothing new is produced for the AI to be trained on.

Citations, please

What’s the alternative future to Google’s approach? We’ve seen multiple attempts to demonstrate a viable alternative from You and Bing’s AI-powered search engines, which provide AI-generated answers alongside the data sources the information is taken from, acting as citations. This provides more transparency and visibility on which sites are being used as data sources.

In the pizza example, this benefits the user, as they can critically analyse and appraise the sources of information or click off the AI result onto other websites to find out more on the individual reasons why these particular restaurants were picked.

It benefits the pizza reviewers, as they once again have an incentive to provide the best-researched answers and can still expect some traffic if their reviews are cited. Additionally, it benefits the pizza restaurants, as they can see why their restaurant was mentioned or excluded from the results and can go back to improving their online presence through the methods mentioned previously. This future of AI-powered search is an exciting prospect: while you may not be able to provide all the answers on your own site for your customers’ niche needs, the response will take the expert information you have provided into account.

SEO, at its heart, is about building authority and trust of your brand or website over time. Yard are strong believers in showcasing a brand’s expertise through expert content and Digital PR campaigns. This process helps brands appear across a wide range of search results and if Bing and You are the future of search then it will help them show as citations for topics where the brands are the subject matter experts.

However, if Google continues to go down the path of not providing citations, then it will set a dangerous precedent of users trusting data without sources and website owners having little to no incentive to provide useful content. We’ll be keeping a close eye on these developments and advising our clients on the best path forward.
