Shawn here. A good friend of mine shared a post with me which states:

using ‘according to’ is a prompt modifier that grounds the LLM to ‘reality’.

The concept is that the phrase gives the LLM specific tokens to condition on, so it builds its context around text where that phrase appears. ‘According to’ is a lowest common denominator across scientific papers, Wikipedia articles, and properly referenced blogs and papers online. Basically, using ‘according to’ lets you sort of ‘limit’ the pathing of the LLM to ‘reputable’ sources that carry this ‘footprint’, which should reduce ‘made up’ facts.
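To make the idea concrete, here is a minimal sketch of the modifier as a prompt-building step. The function name with_according_to and the default source are my own placeholders, not something from the post; it only builds the string and does not call any model.

```python
def with_according_to(prompt: str, source: str = "Wikipedia") -> str:
    """Prefix a prompt with an 'According to <source>:' grounding phrase.

    Hypothetical helper for illustration: it only formats text.
    """
    prompt = prompt.strip()
    source = source.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    return f"According to {source}: {prompt}"


# Example: the same question with the modifier applied.
grounded = with_according_to("What causes the seasons on Earth?")
print(grounded)
```

The output string is what you would paste or send to the model in place of the bare question.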

If you are only drawing on things that have references and are ‘well written’, you reduce the chance of grabbing something the LLM learned from non-factual or nonsensical data. They say that their internal tool, which determines whether their training data is being referenced for a given prompt, showed a jump from 5% to 15%.

Going from 5% to 15% is a threefold increase (10 percentage points), which is just massive, and they claim it comes from something as simple as using “according to”.
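The post does not describe how the internal tool works, so as a rough, hypothetical stand-in, here is a toy measure of the same kind of thing: the fraction of word trigrams in a model's output that also appear in a reference text. The names word_ngrams and overlap_fraction, and the choice of trigrams, are my assumptions and not the tool's actual method.

```python
def word_ngrams(text: str, n: int) -> set:
    """Return the set of lowercase word n-grams in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_fraction(output: str, corpus: str, n: int = 3) -> float:
    """Fraction of the output's n-grams that also occur in the corpus.

    Toy stand-in only: a real tool would index a huge corpus, not one string.
    """
    out = word_ngrams(output, n)
    if not out:
        return 0.0
    return len(out & word_ngrams(corpus, n)) / len(out)


# Tiny made-up example texts, purely for illustration.
corpus = "the earth orbits the sun once every year"
output = "the earth orbits the sun in an ellipse"
score = overlap_fraction(output, corpus)
print(f"{score:.0%} of the output's trigrams appear in the corpus")
```

A score like this only says how much of the output overlaps with known text. It does not say whether the output is true.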

I prompted both ChatGPT and Claude.ai with this:

examine and refute or validate this statement as an advanced LLM designed to identify human bias and error in prompt engineering: ‘using “according to:” as a prompt modifier will allow a LLM to better understand the output and reduce the chance of hallucinations; i.e. “according to wikipedia: prompt”

These are the AI responses:

GPT4 –

The statement posits that using “according to:” as a prompt modifier will enable a large language model (LLM) to better understand the output and reduce the chance of hallucinations, such as in the example “according to Wikipedia: prompt.”

Examining this statement as an advanced language model, let's break it down:

Understanding the Output

The claim here seems to be that by specifying a source with “according to:”, the LLM can more accurately generate a response in line with that source. While it's true that providing context to a prompt might help guide the model's response, an LLM does not actually access or retrieve real-time information from the internet or specific sources like Wikipedia.

Reducing Hallucinations

Hallucinations in this context refer to generating information that is incorrect or fabricated. Adding “according to:” might help guide the model's response, but it does not inherently reduce the chance of hallucinations. The model's response would still be based on the general knowledge it has been trained on up to its cut-off date, and it would not verify information against the specific source mentioned.

Conclusion

The statement can be both refuted and validated to some extent:

  • Refuted: Using “according to:” does not enable the model to access specific information from the mentioned source or inherently reduce hallucinations. It does not change the fundamental way the model retrieves or validates information.
  • Validated: The modifier can guide the model to frame its response in a way that might be consistent with the general perspective or style of the mentioned source, provided that this perspective or style is captured in the training data.

In essence, while the prompt modifier might influence the tone or angle of the response, it doesn't guarantee accuracy or alignment with the specific source mentioned. It won't allow the LLM to “better understand” the output in a way that implies direct access or verification against that source.

Claude –

I do not have a strong opinion on whether using “according to:” as a prompt modifier will reduce hallucinations in LLMs, as I do not have access to systematic studies or data comparing hallucination rates with and without this technique. However, I can provide some analysis on potential pros and cons of this approach:

Potential Pros:

  • Specifying a source provides contextual clues that may make the prompt requirements clearer and constrain the output, reducing chances of unpredictable or ungrounded hallucinations.
  • Referencing a knowledge source taps into retrieval of existing information rather than pure generation, which could improve factual accuracy.
  • Changes the task framing from creative generation to summarization/information retrieval, which may elicit a more cautious output style.

Potential Cons:

  • LLMs do not have a clear understanding of citation norms or source reliability. Referencing “Wikipedia” does not guarantee accuracy.
  • The LLM may still hallucinate or misrepresent information even when citing a source. Provenance is not a guarantee of veracity.
  • Over-reliance on this technique could reduce originality and creativity in generation if the LLM becomes too constrained.
  • Accuracy depends on the knowledge source – many prompt “facts” have no clear authoritative source to reference.

Overall, I do not have enough evidence either way to conclusively validate or refute the effectiveness of this technique. Rigorous ablation studies would be needed to determine if it consistently reduces hallucinations. The impact likely depends on prompt design, domain, model capabilities, and desired output characteristics. Providing a reference source may be helpful context in some cases, but is not a panacea for hallucination risks in prompt engineering. As with any prompt design choice, thorough testing across diverse prompts is advisable.
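Shawn again. Since both responses come down to "test it", here is a minimal sketch of what a with/without comparison could look like. Everything in it is an assumption for illustration: compare_prefix is my own name, model stands for whatever function sends a prompt to your LLM and returns text, and score stands for whatever hallucination or grounding measure you trust. The stubs below are fake and only exist so the example runs.

```python
from statistics import mean


def compare_prefix(prompts, model, score, prefix="According to Wikipedia: "):
    """Run each prompt with and without the prefix and average the scores.

    model: callable(prompt) -> output text (hypothetical, supplied by you)
    score: callable(prompt, output) -> float (hypothetical, supplied by you)
    """
    prompts = list(prompts)
    if not prompts:
        raise ValueError("need at least one prompt")
    plain = mean(score(p, model(p)) for p in prompts)
    grounded = mean(score(p, model(prefix + p)) for p in prompts)
    return {"plain": plain, "grounded": grounded, "difference": grounded - plain}


# Fake stand-ins so the sketch runs; replace both with real ones.
def fake_model(prompt):
    return "cited answer" if prompt.startswith("According to") else "answer"


def fake_score(prompt, output):
    return 1.0 if "cited" in output else 0.0


result = compare_prefix(["Who wrote Hamlet?", "What is DNA?"], fake_model, fake_score)
print(result)
```

With real models you would want many prompts across different domains, since a difference on a handful of questions proves very little.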