Reduce AI Hallucinations With This Neat Software Trick – WIRED

To start off, not all RAGs are of the same caliber. The accuracy of the content in the custom database is critical for solid outputs, but that isn't the only variable. "It's not just the quality of the content itself," says Joel Hron, a global head of AI at Thomson Reuters. "It's the quality of the search, and retrieval of the right content based on the question." Mastering each step in the process is critical since one misstep can throw the model completely off.

"Any lawyer who's ever tried to use a natural language search within one of the research engines will see that there are often instances where semantic similarity leads you to completely irrelevant materials," says Daniel Ho, a Stanford professor and senior fellow at the Institute for Human-Centered AI. Ho's research into AI legal tools that rely on RAG found a higher rate of mistakes in outputs than the companies building the models found.
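To make the retrieval step concrete, here is a minimal sketch of similarity-based search. It is a toy stand-in, not how any of the tools mentioned here work: it ranks documents by bag-of-words cosine similarity rather than learned embeddings, and the names `tokenize`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return the top_k (score, document) pairs, highest score first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical two-document "database" for illustration only.
docs = [
    "The court held that the contract was void.",
    "Bananas are rich in potassium.",
]
results = retrieve("Was the contract void?", docs, top_k=2)
```

Whatever the ranking function returns is what the model gets to read, so a scoring scheme that rewards surface similarity over legal relevance passes its mistakes straight into the answer.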

Which brings us to the thorniest question in the discussion: How do you define hallucinations within a RAG implementation? Is it only when the chatbot generates a citation-less output and makes up information? Is it also when the tool may overlook relevant data or misinterpret aspects of a citation?

According to Lewis, hallucinations in a RAG system boil down to whether the output is consistent with what's found by the model during data retrieval. The Stanford research into AI tools for lawyers, though, broadens this definition a bit by examining whether the output is grounded in the provided data as well as whether it's factually correct, a high bar for legal professionals who are often parsing complicated cases and navigating complex hierarchies of precedent.
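The narrower definition, consistency with the retrieved data, can be approximated in code. The sketch below is a crude lexical proxy and an assumption of this rewrite, not a method from Lewis or the Stanford study: it flags answer sentences whose words mostly do not appear in the retrieved passages. The function name `unsupported_sentences` and the 0.5 threshold are hypothetical, and the check says nothing about whether a supported sentence is factually correct.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric word tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unsupported_sentences(answer, passages, threshold=0.5):
    """Flag answer sentences with little word overlap with the passages.

    A sentence is flagged when fewer than `threshold` of its distinct
    words occur anywhere in the retrieved passages.
    """
    source_tokens = set()
    for passage in passages:
        source_tokens |= tokenize(passage)

    flagged = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        tokens = tokenize(sentence)
        if not tokens:
            continue
        support = len(tokens & source_tokens) / len(tokens)
        if support < threshold:
            flagged.append(sentence)
    return flagged


# Hypothetical retrieved passage and model answer for illustration only.
passages = ["The court held that the contract was void."]
answer = "The contract was void. The judge awarded damages of ten million dollars."
flagged = unsupported_sentences(answer, passages)
```

Here the second sentence is flagged because almost none of its words come from the retrieved passage. A check like this can only point a human reviewer at suspect sentences; it cannot catch a misread citation that reuses the source's own vocabulary.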

While a RAG system attuned to legal issues is clearly better at answering questions on case law than OpenAI's ChatGPT or Google's Gemini, it can still overlook the finer details and make random mistakes. All of the AI experts I spoke with emphasized the continued need for thoughtful, human interaction throughout the process to double-check citations and verify the overall accuracy of the results.

Law is an area where there's a lot of activity around RAG-based AI tools, but the process's potential is not limited to a single white-collar job. "Take any profession or any business. You need to get answers that are anchored on real documents," says Arredondo. "So, I think RAG is going to become the staple that is used across basically every professional application, at least in the near to mid-term." Risk-averse executives seem excited about the prospect of using AI tools to better understand their proprietary data without having to upload sensitive info to a standard, public chatbot.

It's critical, though, for users to understand the limitations of these tools, and for AI-focused companies to refrain from overpromising the accuracy of their answers. Anyone using an AI tool should still avoid trusting the output entirely, and they should approach its answers with a healthy sense of skepticism even if the answer is improved through RAG.

"Hallucinations are here to stay," says Ho. "We do not yet have ready ways to really eliminate hallucinations." Even when RAG reduces the prevalence of errors, human judgment reigns paramount. And that's no lie.
