Reduce AI Hallucinations With This Neat Software Trick

Real Hacker Staff, June 14, 2024


To start off, not all RAGs are of the same caliber. The accuracy of the content in the custom database is critical for solid outputs, but that isn’t the only variable. “It’s not just the quality of the content itself,” says Joel Hron, a global head of AI at Thomson Reuters. “It’s the quality of the search, and retrieval of the right content based on the question.” Mastering each step in the process is critical, since one misstep can throw the model completely off.
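To make the "quality of the search" point concrete, here is a minimal sketch of a retrieval step. It is not how any vendor named in this article does it: the function names (`tokenize`, `cosine`, `retrieve`) are made up for illustration, and plain word-count cosine similarity stands in for whatever ranking a production system would use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=2):
    """Return up to k documents ranked by similarity to the question.

    Documents that share no words with the question are dropped, so an
    off-topic question returns an empty list instead of arbitrary text.
    """
    query_vector = Counter(tokenize(question))
    scored = [(cosine(query_vector, Counter(tokenize(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]
```

Whatever `retrieve` returns is what the model gets to read, so a ranking mistake at this step carries straight through to the answer, however accurate the underlying documents are.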

“Any lawyer who’s ever tried to use a natural language search within one of the research engines will see that there are often instances where semantic similarity leads you to completely irrelevant materials,” says Daniel Ho, a Stanford professor and senior fellow at the Institute for Human-Centered AI. Ho’s research into AI legal tools that rely on RAG found a higher rate of mistakes in outputs than the companies building the models found.

Which brings us to the thorniest question in the discussion: how do you define hallucinations within a RAG implementation? Is it only when the chatbot generates a citation-less output and makes up information? Is it also when the tool may overlook relevant data or misinterpret aspects of a citation?

According to Lewis, hallucinations in a RAG system boil down to whether the output is consistent with what’s found by the model during data retrieval. The Stanford research into AI tools for lawyers broadens this definition a bit, examining whether the output is grounded in the provided data as well as whether it’s factually correct, a high bar for legal professionals who are often parsing complicated cases and navigating complex hierarchies of precedent.
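The narrower definition, consistency with what was retrieved, can be approximated in a few lines. The sketch below is a deliberately crude assumption, not the Stanford team's method: it flags answer sentences whose content words mostly do not appear in any retrieved passage. The names `unsupported_sentences` and `STOPWORDS` and the 0.6 threshold are all illustrative choices.

```python
import re

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {
    "a", "an", "the", "of", "for", "is", "was", "are", "were", "in", "on",
    "to", "and", "or", "that", "this", "it", "by", "with", "as", "at", "be",
}


def content_words(text):
    """Return the set of lowercase words in text, minus stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def unsupported_sentences(answer, passages, threshold=0.6):
    """Flag answer sentences with weak lexical support in every passage.

    A sentence is flagged when, for each passage, fewer than `threshold`
    of its content words appear in that passage.
    """
    passage_words = [content_words(p) for p in passages]
    flagged = []
    for sentence in split_sentences(answer):
        words = content_words(sentence)
        if not words:
            continue
        best = max((len(words & pw) / len(words) for pw in passage_words), default=0.0)
        if best < threshold:
            flagged.append(sentence)
    return flagged
```

A check like this only speaks to groundedness. It cannot tell whether a well-supported sentence is factually correct or whether the passage itself is good law, which is the second half of the Stanford definition and the part that still needs a human reader.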

While a RAG system attuned to legal issues is clearly better at answering questions on case law than OpenAI’s ChatGPT or Google’s Gemini, it can still overlook the finer details and make random mistakes. All of the AI experts I spoke with emphasized the continued need for thoughtful, human interaction throughout the process to double-check citations and verify the overall accuracy of the results.

Law is an area where there’s a lot of activity around RAG-based AI tools, but the process’s potential is not limited to a single, white-collar job. “Take any profession or any business. You need to get answers that are anchored on real documents,” says Arredondo. “So, I think RAG is going to become the staple that is used across basically every professional application, at least in the near to mid-term.” Risk-averse executives seem excited about the prospect of using AI tools to better understand their proprietary data, without having to upload sensitive info to a standard, public chatbot.

It’s critical, though, for users to understand the limitations of these tools, and for AI-focused companies to refrain from overpromising the accuracy of their answers. Anyone using an AI tool should still avoid trusting the output entirely, and they should approach its answers with a healthy sense of skepticism even if the answer is improved through RAG.

“Hallucinations are here to stay,” says Ho. “We do not yet have ready ways to really eliminate hallucinations.” Even when RAG reduces the prevalence of errors, human judgment remains paramount. And that’s no lie.
