Cory McNeley is a Managing Director at UHY Consulting.
As AI systems start to dominate the marketplace, concerns regarding accuracy and precision are becoming more prevalent. The convenience of these systems is undeniable: They can answer complex questions in minutes, save us time and help us create content. But what if the information you’re relying on isn’t just wrong—it’s completely fabricated? AI models are designed to sound right even when they’re shooting from the hip, so they can be extremely convincing. They often present information to justify their position, making it difficult to distinguish fact from fiction. This raises another question: Can you trust AI with complex, high-stakes tasks?
What causes hallucinations?
These errors, known in the industry as hallucinations, are often attributed to knowledge gaps caused by the parameters and information loaded into the system. What's often overlooked is that AI is also designed to keep you coming back for more by, in short, making you happy.
In the case of knowledge gaps, you can train AI on vast numbers of images to successfully identify the make and model of a vehicle, but it may still label other objects as vehicles because it lacks context. In the case of making its users happy, if the user doesn't point out that the returned information is wrong, the AI will not flag the weakness of its own results and, in some cases, will even deny that it made a mistake.
AI is also capable of generating extremely complex, detailed and convincing lies. OpenAI released a report that essentially said that when AI is punished for lying, it learns to lie better. AI systems fill knowledge gaps by predicting plausible information based on patterns. The takeaway? While hallucinations may seem like lies, they're simply gaps in the model's data or the expression of unintended sub-objectives inherent in all AI.
Recently, I tested the advanced deep research capabilities of OpenAI to validate some information for an article I was working on. I prompted the model to provide trends and citations "on how AI is transforming factories into Industry 4.0." After approximately 15 minutes, I received a collegiate-level report that detailed trends and cited case studies from various consulting firms and manufacturers I was familiar with. Overall, it was a highly engaging read that caught my attention. The statistics seemed sound, the application seemed relevant and the quotes were ideal, as if they were tailored to my request. The problem: My deep research contained heavily fabricated facts, citations that linked to irrelevant sources and a completely fabricated case study that, coincidentally, was about a company that is my client.
What is the first step in protecting your business from hallucinations?
First and foremost, it's important to verify the information presented by any AI model. Be critical when dealing with topics like finance, healthcare or anything else that is materially impactful. If you're not familiar with the topic, always cross-reference multiple sources, and have the AI review its own information to ensure it's correct. I often tell ChatGPT to review its output as an overly critical boss. Oftentimes, you'll find that the system will identify some of the initial gaps.
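For readers who work with these models programmatically, here is a minimal sketch of that review pass using the openai Python client. The model name, prompts and topic are illustrative assumptions, not a prescription.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# First pass: generate the draft answer (topic is an invented example).
draft = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": "Summarize key Industry 4.0 trends in manufacturing, with sources."}],
).choices[0].message.content

# Second pass: have the model review its own output as an overly critical boss.
review = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are an overly critical boss. Flag every claim, statistic or citation in the text that you cannot verify, and explain why."},
        {"role": "user", "content": draft},
    ],
).choices[0].message.content

print(review)  # read the critique before trusting the draft
```

The second pass won't catch everything, but it surfaces weak claims cheaply before a human fact-check.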
How can you improve your prompts to mitigate hallucinations?
When using AI, be very specific. Break complex prompts down into smaller prompts that build on each other so that you can refine results as they are generated. Tell your AI engine what you want to see, how you want to see it and, more importantly, what you do not want to see. Provide basic guardrails to ensure the AI model is looking at the correct information when formulating its response.
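As a rough illustration of that chaining, the sketch below feeds each answer into the next, smaller prompt. The prompts, model name and helper function are invented examples.

```python
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, context: str = "") -> str:
    """Send one small, focused prompt, optionally building on earlier output."""
    messages = []
    if context:
        # Guardrail: restrict the model to the output of the previous step.
        messages.append({"role": "system", "content": f"Base your answer only on this context:\n{context}"})
    messages.append({"role": "user", "content": prompt})
    return client.chat.completions.create(model="gpt-4o", messages=messages).choices[0].message.content

# One complex request, broken into smaller prompts that build on each other.
trends = ask("List three documented trends in factory automation. Do not speculate.")
detail = ask("Expand on the first trend only, in two paragraphs.", context=trends)
final = ask("Rewrite for a non-technical executive. Do not add new claims.", context=detail)
```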
There are several different types of prompting methods, but I have found the PICKLE method is one of the more effective and easy-to-remember approaches (a worked example follows the list):
• Problem/Purpose: Define the challenge, goal or intent behind your prompt. In other words, why are you asking this?
• Information: Provide necessary context, background or data. What does the AI (or person) need to know?
• Constraints: Set boundaries such as time, tone, format, tools, style or rules to follow.
• Knowledge Required: Specify any expertise, frameworks or reference points to guide the response.
• Line Of Thought: Suggest how you want the reasoning or structure to flow.
• Expected Output: Describe the deliverable: format, detail level, etc. What should the result look like?
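To make the method concrete, here is one hypothetical PICKLE-structured prompt assembled in Python. Every field value is an invented example; in practice the whole string is sent as a single user message.

```python
# A hypothetical PICKLE-structured prompt; all field values are invented examples.
pickle_prompt = """
Problem/Purpose: I need to brief our board on AI adoption risks in manufacturing.
Information: We are a mid-sized auto-parts maker beginning an ERP upgrade.
Constraints: One page, formal tone, no vendor names, cite only sources you can name.
Knowledge Required: Familiarity with Industry 4.0 concepts and basic internal controls.
Line Of Thought: Start with risks, then mitigations, then open questions.
Expected Output: A one-page memo with three headed sections and a short summary.
""".strip()
```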
Which AI model should you use?
When possible, use an AI model designed for specific tasks. Choosing the right model for the job creates guardrails around the data it can access. Most general-purpose AI models are trained on enormous datasets that are publicly available and that cover everything you can think of, including less-than-factual data sources like Reddit and X (formerly known as Twitter). The sheer size of these systems makes them versatile, but it also makes them prone to mistakes. For example, if you're using AI in an accounting or enterprise resource planning (ERP) system, it's a good idea to use a specialized model trained on internal controls and accounting standards. Make sure the model has incorporated the accounting requirements of your organization.
When using your own private instance of a language model, augment it with curated knowledge items and prevent the model from using any external data. You are now giving the AI model a controlled basis on which to formulate its answers. This won't guarantee your system never hallucinates, but it does minimize how often hallucinations occur. In cases where word usage is highly specific and purposeful, even more context is needed to help the AI stay on track. Small differences in phrasing, such as "before," "after," "under," "over" or "not until," also call for special instruction because these models ingest text as pieces of words and chunks of sentences.
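One common way to implement that restriction is a retrieval-style prompt that supplies the approved knowledge items and instructs the model to answer only from them. A minimal sketch follows; the policy text, question and model name are invented, and a real private instance would point at your own deployment rather than the public API.

```python
from openai import OpenAI

client = OpenAI()

# Invented internal knowledge items; in practice, these come from your own repository.
knowledge_items = [
    "Policy 4.2: Purchase orders over $50,000 require two approvals.",
    "Policy 7.1: Quarter-end close must finish within five business days.",
]

question = "How many approvals does a $75,000 purchase order need?"

answer = client.chat.completions.create(
    model="gpt-4o",  # illustrative; substitute your private deployment
    messages=[
        {"role": "system", "content": (
            "Answer using ONLY the knowledge items below. "
            "If the answer is not in them, say you do not know.\n\n" + "\n".join(knowledge_items)
        )},
        {"role": "user", "content": question},
    ],
).choices[0].message.content

print(answer)
```

Telling the model it may say "I do not know" is itself a guardrail: it gives the system a sanctioned alternative to inventing an answer.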
With each step forward, we will have to learn to guide these models to get the most out of them and to know when they are wrong. When asked for proof, AI can invent it without pause. My advice is to trust but verify—especially when the machine is confident.