Tactics to Debug Generative AI Prompts
When was the last time you got the result you wanted on your first prompt?
Assuming it wasn’t a simple question like “What’s the name of the river in London?”, it was probably some time ago.
After all, GPTs have a special talent for delivering results that are almost but not quite right, especially if the prompt is complicated. Pushing a result from 90% correct to 95%, and then to 100%, gets extremely challenging and requires techniques like prompt chains or prompt chunking.
At AiSDR, whenever we run into a situation where a prompt’s not generating what we want, these are some tactics we turn to.
TLDR:
- The goal: Get a generative AI prompt to return the desired result
- The tactic: Proceed through different methods for debugging AI prompts
- The result: AI prompts generate the desired result
Tactic 1: Give more examples of the expected output
Providing examples of what you want to see is the easiest way to get better prompt results.
That’s because examples act like a guide: the GPT can produce results by replicating the format it was shown. Examples also reduce the chances of the GPT misinterpreting your request.
For example, imagine you ask a GPT to “list the features of a new smartphone” and you get this result:
New smartphones have a high-resolution camera, long battery life, and fast processor.
Let’s say you want a bulleted list instead. You can let the GPT know what you want by adjusting the prompt accordingly:
List the features of a new smartphone. Here’s an example of the style I want:
New smartphones have:
- [Feature 1]
- [Feature 2]
- [Feature 3]
After re-entering this prompt into a GPT, you should get a result like this:
New smartphones have:
- High-resolution camera
- Long battery life
- Fast processor
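If you build prompts in code, the same idea can be sketched as a small helper that attaches the example to the task. This is only a minimal illustration: the function name build_prompt_with_example is made up for this post, and sending the prompt to a model is left out because that depends on the API you use.

```python
def build_prompt_with_example(task: str, example_lines: list[str]) -> str:
    """Append a format example to a task so the model can mirror its layout."""
    example = "\n".join(example_lines)
    return f"{task} Here's an example of the style I want:\n{example}"


# Hypothetical usage: the placeholders show the layout, not the content.
prompt = build_prompt_with_example(
    "List the features of a new smartphone.",
    ["New smartphones have:", "- [Feature 1]", "- [Feature 2]", "- [Feature 3]"],
)
print(prompt)
```

Keeping the example separate from the task makes it easy to swap in a different layout when the first one doesn't produce the format you want.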
Tactic 2: Ask the GPT to return the result in JSON
Asking a GPT to provide results in JSON can make generative AI results more consistent and easier to work with in certain situations.
Here are a few reasons why:
- Improved structure – When you request outputs in JSON, you’re asking for a structured format. This can help organize information consistently, reduce ambiguity, and make it easier to parse results programmatically.
- Specificity – JSON helps the GPT focus on providing information clearly and concisely, reducing the likelihood of irrelevant details.
- Validation – JSON’s structure makes it easier to validate whether responses match your expectations. If the GPT returns invalid JSON, you can quickly isolate what went wrong.
- Reusability – JSON-formatted responses can be easily recycled for other data processing purposes.
Requesting outputs in JSON won’t necessarily improve output quality, but it does simplify how you frame a request, which can come in handy when working with structured data or implementing prompt caching.
The downside of this tactic is that if you’re not familiar with JSON, it could take some getting used to.
No time to learn like the present? 🤓
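To make the validation point concrete, here's a rough sketch of how a JSON reply could be checked in Python. The "product" and "features" keys are an assumed schema invented for this example, and sample_reply is a stand-in for whatever your model actually returns.

```python
import json

# Assumed schema for this example only; use the keys your prompt asks for.
REQUIRED_KEYS = {"product", "features"}


def parse_features(raw_reply: str) -> list[str]:
    """Parse a model reply as JSON and check it has the expected shape."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as err:
        # Invalid JSON is caught here, so you know exactly what went wrong.
        raise ValueError(f"Reply is not valid JSON: {err}") from err
    if not isinstance(data, dict):
        raise ValueError("Reply must be a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"Reply is missing keys: {sorted(missing)}")
    features = data["features"]
    if not isinstance(features, list) or not all(isinstance(f, str) for f in features):
        raise ValueError("'features' must be a list of strings")
    return features


# Stand-in for a model reply.
sample_reply = (
    '{"product": "smartphone", '
    '"features": ["High-resolution camera", "Long battery life", "Fast processor"]}'
)
print(parse_features(sample_reply))
```

When the check fails, the error message tells you whether the reply was malformed or simply missing a field, which is a useful clue for adjusting the prompt.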
Tactic 3: Replace errors with placeholders for future filtering
Imagine you’ve got a certain pesky error that’s ruining your prompts, such as a tendency to add double quotes (“ ”).
Your first reaction would probably be to include this instruction:
DON’T use double quotes.
This might or might not work.
If it does, great! Then you can move on 🙂
But if it doesn’t, you can try this approach:
Replace double quotes with “N/A” when you encounter them.
This tactic allows you to filter out and delete the placeholder programmatically or with a simple “Find and replace” function.
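The programmatic filter can be as small as one string replacement. The sketch below assumes the model followed the instruction and wrote "N/A" wherever it would have used a double quote; strip_placeholder and sample_output are hypothetical names used only for illustration.

```python
PLACEHOLDER = "N/A"  # the placeholder named in the prompt instruction


def strip_placeholder(text: str, placeholder: str = PLACEHOLDER) -> str:
    """Delete every occurrence of the placeholder from the model output."""
    return text.replace(placeholder, "")


# Stand-in for a model output that used the placeholder instead of quotes.
sample_output = "Our tool is N/Afast and reliableN/A according to users."
print(strip_placeholder(sample_output))
```

One caveat with this sketch: if "N/A" can legitimately appear in your output, pick a placeholder that never occurs in normal text so the filter doesn't delete real content.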
Tactic 4: Structure the prompt to carry out multiple steps
Multi-step instructions are a helpful way to structure prompts so they complete and report several tasks in one output.
Imagine your original prompt was something like this:
Provide new smartphone features, prices, and reasons why they’re important.
Restructuring the prompt so that it outlines multiple steps can improve output quality. An example of this sort of prompt would be:
List the features of a new smartphone, explain why each feature is important, and estimate the price of each feature.
When creating complex prompts, multi-step instructions allow the AI to approach the task methodically and essentially “check the boxes”.
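If your prompts are assembled in code, one way to keep the steps explicit is to store them as a list and number them automatically. This is a sketch under assumptions: build_multistep_prompt and the exact wording of the wrapper sentence are invented for this example and are not a required format.

```python
def build_multistep_prompt(subject: str, steps: list[str]) -> str:
    """Turn a list of tasks into one prompt with explicitly numbered steps."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"Complete the following steps for {subject}, in order:\n{numbered}"


# Hypothetical usage based on the smartphone prompt.
multistep_prompt = build_multistep_prompt(
    "a new smartphone",
    [
        "List the features.",
        "Explain why each feature is important.",
        "Estimate the price of each feature.",
    ],
)
print(multistep_prompt)
```

Keeping the steps in a list also makes debugging easier, since you can add, remove, or reorder one step at a time and see how the output changes.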
The Result
If all goes well, your prompt should start reliably delivering the result you want.
Generally, these debugging tactics should work for most generative AI models, if not all. But if you are using several models (e.g. Claude 3.5 Sonnet, GPT-4o, OpenAI o1), you’ll also need to pay attention to model-specific elements such as markup.