A Simple Checklist for Self-Evaluating Prompt Quality

How do you evaluate the quality of your prompt outputs? Here's a handy checklist. Let's have a look!

You can also join r/PromptWizards to find more tutorials and prompts!

Part 1: Understanding AI's Understanding

Once you've presented a prompt to your AI, the next questions are:

Has the AI accurately grasped the context?
1. If not, how can I help the LLM follow my context better? Should I be more direct and clear in my prompt? Can I use fewer negative instructions (which tend to perform worse) and give the LLM more guidance on what I do want?
Do the responses directly address the question or topic?
1. Were my query and task/instruction detailed clearly enough that the LLM understood what I expect?
Are there any contradictions between different responses to the same prompt?
1. If I run my prompt multiple times, is the output consistent and reliable?
Are any repetitions apparent in the output, and if so, are they necessary?
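
The consistency and repetition questions above can be roughly automated. Here is a minimal Python sketch, assuming you have already collected several outputs for the same prompt as plain strings. The function names `consistency_score` and `repeated_sentences` are made up for illustration, and the similarity measure only compares surface text, so it will not catch responses that contradict each other in different words.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations
import re


def consistency_score(outputs):
    """Mean pairwise text similarity (0.0 to 1.0) across outputs of one prompt."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0  # zero or one output: nothing to disagree with
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in pairs]
    return sum(ratios) / len(ratios)


def repeated_sentences(text):
    """Return sentences that occur more than once, mapped to their counts."""
    # Naive sentence split on ., ! and ?; good enough for a quick check.
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(sentences)
    return {s: n for s, n in counts.items() if n > 1}
```

A low score across runs suggests the prompt leaves too much open to interpretation. A non-empty result from `repeated_sentences` is only a flag: you still have to decide whether the repetition is necessary.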

Part 2: The Subtleties Matter

The AI's grasp of finer details can make a world of difference in the generated output. Reflect on these:

Part 3: Deep Evaluation of AI Output

A meaningful evaluation of your AI's output covers several key areas:

Were the output's length and structure a good fit for its intended use?
Did the AI handle nuances, complexities, or subtleties effectively?
If the prompt included multi-step tasks, did the AI execute every step successfully?
If relevant, were past context or conversations incorporated well into the response?
Could additional guiding examples or context benefit the prompt?
Can the response's creativity, novelty, or depth be improved?
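
Some of the checks above, such as length and whether every step of a multi-step task shows up in the response, can be spot-checked with a few lines of Python. This is only a sketch under assumptions: `evaluate_format` is a hypothetical helper, the word limits are whatever suits your use case, and `required_markers` is a list of strings you expect to appear (for example step labels). The remaining questions still need human judgment.

```python
def evaluate_format(text, min_words, max_words, required_markers=()):
    """Report word count, whether it is within bounds, and any missing markers."""
    word_count = len(text.split())
    lowered = text.lower()
    # Case-insensitive substring check for each expected marker.
    missing = [m for m in required_markers if m.lower() not in lowered]
    return {
        "word_count": word_count,
        "length_ok": min_words <= word_count <= max_words,
        "missing_markers": missing,
    }


report = evaluate_format(
    "Step 1: gather data. Step 2: clean data.",
    min_words=3,
    max_words=20,
    required_markers=["Step 1", "Step 2", "Step 3"],
)
print(report)
```

In this example the length is within bounds, but "Step 3" is reported as missing, which hints that the response skipped part of the task.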

And finally,