Beyond the "Looks Good": Objectively Evaluating AI Outputs
How do you, as a BA, objectively know if your AI-generated user stories are any good?
Hello BAE Community Members,
An AI generates 20 user stories. Great!
But how do you, as a BA, objectively know if they're any good?
Relying on a gut feeling isn't enough when business value is on the line.
The thought-provoking part?
We need metrics. Just as you'd measure system performance, start thinking about simple "precision/recall" for requirement extraction.
Did the AI get all the key details (recall)?
Were all the details it provided relevant (precision)?
Or, for qualitative outputs, can you score it on relevance, completeness, and tone?
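To make the precision/recall idea concrete, here is a minimal sketch. The requirement names and reference list are hypothetical, purely for illustration: you would compare the requirements the AI extracted against a reference set a human analyst curated.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) for two sets of requirement keys."""
    extracted, reference = set(extracted), set(reference)
    true_positives = extracted & reference
    precision = len(true_positives) / len(extracted) if extracted else 0.0
    recall = len(true_positives) / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical example: the AI found 4 requirements, 3 of which
# appear in the analyst's reference list of 5.
ai_output = {"login", "password-reset", "audit-log", "dark-mode"}
reference = {"login", "password-reset", "audit-log", "export", "sso"}

p, r = precision_recall(ai_output, reference)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.60
```

High recall with low precision means the AI captured the key details but padded them with irrelevant ones; the reverse means what it gave you was relevant, but it missed things.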
Your role isn't just about getting an answer from AI, but ensuring it's the right answer that meets business requirements.
Consider this:
What three criteria would you use to quickly assess the quality of an AI-generated project deliverable, like a requirements set?
All the best,
Esta
You can find more of my work here: