Beyond the "Looks Good": Objectively Evaluating AI Outputs
How do you, as a BA, objectively know if your AI-generated user stories are any good?
Hello BAE Community Members,
An AI generates 20 user stories. Great!
But how do you, as a BA, objectively know if they're any good?
Relying on a gut feeling isn't enough when business value is on the line.
The thought-provoking part?
We need metrics. Just as you'd measure system performance, start thinking about simple "precision/recall" for requirement extraction.
Did the AI get all the key details (recall)?
Were all the details it provided relevant (precision)?
Or, for qualitative outputs, can you score it on relevance, completeness, and tone?
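To make the precision/recall idea concrete, here is a minimal sketch. The requirement names and reference list are hypothetical, purely for illustration: you would compare the requirements the AI extracted against a reference set a human analyst curated.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) for two sets of requirement keys."""
    extracted, reference = set(extracted), set(reference)
    true_positives = extracted & reference
    precision = len(true_positives) / len(extracted) if extracted else 0.0
    recall = len(true_positives) / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical example: the AI found 4 requirements, 3 of which
# appear in the analyst's reference list of 5.
ai_output = {"login", "password-reset", "audit-log", "dark-mode"}
reference = {"login", "password-reset", "audit-log", "export", "sso"}

p, r = precision_recall(ai_output, reference)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.60
```

High recall with low precision means the AI captured the key details but padded them with irrelevant ones; the reverse means what it gave you was relevant, but it missed things.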
Your role isn't just about getting an answer from AI, but ensuring it's the right answer that meets business requirements.
Consider this:
What three criteria would you use to quickly assess the quality of an AI-generated project deliverable, like a requirements set?
All the best,
Esta
You can find more of my work here: