The biggest surprise while building an AI verification system wasn’t the AI.

Over the past few weeks, I've been building a prototype that checks AI-generated financial claims against source documents.

I expected the hardest part to be the language model.

It wasn't.

The hardest part has been defining what "correct" actually means.

For example, imagine two documents in the same credit package:

A covenant certificate reports EBITDA as $12.4M.

The management accounts report EBITDA as $11.9M.

Neither document is necessarily "wrong."

The certificate might use the covenant definition from the credit agreement, which excludes restructuring costs. The management accounts might include them.

An AI can extract both numbers perfectly and still leave you with the real question:

Which definition should be used for this specific decision?
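
One way to make that question concrete is to write the rule down instead of leaving it implied. Below is a minimal Python sketch for illustration only. The two dollar figures come from the example above. Everything else is an assumption: the definition labels, the decision names, and the mapping between them are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedFigure:
    source: str         # document the number was extracted from
    definition: str     # EBITDA definition the document is assumed to follow
    ebitda_musd: float  # reported EBITDA, in millions of USD


# The two figures from the example; the definition labels are assumed.
FIGURES = [
    ReportedFigure("covenant certificate", "covenant", 12.4),
    ReportedFigure("management accounts", "management", 11.9),
]

# Hypothetical business rule: which definition governs which decision.
DEFINITION_FOR_DECISION = {
    "covenant_compliance": "covenant",
    "internal_performance_review": "management",
}


def figure_for(decision: str) -> ReportedFigure:
    """Return the figure whose definition the rule table assigns to a decision."""
    if decision not in DEFINITION_FOR_DECISION:
        # No rule exists yet, so refuse to pick a number silently.
        raise ValueError(f"No rule says which EBITDA applies to {decision!r}")
    wanted = DEFINITION_FOR_DECISION[decision]
    matches = [f for f in FIGURES if f.definition == wanted]
    if len(matches) != 1:
        raise ValueError(f"Expected one {wanted!r} figure, found {len(matches)}")
    return matches[0]


if __name__ == "__main__":
    chosen = figure_for("covenant_compliance")
    print(f"{chosen.source}: ${chosen.ebitda_musd}M ({chosen.definition} definition)")
```

Extraction can fill in the list of figures. Someone still has to decide what goes in the rule table, and in this sketch a decision with no rule fails loudly instead of guessing.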

That made me realize something:

In many business workflows, the challenge isn't generating answers.

It's defining the rules that determine which answer is acceptable.

The AI isn't always the weakest link.

Sometimes our own business processes are.

For those of you building AI products:

Have you found that defining business rules was harder than building the AI itself?

I'd be interested to hear examples from other industries.