Enterprise DNA

Key Findings

Most AI pilots fail because owners track the wrong numbers. Here's what actually matters: hours recovered and revenue protected.

How to Measure AI ROI Without Fake Metrics

Sam McKay

I see this every week. A business owner shows me their AI dashboard with impressive numbers: “We’ve processed 10,000 documents!” or “Our AI handled 500 customer interactions!” They’re excited. They spent money, implemented tools, got their team trained. But when I ask what changed in their P&L, I get blank stares.

The numbers look good. The activity is real. But nobody can tell me if they made more money or freed up their best people to do higher-value work. That’s the problem with how most firms measure AI. They’re tracking motion, not outcomes.

After running audits with 220,000+ professionals and working directly with dozens of service firms, I’ve watched this pattern repeat. Owners get sold on AI promises, implement tools, measure the wrong things, then wonder why their margins haven’t improved. The vendors are happy because you’re using their product. Your team is busy because they’re feeding the AI. But your business isn’t materially better.

The Vanity Metrics That Waste Your Time

Most AI measurement frameworks are borrowed from enterprise software companies or tech startups. They track usage rates, adoption percentages, and processing volumes. These metrics make sense if you’re selling AI software. They’re useless if you’re running a professional services firm or trades business.

Here’s what I mean. A consulting firm implements an AI tool to help with proposal writing. After three months, they report that 80% of their team has used it and they’ve generated 200 proposals with AI assistance. Sounds great. But when we dig into the actual business impact, we find that proposal win rates stayed flat, the sales cycle didn’t shorten, and senior consultants are still spending the same hours reviewing and fixing AI-generated content.

The AI got used. The activity happened. The business didn’t improve.

I’ve seen accounting firms track how many tax returns their AI reviewed. Marketing agencies count how many social posts their tools generated. Construction companies measure how many project documents got processed. All motion. None of it tied to whether they made more money or got their time back.

The real issue is that these metrics let you feel productive without being profitable. You can show your team charts that go up and to the right. You can justify the software spend to your bookkeeper. But you can’t point to additional revenue or recovered capacity that lets you take on better clients.

What Actually Matters: Hours and Revenue

After working through this problem with dozens of firms, I’ve landed on two metrics that matter: hours recovered and revenue protected. Everything else is noise.

Hours recovered means actual time that your team used to spend on a task that they no longer spend because AI handles it. Not “AI assists with it” or “AI speeds it up.” Full recovery. The task happens without human intervention, or the human time required drops to near zero.

Revenue protected means money you would have lost without AI intervention. This shows up in three ways: catching errors before they cost you, maintaining service quality as you scale, and preventing client churn by keeping delivery standards high.

These two metrics work because they connect directly to your business model. If you bill by the hour, recovered hours either increase your effective rate or free up capacity for more billable work. If you’re project-based, recovered hours improve your margins or let you take on more projects. Revenue protected keeps your existing business healthy while you grow.

Let me give you a concrete example. An engineering firm I worked with implemented AI to handle initial technical document review. Before measuring anything fancy, we established the baseline: senior engineers spent an average of 4 hours per project doing initial document checks. The firm ran 40 projects per quarter. That’s 160 hours of senior engineer time.

After three months with AI doing first-pass review, senior engineers spent 30 minutes per project on document checks instead of 4 hours. That’s 140 hours recovered per quarter. At their billing rate of $200/hour, that’s $28,000 in recovered capacity every quarter. They could either bill that time to clients or redeploy it to business development. Clear number. Clear value.
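If you’d rather keep that arithmetic in a script than on a whiteboard, here’s a minimal sketch. The function names are mine and purely illustrative; the inputs are the engineering firm’s numbers from above.

```python
def hours_recovered(before_hours, after_hours, projects_per_quarter):
    """Human hours per quarter that no longer get spent on the task."""
    return (before_hours - after_hours) * projects_per_quarter


def capacity_value(hours, billing_rate):
    """Dollar value of recovered hours at the firm's billing rate."""
    return hours * billing_rate


# Figures from the engineering firm example: 4 hours before, 30 minutes after,
# 40 projects per quarter, billed at $200/hour.
recovered = hours_recovered(before_hours=4.0, after_hours=0.5, projects_per_quarter=40)
value = capacity_value(recovered, billing_rate=200)

print(f"{recovered:.0f} hours recovered, ${value:,.0f} in capacity per quarter")
```

Swap in your own before, after, and volume numbers. If you can’t fill in the “before” figure, that’s the baseline problem, not a math problem.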

The revenue protection piece came from the AI catching technical errors that would have made it through initial review. Over six months, it flagged issues in 8 projects that would have required rework. Based on their historical rework costs, that saved them somewhere between $15,000 and $25,000 in write-offs and client relationship damage.

No fake metrics. No usage dashboards. Just hours back and money saved.

Why Most Firms Get This Wrong

The reason firms track vanity metrics instead of real outcomes comes down to three problems.

First, they let the AI vendor define success. Software companies want you to use their product more. Higher usage means you’re less likely to cancel. So they build dashboards that show usage, adoption, and activity. These metrics serve the vendor, not you. When you accept their measurement framework, you’re measuring their success, not yours.

Second, they don’t establish clean baselines before implementing AI. You can’t measure hours recovered if you don’t know how many hours the task took before. Most firms have a vague sense that something takes “too long” but they’ve never actually tracked it. So when AI comes in, they have no comparison point. They end up measuring AI activity instead of AI impact because they can’t measure the counterfactual.

Third, they’re afraid of the real numbers. If you measure hours recovered and the number is small, you have to admit the AI isn’t working. If you measure revenue protected and you can’t find examples, you have to question whether you needed the tool at all. Vanity metrics let you avoid these uncomfortable truths. Real metrics force you to make decisions.

I’ve sat in meetings where owners resist this approach because they’re worried about what they’ll find. They’ve already spent money. They’ve already told their team this is important. They’ve already committed to the direction. Measuring real outcomes might reveal that it’s not working, and then they’d have to either fix it or kill it. Easier to track usage rates and call it a win.

What to Do This Quarter

If you want to measure AI properly, here’s what actually works. These are moves you can make in the next 90 days that will give you real data about whether your AI investments are paying off.

Pick one workflow where you’ve implemented AI and establish the before state. Go back through records, talk to your team, reconstruct how much time that task actually took before AI. Get specific. Not “a few hours” but “2.5 hours on average, with a range of 1-4 hours depending on complexity.” If you can’t reconstruct the before state, start tracking the current state now so you have a baseline for the next change.
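A baseline like that can come straight out of a simple time log. The sketch below assumes you’ve reconstructed a handful of task durations in hours; the sample values are made up for illustration and `baseline_summary` is a hypothetical helper, not part of any tool.

```python
from statistics import mean


def baseline_summary(task_hours):
    """Summarize logged task durations as an average plus a low-high range."""
    if not task_hours:
        raise ValueError("Need at least one logged duration to set a baseline")
    return {
        "average": round(mean(task_hours), 2),
        "low": min(task_hours),
        "high": max(task_hours),
        "samples": len(task_hours),
    }


# Hypothetical durations (hours) reconstructed from records and team interviews.
sample_log = [1.0, 2.0, 2.5, 3.0, 4.0]
summary = baseline_summary(sample_log)

print(
    f"{summary['average']} hours on average, "
    f"range {summary['low']}-{summary['high']} hours "
    f"across {summary['samples']} tasks"
)
```

Five samples is thin. The more tasks you can reconstruct, the more you can trust the average, but a rough baseline beats no baseline.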

Define what “recovered” means for that workflow. This is where most firms get squishy. They want to count all the time where AI “helps” or “speeds things up.” Don’t. Only count time that’s fully gone. If your team still needs to review AI output for 30 minutes, those 30 minutes aren’t recovered time; only the hours that no longer land on anyone’s plate are. A task counts as recovered when the human is out of the loop or their involvement is negligible. Be strict here. Better to undercount real recovery than overcount partial assistance.

Track revenue protection by looking for near misses. Set up a simple system where team members flag cases where AI caught something that would have caused a problem. Don’t try to calculate exact dollar values for every catch. Use ranges. “This would have cost us between $2,000 and $5,000 in rework” is fine. You’re looking for patterns, not precision. After a quarter, you’ll see whether AI is actually protecting revenue or just adding steps.
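The “simple system” can be as small as a list of flagged catches with a low and high cost estimate on each. Here’s one possible sketch; the `Catch` record and the two sample entries are hypothetical, invented only to show the shape of the log.

```python
from dataclasses import dataclass


@dataclass
class Catch:
    """One near miss flagged by a team member, with a rough cost range."""
    description: str
    low_cost: float
    high_cost: float


def protected_range(catches):
    """Total the low and high estimates to get a quarterly protection range."""
    low = sum(c.low_cost for c in catches)
    high = sum(c.high_cost for c in catches)
    return low, high


# Hypothetical entries for one quarter.
quarter_log = [
    Catch("Spec mismatch caught before client delivery", 2000, 5000),
    Catch("Missing sign-off flagged on a drawing set", 1000, 3000),
]

low, high = protected_range(quarter_log)
print(f"{len(quarter_log)} catches, ${low:,.0f}-${high:,.0f} protected this quarter")
```

Keeping both ends of the range stops you from pretending to a precision you don’t have.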

Stop tracking and reporting vanity metrics. If you’re currently measuring usage rates, adoption percentages, or processing volumes, quit. These numbers will only distract you from what matters. When your team asks about AI performance, show them hours recovered and revenue protected. When your bookkeeper questions the software cost, show them the same two numbers. If you can’t make the case with these metrics, you can’t make the case at all.

Set a threshold for what “working” means. Before you measure anything, decide what success looks like. For most firms in the 5-50 person range, AI should recover at least 40-60 hours per quarter per implementation to justify the cost and overhead. Revenue protection is harder to put a threshold on because it’s lumpy, but you should see at least a handful of meaningful catches per quarter. If you’re not hitting these levels, either the implementation is wrong or you picked the wrong workflow to automate.
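Writing the threshold down before the quarter starts keeps you from moving the goalposts later. A rough sketch, with two assumptions of mine: the hours floor defaults to the low end of the 40-60 range, and “a handful” of catches is read as three. Adjust both to fit your firm.

```python
def shortfalls(hours_recovered, meaningful_catches, min_hours=40, min_catches=3):
    """List the ways an implementation misses its thresholds (empty means it's working).

    min_hours and min_catches are assumed defaults, not fixed rules.
    """
    problems = []
    if hours_recovered < min_hours:
        problems.append(
            f"recovered {hours_recovered} hours, below the {min_hours}-hour floor"
        )
    if meaningful_catches < min_catches:
        problems.append(
            f"only {meaningful_catches} meaningful catches, below the floor of {min_catches}"
        )
    return problems


# Illustrative quarter: 140 hours recovered and 4 catches.
issues = shortfalls(hours_recovered=140, meaningful_catches=4)
print("working" if not issues else "needs review: " + "; ".join(issues))
```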

The Real Test

Here’s how you know if your AI measurement is honest. Can you explain the ROI to your most skeptical team member in under two minutes using only hours and revenue? If you need a dashboard, a presentation, or a detailed explanation of how the AI works, you’re not measuring the right things.

The conversation should sound like this: “We implemented AI for document review. It used to take our senior people 4 hours per project. Now it takes 30 minutes. Across 40 projects, that’s 140 hours back per quarter. We bill those hours at $200, so that’s $28,000 in recovered capacity. It also caught 8 errors over six months that would have cost us rework. The software costs $400 a month. Clear win.”
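To see why that’s a clear win, put the software cost on the same quarterly footing as the benefit. This is a minimal sketch with a hypothetical `quarterly_net` helper; it deliberately ignores setup time and revenue protection, so treat the result as a floor rather than a full ROI model.

```python
def quarterly_net(recovered_value, monthly_software_cost):
    """Compare a quarter's recovered capacity with a quarter's software spend."""
    quarterly_cost = monthly_software_cost * 3
    return {
        "cost": quarterly_cost,
        "net": recovered_value - quarterly_cost,
        "multiple": recovered_value / quarterly_cost,
    }


# Numbers from the conversation above: $28,000 recovered, $400/month software.
result = quarterly_net(recovered_value=28_000, monthly_software_cost=400)

print(
    f"Cost ${result['cost']:,.0f}, net ${result['net']:,.0f}, "
    f"about {result['multiple']:.0f}x the software spend"
)
```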

If you can’t have that conversation, you either don’t have real ROI or you’re measuring the wrong things. Most firms can’t have that conversation. They talk about adoption rates and processing volumes because they don’t have the hours and revenue numbers. Or they have the numbers and they’re not good, so they hide behind activity metrics.

I’ve been building analytics systems and training people on data for over a decade. The pattern is always the same. When the numbers are good, people show you the numbers. When the numbers are unclear, people show you activity. If your AI reporting is full of charts about usage and adoption, that tells me everything I need to know about whether it’s actually working.

The firms that get this right are the ones that treat AI like any other business investment. They measure it the same way they’d measure a new hire, a marketing campaign, or a piece of equipment. Does it make us money or save us money? How much? Is that enough to justify what it costs? Simple questions. Clear answers.

Moving Forward

Most business owners I talk to know their current AI measurement is weak. They’re tracking things that don’t matter because they don’t know what else to track or they’re afraid of what honest measurement will reveal. But you can’t manage what you don’t measure properly, and you can’t improve what you’re not willing to evaluate honestly.

The good news is that fixing this doesn’t require sophisticated analytics or expensive consultants. It requires discipline about what you measure and honesty about what the numbers tell you. Hours recovered and revenue protected. Everything else is distraction.

If you want to know whether your AI investments are actually working, or if you’re trying to figure out where AI could genuinely help your business, I run a 60-minute Omni Audit where we look at your workflows and identify the highest-value opportunities. No sales pitch, no generic advice. Just a practical assessment of where AI can recover hours or protect revenue in your specific business.

Book a session here: https://calendly.com/sam-mckay/discovery-call?utm_source=edna-landing&utm_medium=insights&utm_campaign=insight-measure-ai-roi

We’ll figure out what’s worth measuring and what’s worth ignoring. Then you can make decisions based on real numbers instead of vendor promises.