One of the world’s largest professional services firms has been forced to withdraw a major AI report after multiple organizations said the claims it made about them were either false or completely made up.
KPMG published “Rethinking Excellence in the Era of Agent AI” in October 2025. It positioned itself as a forward-looking guide to agentic AI adoption for enterprise leaders, complete with case studies and citations from well-known organizations. Eight months later, the report is gone from KPMG’s website, and the firm is running an internal investigation.
What Went Wrong
The problems surfaced when technology research group GPTZero investigated the report’s citations and case studies. Their findings were stark.
Of 45 citations in the report, only five accurately pointed to real, uncorrupted sources. Twenty-eight citations provided paraphrased titles and fabricated components for real sources. GPTZero flagged 40 citation titles in total as likely AI-generated hallucinations.
GPTZero described the pattern as “vibe citing,” where AI-generated content produces references that look credible on the surface but fall apart under scrutiny.
The citation failures were not the only problem. UBS, the NHS (National Health Service of the United Kingdom), Swiss Federal Railways (SBB), and Transport for London all told the Financial Times that the report’s claims about their use of AI were either false or misleading. These were not fringe organizations being misquoted. They were named prominently as AI success stories, and they pushed back.
The report also contained internal inconsistencies. In one instance, it cited KPMG’s own research claiming 55% of CEOs rank AI as their top investment priority, while KPMG’s separately published 2025 CEO Outlook, released the same month, put that figure at 71%. Roughly half the report’s factual claims appear false or misattributed when checked against primary sources.
KPMG removed the report from its websites and confirmed an internal review is underway.
Why This Matters
This story would be quietly embarrassing if KPMG were a small startup that published sloppy content. But KPMG is a firm with tens of thousands of consultants who charge enterprise clients significant fees, in part because those clients trust that their advice is researched and accurate.
The report is about AI. The irony writes itself: a report arguing for wider AI adoption in enterprise was apparently produced with insufficient human oversight of the AI tools used to generate it.
This is not the first time a major organization has published AI-assisted content containing hallucinated citations. A similar issue emerged with a government health report in the US in 2025. But KPMG’s scale, and the prominence of the organizations that disputed the claims, make this incident particularly notable.
It is also a signal worth paying attention to as agentic AI moves deeper into knowledge work. When AI generates content, citations, and case studies without a human genuinely verifying each claim, the results can be plausible-sounding fabrications. They can get through internal review processes that are not designed to catch AI-native failure modes.
What This Means for Business
For business leaders evaluating AI tools and AI-assisted outputs, this incident highlights something practical: where AI is involved in producing work product, verification of specific claims cannot be assumed. Hallucinations in creative writing are annoying. Hallucinations in client-facing research reports, vendor assessments, regulatory submissions, or financial analysis create real liability.
Three practical takeaways for organizations building with or procuring AI:
Human sign-off on specific claims, not just the overall output. When an AI tool generates a citation or a factual claim, someone in your organization needs to check that specific fact against its original source. Reading the document and finding that it “seems right” is not sufficient. KPMG’s report almost certainly went through internal review before publication. What it lacked was a process for verifying individual citations against their sources.
AI-generated research needs different governance than human-written research. The failure mode here, “vibe citing,” is specific to how language models work. They are trained to produce text that sounds like good research. They are not trained to be accurate about specific citations. Organizations need review processes that specifically look for this class of error, not just general quality assurance.
Third-party AI-assisted reports require the same scrutiny as your own. When a consulting firm, research house, or vendor sends you a report that includes AI-generated content, the responsibility for verifying what you act on does not sit entirely with the author. As AI accelerates the volume of reports, assessments, and recommendations flowing into organizations, the assumption that “they checked it” is increasingly risky.
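One way to picture claim-level sign-off is as a simple publication gate. The Python sketch below is illustrative only: the Claim record, its field names, and the reviewer labels are hypothetical and do not describe any process KPMG or GPTZero uses. It assumes each citation or factual claim is tracked individually and that a draft is blocked until a named reviewer has compared every claim with its original source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    """One citation or factual claim in a draft (hypothetical record shape)."""
    text: str
    source: str
    checked_by: Optional[str] = None       # reviewer who opened the original source
    matches_source: Optional[bool] = None  # None means nobody has checked yet


def blocking_claims(claims: list[Claim]) -> list[Claim]:
    """Return claims that are unchecked or that failed the source check."""
    return [c for c in claims if not c.checked_by or c.matches_source is not True]


def ready_to_publish(claims: list[Claim]) -> bool:
    """A draft passes only when every individual claim has been signed off."""
    return not blocking_claims(claims)


# Placeholder data for illustration, not taken from any real report.
draft = [
    Claim("Example claim A", "Example source A", checked_by="reviewer-1", matches_source=True),
    Claim("Example claim B", "Example source B"),  # never checked
    Claim("Example claim C", "Example source C", checked_by="reviewer-2", matches_source=False),
]

for claim in blocking_claims(draft):
    print(f"BLOCKED: {claim.text} ({claim.source})")
print("ready to publish:", ready_to_publish(draft))
```

The design choice worth noting is that the gate operates on individual claims rather than on the document as a whole, so a report that “seems right” overall still cannot pass while a single citation remains unchecked or contradicted by its source.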
The KPMG incident is also a reminder of why data quality and analytical rigor matter as foundations for AI work. AI tools amplify what exists in the data and processes around them. If the underlying process lacks discipline, the AI output can fail confidently and at scale.
This is exactly the kind of risk that a well-structured data and AI strategy addresses upfront, well before reports go to clients or go live. If your organization is scaling AI-assisted work and is unsure whether your governance processes are keeping pace, EDNA’s Omni Advisory service helps business leaders build the frameworks to use AI well, not just use it fast.
Source
TechCrunch