07 October 2025

Deloitte Faces Backlash Over AI Errors In Government Report

A $440,000 welfare system review by Deloitte is revised and partially refunded after AI-generated errors spark political and public criticism in Australia.

Deloitte, one of the world’s most prominent consulting firms, is under scrutiny in Australia after it agreed to partially refund the federal government for a $440,000 report riddled with errors—many apparently generated by artificial intelligence. The saga, which has sparked fierce debate about the use of AI in high-stakes professional services, unfolded over several months and has prompted calls for greater transparency and accountability from both consultants and the government departments that hire them.

The controversy centers on a 237-page report commissioned by Australia’s Department of Employment and Workplace Relations (DEWR) to assess the department’s compliance framework and IT system for automating penalties in the welfare system. According to The Associated Press, the report was originally published on DEWR’s website in July 2025, but quickly attracted attention after Dr. Christopher Rudge, a researcher in health and welfare law at the University of Sydney, flagged what he described as “fabricated references.”

Rudge told AP he found up to 20 errors in the initial version, including a fabricated quote from a federal court judgment and references to nonexistent academic research papers. One glaring example was a citation of a book supposedly written by Professor Lisa Burton Crawford, which, as Rudge put it, “sounded preposterous.” He explained, “I instantaneously knew it was either hallucinated by AI or the world’s best kept secret because I’d never heard of the book and it sounded preposterous.”

These so-called “hallucinations”—a term commonly used to describe AI-generated fabrications—were not limited to academic references. The report also misquoted a judge, a particularly serious error in a document that was meant to serve as an audit of the department’s legal compliance. “They’ve totally misquoted a court case then made up a quotation from a judge and I thought, well hang on: that’s actually a bit bigger than academics’ egos. That’s about misstating the law to the Australian government in a report that they rely on. So I thought it was important to stand up for diligence,” Rudge said, as reported by AP.

Once the errors were brought to light, Deloitte reviewed the report and “confirmed some footnotes and references were incorrect,” according to a statement from the department. The firm agreed to repay the final installment under its contract, though the exact amount will be made public only after the refund is processed. A revised version of the report was published on October 3, 2025, with more than a dozen deletions of nonexistent references and footnotes, a rewritten reference list, and corrections to typographical errors, as detailed by The Guardian and the Australian Financial Review (AFR).

Importantly, the updated report also included—for the first time—a disclosure that a generative AI language system, Azure OpenAI GPT-4o, was used in its preparation. Deloitte stated in its appendix that the methodology “included the use of a generative artificial intelligence (AI) large language model (Azure OpenAI GPT–4o) based tool chain licensed by DEWR and hosted on DEWR’s Azure tenancy.” However, the company did not explicitly link the errors to the use of AI, instead saying, as AP reported, that the “matter has been resolved directly with the client.”

Despite the corrections, the department maintained that “the substance of the report had been maintained and there were no changes to its recommendations.” Dr. Rudge, for his part, acknowledged that while the report contained AI “hallucinations,” its conclusions generally aligned with other evidence. Still, he expressed concern that the errors reflected a broader issue of diligence and oversight.

The reaction from Australia’s political sphere was swift and pointed. Senator Barbara Pocock, the Australian Greens party’s spokesperson on the public sector, argued that Deloitte should refund the entire AU$440,000, not just the final installment. “Deloitte misused AI and used it very inappropriately: misquoted a judge, used references that are non-existent,” Pocock told the Australian Broadcasting Corporation. “I mean, the kinds of things that a first-year university student would be in deep trouble for.”

Labor Senator Deborah O’Neill was similarly scathing, describing the episode as a “human intelligence problem.” She told AFR, “This would be laughable if it wasn’t so lamentable. A partial refund looks like a partial apology for substandard work. Anyone looking to contract these firms should be asking exactly who is doing the work they are paying for, and having that expertise and no AI use verified.”

The story quickly gained traction online, with social media users expressing a mix of outrage and disbelief. “What AI-led productivity looks like in practice,” one user quipped, while another remarked, “AI is bullshit; realize that before it’s too late, ffs.” Others directed their ire at both Deloitte and the government: “Shame on Deloitte. Bigger shame on Govt: $440k contract for rubbish work.”

For Deloitte, the incident is a cautionary tale about the risks of relying too heavily on generative AI for critical analytical tasks—especially when transparency about its use is lacking. The company, which boasts a global workforce of more than 470,000, recently announced a deal to provide Anthropic’s Claude AI to its employees worldwide, signaling its commitment to integrating AI into its operations. Yet as the Harvard Business Review and The Wall Street Journal have noted, the rush to automate routine work with AI raises questions about oversight, expertise, and the future of professional services. Who will ensure the quality and integrity of work when machines, not humans, do the heavy lifting?

This episode also shines a spotlight on the responsibilities of government agencies when commissioning external reviews. As Senator O’Neill emphasized, verifying the expertise of consultants—and their methods—should be paramount. Otherwise, taxpayers may end up footing the bill for “rubbish work,” as one critic put it, with little recourse beyond partial refunds and public embarrassment.

Ultimately, while Deloitte’s partial refund may resolve the immediate contractual issue, the broader debate about AI’s role in professional services is just beginning. As generative AI becomes more deeply embedded in consulting, law, and government, the need for transparency, accountability, and good old-fashioned human intelligence has never been clearer.