The accountability gap we are not talking about
When a government department drafts a policy on AI oversight using AI and nobody checks, we have moved past carelessness into a structural problem.
Last week the Department of Home Affairs suspended two senior officials after AI-generated hallucinations were found in the reference list of the Cabinet-approved Revised White Paper on Citizenship, Immigration and Refugee Protection.
A Chief Director in the relevant unit was suspended with immediate effect. A Director involved in the drafting process was suspended the following Monday. Two independent law firms were appointed: one to manage the disciplinary process, one to review every policy document the department has produced since 30 November 2022. That date is not arbitrary. It is the date ChatGPT was released to the general public, the point at which large language models became widely available.
The department noted that the hallucinated references appeared to have been generated and appended after the fact, as they were not cited in the body of the White Paper itself. The body of the document, it stressed, continues to accurately reflect the government’s policy position.
The minister said: “This unacceptable lapse proves why vigilant human oversight over artificial intelligence is critical.”
He is correct. But that statement leaves something important unresolved.
The same week
South Africa’s Draft National AI Policy was withdrawn after News24 found that at least six of the document’s 67 academic citations did not exist. The journals cited were real. The articles were not. Editors of multiple journals, including the South African Journal of Philosophy and AI & Society, independently confirmed the cited articles were fabricated.
The policy designed to govern AI was produced using AI, and nobody verified the outputs before publication.
Read that again. A government department tasked with drafting a framework for responsible and ethical AI use submitted a document to Cabinet with citations that do not exist. The hallucinations passed through the drafting process, the departmental review process, and Cabinet approval before a journalist checked the references.
The minister wrote on X that there would be consequences for those responsible: “This unacceptable lapse proves why vigilant human oversight over the use of artificial intelligence is critical. It’s a lesson we take with humility.”
Two ministers. Two departments. Two documents. Same week. Same failure mode.
This is not uniquely South African
In October 2025, Deloitte Australia was paid AU$440,000 to produce an independent assurance review of Australia’s welfare compliance system for the Department of Employment and Workplace Relations.
A law professor at Sydney University read the published 237-page report and found approximately 20 errors, including references to non-existent academic works, a fabricated quotation attributed to a federal court judge, and citations to a book, attributed to a colleague, that did not exist. “I instantaneously knew it was either hallucinated by AI or the world’s best kept secret,” the professor told the Associated Press.
Deloitte confirmed that Azure OpenAI had been used in the report’s production, issued a corrected version with a disclosure of AI use, and refunded the final instalment of the contract.
An Australian senator’s assessment: “The kinds of things that a first-year university student would be in deep trouble for.”
A Nature study published in 2025 found that 2.5 percent of academic papers that year contained at least one hallucinated citation, compared to 0.3 percent in 2024. These are not junior errors made by people new to research. These are credentialled professionals in trusted institutions producing work that passed through formal approval processes and still contained AI-generated content that nobody verified.
The pattern across all of these cases is not that AI was used. The pattern is that AI was used as a replacement for a cognitive step that the accountability structure assumed a human was performing.
What these incidents share
Finding a source, reading it, and confirming that it supports the claim being made is not an administrative task. It is an epistemic one. It requires judgment about whether the source is credible, whether the claim it makes is accurately represented, and whether it actually supports the argument being advanced.
When that step is delegated to a tool that produces plausible-sounding outputs without understanding them, the resulting document looks like verified work because the format is correct. The references are there on the page. They have authors and journal names and publication years. They are formatted according to the citation style required. The accountability structure (peer review, Cabinet approval, departmental sign-off, consulting firm quality control) was not designed to distinguish between a citation that was found and read and a citation that was generated.
This is the accountability gap. Not that AI was used. That the processes downstream of AI use were not redesigned to account for the specific failure modes AI introduces.
Now the connection that concerns me more
Basic Education Minister Siviwe Gwarube announced in January 2026 that the matric class of 2025 achieved an 88 percent pass rate, the highest in South Africa’s history. In the same statement, she acknowledged that the pass rate “does not tell us about the quality of education outcomes.”
South Africa ranked last in most major global mathematics and science assessments in 2024. The minister herself pointed out that only 34 percent of candidates wrote Mathematics, with most opting for Mathematical Literacy. ActionSA’s analysis placed the effective cohort completion rate at 57.7 percent when measured against the number of pupils who entered Grade 10 in 2023.
A number went up. What it is supposed to measure did not necessarily follow.
I am not criticising the class of 2025. The dedication of learners under genuinely difficult conditions deserves recognition. I am asking a systems question that sits beneath the celebration.
If students are producing AI-assisted work that passes through assessment systems not designed to verify its provenance, the qualification certifies an output, not demonstrated competence. The credential becomes a record of what was produced, not necessarily of what was understood.
This is not a future concern. The same Nature study that found 2.5 percent of academic papers in 2025 contained hallucinated citations reflects a research culture where AI-assisted work is entering formal assessment systems at scale. The university researchers producing those papers sat matric, completed degrees, and entered professional practice through exactly the kind of credentialling pipeline that produces the officials who draft policy documents and the consultants who produce government reports.
The pipeline matters. At each stage, the same structural gap appears: a process designed to verify human cognitive work, now processing AI-assisted outputs without being redesigned to account for the difference.
The GRC professional’s view
AI governance is not a policy document. A policy document that was itself produced using unverified AI outputs is evidence of that.
Governance is a control at the point of process. Not a principle in a framework. Not an oversight body that reviews outputs after the fact. A control that fires before or during the activity it governs.
For AI use in professional work, that control has three components:
Declaration. Was AI used in producing this document? Who used it, and for which sections? This is not about prohibition. It is about transparency that makes verification possible.
Verification. Has a human checked the AI-generated outputs against primary sources? Not a scan. A check. Did someone read the cited articles and confirm they exist and say what the document claims they say?
Accountability. Who is responsible for the accuracy of the AI-assisted content? Not “the AI tool” and not “the junior official who used it.” A named person with authority over the output who has reviewed and signed off.
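To make the three components concrete, here is a minimal sketch of what a control at the point of process could look like in code. It is an illustration under stated assumptions, not a description of any department's or firm's actual system: the names Document, Citation and release_blockers are hypothetical, and for simplicity the check treats every citation as needing a named human verifier, whether or not AI produced it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Citation:
    reference: str
    # Person who read the source and confirmed it exists and supports the
    # claim. None means nobody has checked it yet. (Hypothetical field.)
    verified_by: Optional[str] = None


@dataclass
class Document:
    title: str
    # Declaration: None means AI use has not been declared either way.
    ai_used: Optional[bool] = None
    ai_sections: List[str] = field(default_factory=list)
    citations: List[Citation] = field(default_factory=list)
    # Accountability: a named person with authority over the output.
    accountable_owner: Optional[str] = None


def release_blockers(doc: Document) -> List[str]:
    """Return the reasons a document may not be released; empty means pass."""
    blockers = []

    # 1. Declaration: was AI used, and for which sections?
    if doc.ai_used is None:
        blockers.append("declaration missing: AI use not stated")
    elif doc.ai_used and not doc.ai_sections:
        blockers.append("declaration incomplete: AI-assisted sections not listed")

    # 2. Verification: every citation needs a named human who checked it.
    for citation in doc.citations:
        if citation.verified_by is None:
            blockers.append(f"unverified citation: {citation.reference}")

    # 3. Accountability: a named person must have signed off.
    if not doc.accountable_owner:
        blockers.append("accountability missing: no named sign-off")

    return blockers


# Usage with made-up placeholder data.
draft = Document(
    title="Example policy draft",
    ai_used=True,
    ai_sections=["References"],
    citations=[
        Citation("Example reference A", verified_by="Reviewer One"),
        Citation("Example reference B"),
    ],
    accountable_owner="Named Official",
)
print(release_blockers(draft))
# → ['unverified citation: Example reference B']
```

The point of the design is that the check runs before release and returns reasons, not a score. A document with one unverified citation does not pass with a warning. It does not pass.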
The Home Affairs response — designing AI checks and declarations as part of internal approval processes, appointing independent firms to review documents produced since November 2022 — is the correct remediation. It is also three and a half years late.
The Draft AI Policy response — withdrawing the document and restarting consultation — is the correct remediation. It is also the most ironic possible demonstration of what the policy was supposed to prevent.
The Deloitte response — refunding part of the fee, issuing a corrected document with a retroactive disclosure of AI use — is the correct remediation. It does not restore the credibility of the original review or the government processes that relied on it.
In each case, the control arrived after the failure. That is not governance. That is incident response.
The irony I want to name directly
I drafted this article using an AI tool. Before any claim went into the text, I verified it: every statistic, every quote and every case detail was confirmed against primary sources.
The Home Affairs citations were generated and appended after the fact, without verification. The Deloitte references were generated and included in a paid government report, without verification. The AI Policy citations were generated and submitted to Cabinet, without verification.
The distinction between those two uses of the same technology is the entire argument. AI as a tool that accelerates work you then verify versus AI as a replacement for the work of verification.
Using AI to think faster is not the problem. Treating AI as a substitute for thinking is.
The question worth sitting with
South Africa is now restarting the consultation process for its national AI policy. The question worth sitting with is not only what the policy should say.
It is whether the process that produces it, approves it, and implements it is designed so that the human remains accountable for what the human signs.
The minister is correct that vigilant human oversight over AI is critical. The Home Affairs incident is evidence of what happens when that oversight is absent. The AI Policy withdrawal is evidence of the same. The Deloitte report is evidence of the same. The 2.5 percent hallucinated citation rate in academic papers is evidence of the same pattern at scale.
We are not missing a policy. We are missing a control at the point of process, at every level of the system where AI outputs are being accepted as verified work.
That is the accountability gap. And it appears at every level simultaneously.
Olebeng Molefe is an AI Governance and GRC practitioner and the founder of IntentGuard, an automated intent audit platform that compares codebase implementation against declared design intent. Waitlist at intentguard.dev.
Sources:
Department of Home Affairs media statement 30 April 2026;
AIC reporting on SA Draft AI Policy withdrawal;
Fortune, CFO Dive, Above the Law on Deloitte Australia October 2025;
Nature citation study data via AIC;
BusinessTech, Daily Maverick, Mail & Guardian on 2025 NSC matric results.
