Three Structural Reasons AI Agent Projects Stall at Proof-of-Concept
— A PMO Perspective
What you need to design now to avoid becoming one of the 40%+ of projects Gartner expects to be cancelled by 2027.
Introduction — “It’s Running, But Nothing Has Changed”
In June 2025, Gartner issued a stark prediction.
“Over 40% of agentic AI projects will be canceled by the end of 2027, due to escalating costs, unclear business value or inadequate risk controls.”
How many readers thought, “we’re fine, that won’t be us”?
After years of supporting large-scale projects as a PMO practitioner, I find that estimate conservative. Once you count projects that technically reached production but were quietly abandoned six months later because nobody was using them, genuine successes are a small minority.
McKinsey’s global AI survey found that fewer than one in five companies track KPIs for their generative AI solutions. In other words, more than 80% of the organisations running these projects have no systematic way to tell whether they have succeeded or failed. The pervasive sense in the field that “it’s running, but nothing seems to have changed” stems directly from this measurement gap.
So why does this keep happening?
The answer is rarely technical. In most cases the problem is structural. And that structural flaw is not unique to AI projects — it mirrors failure patterns that have been repeating in project management for decades.
This article identifies three structural reasons why AI agent initiatives stall at proof-of-concept, and offers a PMO-grounded prescription for each.
Wrong Sequencing — The danger of “let’s define success as we go”
A phrase has become alarmingly common in AI project kick-off meetings.
“Let’s just get it running. We’ll define success criteria along the way.”
On the surface, this sounds reasonable. You often can’t know what AI is capable of until you start. Agile delivery is widely advocated. But the moment that phrase becomes the reason for not defining KPIs, the project begins to die quietly.
Six months later, a specific meeting arrives. “So, did this succeed or fail?” Nobody in the room can answer. A frantic attempt to define KPIs after the fact reveals that the data needed to measure them was never collected. Either a retrospective “success” definition gets invented, or the evaluation simply gets buried. In my experience, most of the cancellations Gartner attributes to “unclear business value” follow exactly this pattern.
The same structural failure as classic IT projects
This is not a new problem. Large IT projects have been failing the same way for decades. Teams invest heavily in technical specifications but start without anyone being able to answer: “When this project is over, what will be different for the client — and how will we know it worked?”
AI projects amplify this tendency. Because the cost of “just getting something running” has dropped sharply, the habit of thinking carefully before starting has eroded alongside it.
Prescription: design KPIs and evaluation criteria before you ship
In CPP (Calibrated Project Planning), the Sequencing axis does not simply mean mapping task dependencies. It means designing the order in which decisions are made — accounting for politics, technical constraints, resources, and failure scenarios. Applied to AI projects, sequencing means this:
Use the first 30 minutes of the kick-off to ask every stakeholder to write, in one sentence: “What would this AI system have to look like in three months for us to call it a success?”
If nobody can answer, the project should not begin. This is not difficult. But the number of organisations actually doing it is vanishingly small.
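One way to make this gate concrete is to record each one-sentence answer as structured data and refuse to start while any measurable field is still blank. The sketch below is a minimal illustration under stated assumptions, not part of CPP itself: the `SuccessCriterion` class, its field names, the `ready_to_start` function, and the sample stakeholders and values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SuccessCriterion:
    """One stakeholder's kick-off sentence, made measurable.

    Field names are illustrative assumptions, not a CPP standard.
    """
    stakeholder: str
    statement: str                       # the one-sentence answer
    metric: Optional[str] = None         # what will be measured
    baseline: Optional[float] = None     # value before the agent goes live
    target: Optional[float] = None       # value that counts as success
    data_source: Optional[str] = None    # where the measurement comes from


REQUIRED = ("metric", "baseline", "target", "data_source")


def missing_fields(criterion: SuccessCriterion) -> List[str]:
    """Return the names of required fields that are still undefined."""
    return [name for name in REQUIRED if getattr(criterion, name) is None]


def ready_to_start(criteria: List[SuccessCriterion]) -> bool:
    """Allow the project to begin only if every criterion is fully specified."""
    return bool(criteria) and all(not missing_fields(c) for c in criteria)


# Hypothetical kick-off answers: one complete, one still only a sentence.
criteria = [
    SuccessCriterion(
        stakeholder="Finance lead",
        statement="Invoice checks take half the time they do today.",
        metric="minutes per invoice",
        baseline=12.0,
        target=6.0,
        data_source="workflow log",
    ),
    SuccessCriterion(
        stakeholder="Operations lead",
        statement="Fewer escalations reach my desk.",
    ),
]

for c in criteria:
    print(c.stakeholder, "is missing:", missing_fields(c))
print("Ready to start:", ready_to_start(criteria))
```

The `data_source` field is the one that addresses the after-the-fact scramble described earlier: if nobody can name where the measurement will come from, the data will not exist when the evaluation meeting arrives.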
A 2025 Deloitte survey of 1,854 senior executives found that only 20% of organisations have mature governance frameworks for autonomous AI. Behind that figure lies a mass of organisations that tried to build governance after the fact — and ran out of time.
No Ownership — Nobody is designated to make the call
Every AI agent eventually reaches a moment where it stops.
An unexpected input arrives. An exception needs to be handled. A decision point appears where two paths are equally plausible. The agent pauses and passes the question to a human: “Who decides this?”
The AI does not carry the answer. That is a human responsibility.
And in most organisations, nobody can answer. The project was launched on the assumption that “someone will handle it” — but who that someone is was never explicitly defined.
UC Berkeley’s California Management Review put it plainly in March 2026: “Current governance models are not equipped for software that autonomously perceives, decides, and acts.” This is not an AI-specific problem. It is a human-and-organisation problem.
The same issue that causes WBS failures in large projects
The pattern appears repeatedly in post-mortems of large project failures. A WBS exists. Tasks are listed. But “who makes the call if something goes wrong with this task?” has no answer. The issue log fills up with items stuck at “awaiting decision” for three, four, five weeks. Action items from meetings sit unclaimed, carried forward meeting after meeting, because “someone will sort it.”
Deploying an AI agent without filling that decision vacuum changes nothing. When the agent stops and the answer is still “someone will handle it,” you have the same problem you had before — now running faster.
Prescription: create an authority matrix before deployment
The CPP Ownership axis means designing decisions to the level of granularity where each individual can act independently. Applied to AI, it means defining upfront: what the AI decides, what a human decides, and which human is accountable for each category.
The practical tool is an Authority Matrix, created before the agent goes live. It does not need to be complex — one page is enough.
| Decision Type | AI Decides | Human Decides | Designated Owner |
|---|---|---|---|
| Automated data retrieval | ○ | — | — |
| Exception data handling | — | ○ | Dept. Head (Tanaka) |
| Write to external systems | — | ○ | Team Lead (Suzuki) |
| Error response | — | ○ | IT on-call (rotating) |
What matters is that when the agent stops, everyone in the organisation can see immediately who takes the next step. As Deloitte’s survey confirms, only 20% of companies have mature autonomous AI governance. The other 80% are operating AI systems without this map.
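The same matrix can be kept in machine-readable form so that the escalation path is looked up rather than improvised when the agent stops. The following is a minimal sketch that mirrors the four rows of the table above. The `Rule` type, the `route` function, and the decision-type keys are hypothetical names, and the fallback owner for unlisted decision types is an assumption that the table itself does not specify.

```python
from typing import Dict, NamedTuple, Optional


class Rule(NamedTuple):
    ai_decides: bool
    owner: Optional[str]  # accountable human; None when the AI decides alone


# Rows mirror the one-page matrix; the keys are illustrative identifiers.
AUTHORITY_MATRIX: Dict[str, Rule] = {
    "automated_data_retrieval": Rule(ai_decides=True, owner=None),
    "exception_data_handling": Rule(ai_decides=False, owner="Dept. Head (Tanaka)"),
    "write_to_external_systems": Rule(ai_decides=False, owner="Team Lead (Suzuki)"),
    "error_response": Rule(ai_decides=False, owner="IT on-call (rotating)"),
}

# Assumption: anything not listed escalates to a named default owner,
# so that no decision type is left at "someone will handle it".
DEFAULT_OWNER = "PMO (unlisted decision type)"


def route(decision_type: str) -> str:
    """Return who takes the next step for a given decision type."""
    rule = AUTHORITY_MATRIX.get(decision_type)
    if rule is None:
        return f"escalate to {DEFAULT_OWNER}"
    if rule.ai_decides:
        return "agent proceeds"
    return f"escalate to {rule.owner}"


for decision in ("automated_data_retrieval", "write_to_external_systems", "refund_approval"):
    print(decision, "->", route(decision))
```

The explicit default is a design choice worth noting: a lookup that fails silently on an unlisted decision type would recreate the decision vacuum the matrix is meant to remove.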
Scope Overreach — The “automate everything” trap
“We want to roll this AI out company-wide.” “We want to automate the entire process.”
An AI initiative that starts from either of those sentences has never, in my experience, succeeded.
Successful deployments share a striking common feature: the initial scope is almost shockingly small.
KPMG’s 2025 case study is the clearest example.
KPMG deployed AI agents in audit work and reduced effort by 35%. But the goal was never “automate the entire audit.” They built dedicated specialist agents for specific tasks: scoping the population, completing disclosure checklists, detecting journal-entry anomalies, and drafting working papers. One focused agent per task. That is what produced the 35% reduction.
Why starting broad always fails
Project management research shows a clear relationship between scope breadth and completion rates. Narrowly scoped projects complete on time 65% of the time. Broadly scoped projects: 16%. The same dynamic applies to AI deployments. Three mechanisms explain it.
Prescription: deliberately choose your first one
The CPP Granularity axis means decomposing work to the unit at which even the least skilled team member can execute independently. Applied to AI, it means constraining the initial scope to the single workflow most likely to deliver a concrete, measurable result.
Start every AI deployment by answering this question:
“In this organisation, what is the one task where AI can most reliably deliver value?”
The “one task only” constraint will feel restrictive at first. It is also the reason KPMG achieved a 35% reduction while other, more ambitious programmes produced little. One success builds trust, provides justification for the next step, and teaches the team how to work with AI in practice.
The Changing Role of the PMO
Looking across the three prescriptions above, a pattern emerges: none of them require specialist AI knowledge. Define KPIs before you start. Create an authority matrix. Constrain the initial scope. These are project management fundamentals.
What has changed is not the tools — it is the stakes.
Traditional systems only did what they were explicitly instructed to do. Even when the design was weak, humans could course-correct in real time on the ground.
AI agents perceive, decide, and act autonomously, sometimes faster than humans can intervene. This is precisely why the upfront structural design matters more, not less.
The PMO’s role is not “making AI run.” It is “designing the structure within which AI runs correctly.”
That shift is not a threat to the PMO function — it is an expansion of its value. Technical knowledge becomes obsolete quickly. The ability to design how organisations and their people operate becomes more valuable as AI becomes more prevalent. The fact that AI agent failure patterns and project failure patterns share the same structure is not a coincidence. Both are, at their core, problems of how people and organisations move.
Summary
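Three structural reasons why AI agent initiatives stall at proof-of-concept: wrong sequencing (success is defined “as we go”), no ownership (nobody is designated to make the call), and scope overreach (the “automate everything” trap).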
None of these are AI-specific technical problems. All three mirror failure patterns that have been recurring in project management for decades. The structures are the same because the underlying challenge is the same: how do people and organisations actually move?
Design the structure first. The technology follows. That is the shortest route to an AI agent deployment that works — and keeps working.
Primary Sources
Gartner Press Release (June 2025): Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027
McKinsey Global AI Survey (2026): The State of AI — How Organizations Are Rewiring to Capture Value
Deloitte (2025): State of AI in the Enterprise — Survey of 1,854 Senior Executives
KPMG Q1 AI Pulse Survey (2026): Agentic AI Deployment Triples Despite Risks
California Management Review, UC Berkeley (March 2026): Governing the Agentic Enterprise — A New Operating Model for Autonomous AI at Scale