Agile Solved the Wrong Uncertainty

Agile was not a failed implementation. It was a precise solution to a problem that AI teams no longer have.


Agile did not fail. It solved a specific form of uncertainty — requirements uncertainty in deterministic systems. AI development operates under a different constraint: the build medium itself is probabilistic. The mismatch is structural. Once that is understood, most of the friction teams experience with Agile in AI systems becomes predictable.

I ran a deliberate experiment on my own pipeline work to test this. I imposed a standard Agile structure: time-boxed cycles, defined acceptance criteria, and a clear definition of done. The process held. The work was disciplined. But the acceptance criteria kept shifting—not because requirements changed, but because what “working” meant could not be defined until the model revealed its behavior. Within two cycles, I had shipped iterations that met every process requirement and failed every meaningful evaluation. The process optimized execution. The system required understanding.

This is not an execution failure. It is a category error.

The Narrative About Agile Is Incomplete

The dominant explanation for Agile dysfunction is poor implementation. Organizations adopt the rituals without the discipline. Velocity becomes a proxy for productivity. Retrospectives degrade into ceremony. By this account, Agile works in principle; the industry simply fails to execute it correctly.

There is truth here. Most large organizations run diluted versions of Agile. But this explanation locates the problem at the surface. It assumes the underlying model is still correct. It is not. Agile was never a universal framework. It was a precise response to a specific type of uncertainty.

Understanding that original constraint is what matters.

Agile Assumes Deterministic Execution

Agile was designed for requirements uncertainty in a deterministic build medium. That assumption is foundational. In traditional software, once requirements are known, execution is predictable. Code behaves deterministically. Given a clear specification, engineers can estimate scope, sequence work, and converge toward a defined outcome.

The problem Agile solved was misalignment. Teams built the wrong thing because requirements were poorly understood upfront. Agile compressed the feedback loop. Ship small increments, learn from users, adjust. The uncertainty was in what to build. The act of building was stable.

That distinction is rarely stated explicitly because it was taken for granted. It is, however, load-bearing. The entire structure of sprints, estimation, and acceptance criteria depends on it.

AI Breaks the Assumption

AI systems invert this model. Requirements are often clearer than they have ever been. Classify this input. Detect this pattern. Summarize this document. The ambiguity is not in the objective. It is in the system’s behavior.

The build medium is probabilistic.

Small changes in prompts, training data, or model configuration produce non-linear shifts in output. Systems that pass evaluation today can fail tomorrow under distribution shift. Pipelines that appear stable degrade silently as upstream data evolves. You are not executing a plan. You are interrogating a system.

This changes what “progress” means. You cannot define “done” purely in terms of feature completion because correctness is not binary. Outputs exist on distributions. What matters is not whether a task passes, but how the system behaves across a range of inputs.
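A distributional view of correctness can be made concrete. The sketch below (all names are illustrative, not a real evaluation API) reports not just an overall pass rate but the weakest input slice, since a single pass/fail verdict hides tail behavior:

```python
from statistics import mean

def behavioral_report(outputs_ok, slices):
    """Correctness over a distribution: the overall pass rate plus the
    weakest input slice. `outputs_ok` maps example id -> bool;
    `slices` maps a slice name -> list of example ids in that slice."""
    overall = mean(1.0 if outputs_ok[i] else 0.0 for i in outputs_ok)
    per_slice = {name: mean(1.0 if outputs_ok[i] else 0.0 for i in ids)
                 for name, ids in slices.items()}
    worst = min(per_slice, key=per_slice.get)
    return overall, worst, per_slice[worst]
```

A system can look healthy in aggregate while one slice quietly fails; tracking the worst slice makes that visible.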

Agile’s unit of progress is a shipped feature. In AI systems, the unit of progress is a reduction in uncertainty about model behavior.

The System Is an Evaluation Loop

Once the constraint is probabilistic behavior, the development system reorganizes around evaluation.

The core loop is:

hypothesis → implementation → evaluation → updated belief

Work begins with a belief about how the system behaves or could be improved. Implementation produces a candidate change. Evaluation measures its effect across a distribution of inputs. The outcome is not a feature. It is a decision: keep the change, revise it, or discard it.
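The loop above can be sketched in code. This is a minimal illustration, not a prescribed framework; the function and threshold names are assumptions:

```python
from statistics import mean

def evaluate_change(change, dataset, metric, baseline_score, margin=0.0):
    """Score a candidate change across a distribution of inputs.
    The outcome is a decision -- keep, revise, or discard -- not a feature.
    `change` maps an input to an output; `metric` scores (output, target)."""
    score = mean(metric(change(x), y) for x, y in dataset)
    if score > baseline_score + margin:
        return "keep", score       # clear win over the current baseline
    if score >= baseline_score:
        return "revise", score     # no regression, but no clear signal
    return "discard", score        # the belief did not survive contact
```

Note that the return value is a decision plus evidence, not an artifact: that is the shape the loop forces on the work.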

This loop does not close on a fixed cadence. It closes when sufficient evidence accumulates. Sometimes that happens quickly. Sometimes it requires multiple iterations because the signal is weak or noisy.

The bottleneck is not producing changes. It is measuring what those changes actually did.
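"Closing when sufficient evidence accumulates" can itself be operationalized. One simple approach, sketched here under assumed thresholds, is to keep sampling evaluations until the estimate of the pass rate is tight enough to act on:

```python
from statistics import mean, stdev

def close_loop_when_confident(sample_outcome, max_evals=5000,
                              half_width=0.02, min_n=30):
    """Evaluate until the ~95% confidence interval on the pass rate is
    narrower than `half_width`, instead of stopping at a fixed cadence.
    `sample_outcome` returns 1.0 (pass) or 0.0 (fail) for one input."""
    outcomes = []
    while len(outcomes) < max_evals:
        outcomes.append(sample_outcome())
        n = len(outcomes)
        if n >= min_n:
            se = stdev(outcomes) / n ** 0.5
            if 1.96 * se <= half_width:   # interval narrow enough to decide
                break
    return mean(outcomes), len(outcomes)
```

A strong, consistent signal closes the loop in a few dozen evaluations; a weak or noisy one takes hundreds. That variability is exactly why the loop resists a fixed cadence.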

Why Agile Misaligns at the Process Level

Agile encodes a different loop:

plan → build → deliver → validate

This assumes that validation is straightforward. Either the feature meets its acceptance criteria or it does not. That assumption collapses in AI systems. Validation becomes statistical, delayed, and often ambiguous.

This is why common Agile constructs degrade.

Estimation fails because the time required depends on unknown system behavior. Acceptance criteria fail because correctness is distributional, not binary. Velocity fails because output volume is decoupled from system improvement.

The process continues to run, but it optimizes the wrong variable. Teams produce artifacts on schedule while remaining uncertain about system quality.

The Real Bottleneck Is Evaluation Infrastructure

The shift to probabilistic systems moves the constraint into evaluation.

To make progress, teams need representative datasets, consistent evaluation metrics, reproducible experiment tracking, and visibility into failure modes. Without this, iteration becomes guesswork. With it, iteration becomes directional.

This is the deeper structural shift. The limiting factor is not engineering throughput. It is the ability to observe, measure, and interpret system behavior. The faster a team can close the loop from change to understanding, the faster it improves the system.

Process alone does not solve this. Infrastructure does.
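A minimal piece of that infrastructure is a reproducible experiment record: pin the configuration and the exact evaluation set so a result can be reproduced and compared later. The sketch below uses illustrative field names, not a real tracking API:

```python
import hashlib
import json
import time

def record_experiment(change_desc, config, dataset_ids, scores, path=None):
    """Append-only experiment record. Fingerprinting the sorted dataset
    ids ensures two runs on the same evaluation set are comparable,
    regardless of input order."""
    payload = {
        "change": change_desc,
        "config": config,
        "dataset_fingerprint": hashlib.sha256(
            json.dumps(sorted(dataset_ids)).encode()).hexdigest()[:12],
        "scores": scores,
        "timestamp": time.time(),
    }
    if path:
        with open(path, "a") as f:
            f.write(json.dumps(payload) + "\n")
    return payload
```

Even this small amount of structure turns "I think the new prompt is better" into a comparison between two records evaluated on a verifiably identical set.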

Why Agile Persists Anyway

If Agile misaligns with AI systems, why does it persist?

Because it solves a different problem: organizational coordination.

Agile provides a reporting interface. It translates work into units that management can track—stories, points, velocity, commitments. These abstractions work when output is predictable. They break when progress is non-linear and evidence-driven.

AI development resists this form of abstraction. A week of work may produce no visible artifact but significantly improve understanding. Another week may produce multiple features that degrade system performance. From a reporting perspective, this is difficult to reconcile.

Organizations continue to apply Agile not because it fits the system, but because it fits how organizations allocate resources and measure activity.

This creates a tension. The process that makes work legible is not the process that makes the system improve.

The Coordination Constraint Extends Here

This connects to a broader shift in software economics. As I argued in Why Small Teams Will Move Faster in the AI Era, AI compresses the cost of producing code while leaving coordination costs largely intact. The binding constraint moves away from production and toward evaluation and decision-making.

Agile was designed to optimize throughput under requirements uncertainty. AI systems require optimizing learning under behavioral uncertainty. Once the constraint moves, the process built for the previous constraint begins to misalign.

Small Teams Have a Structural Advantage

This mismatch amplifies with scale.

Small teams operate with shared context and short feedback loops. They can run tight experiment cycles without translating every action into formal artifacts. The distance between observation and decision is minimal.

Large teams require coordination layers. Work must be decomposed, tracked, and communicated across boundaries. When the unit of work is an experiment rather than a feature, this overhead slows the loop that matters.

The advantage of small teams is not just speed of execution. It is speed of understanding.

What Replaces Agile in Practice

The replacement is not the absence of process. It is a different process shape.

Cycles remain short, but they are structured around experiments, not features. Planning begins with hypotheses, not requirements. Reviews focus on evaluation results, not deliverables. Retrospectives ask whether the team interpreted data correctly, not whether it met commitments.

Some Agile elements survive in lighter form. Frequent syncs remain useful. Short planning horizons still match the rate of change. But estimation-heavy constructs and velocity tracking lose relevance because they measure the wrong thing.

The organizing principle shifts from delivery cadence to evaluation cadence.
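An experiment-shaped cycle can be written down as data. This is one hypothetical shape for such a plan, not a standard artifact; every field name here is an assumption:

```python
# A cycle plan organized around hypotheses rather than features.
cycle_plan = {
    "hypotheses": [
        {
            "id": "H1",
            "belief": "few-shot examples reduce format errors",
            "evaluation": "format-error rate on the held-out set",
            "decision_rule": "keep if error rate drops by >= 2 points",
        },
    ],
    "review_focus": "evaluation results",           # not deliverables
    "retro_question": "did we interpret the data correctly?",
}
```

The useful property is that each item carries its own decision rule, so the review meeting is about evidence, not completion status.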

The Constraint Has Moved

Every engineering process encodes an assumption about where uncertainty lives.

Waterfall assumed requirements were stable and execution was variable. Agile corrected that by treating requirements as uncertain and execution as stable. AI systems require a second correction. Execution itself becomes uncertain.

Agile did not fail. It is being applied to a system it was not designed for.

The teams that move fastest in the AI era will be the ones that recognize where the constraint has moved—and build their process, infrastructure, and decision-making around reducing uncertainty in system behavior, not around increasing the rate of feature delivery.


This post is part of an ongoing series on engineering for AI systems. In Why Small Teams Will Move Faster in the AI Era, I argued that AI shifts the binding constraint in software delivery from production to coordination. This piece extends that argument: once the constraint moves, the process built for the previous constraint begins to fail. Agile solved requirements uncertainty. AI systems are defined by behavioral uncertainty.
