Agile turns 25: TDD proves crucial for AI coding as security falls dangerously behind

TDD matters more than ever with AI: write tests first, then code. Standardize interfaces, gate agent changes in CI, and tighten security to ship with fewer surprises.

Published on: Feb 21, 2026

From Agile to AI: Why Test-Driven Development Is Your Best Lever Right Now

Twenty-five years after the Agile Manifesto, a workshop hosted by Martin Fowler and Thoughtworks took a hard look at AI-native software development. The headline: test-driven development (TDD) matters more than ever. The session ran under the Chatham House Rule, but the themes were clear: engineering discipline isn't optional just because an agent writes the code.

Here's what matters for teams building with AI today, and how to put it to work without slowing your org to a crawl.

Why TDD + AI belongs together

TDD flips the flow: write the test first, then the code. That single move blocks a common AI failure mode: agents "prove" broken behavior with matching, equally broken tests. If the test predates the code, the agent can't game it (a minimal sketch follows the list below).

  • Clarifies intent: tests become the spec AI must satisfy.
  • Prevents silent drift: failing tests flag when an agent changes behavior you didn't ask for.
  • Supports refactoring: ship changes fast without guessing what you broke.
  • Scales oversight: reviewers focus on test quality and coverage instead of diff spelunking.
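
Here is a minimal sketch of the flow, assuming a hypothetical parse_amount function the agent is asked to implement. The tests are written and committed before any generation, so a green run means the agent met your spec rather than one it wrote for itself.

```python
# test_parse_amount.py -- written and committed BEFORE the agent generates code.
# parse_amount and its behavior here are hypothetical, for illustration only.
import pytest

from billing import parse_amount  # module the agent will create


def test_parses_plain_dollars():
    assert parse_amount("$12.50") == 1250  # amounts returned in cents


def test_rejects_negative_amounts():
    with pytest.raises(ValueError):
        parse_amount("$-3.00")
```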

Discipline doesn't disappear; it moves

When agents write code, rigor shifts to test design, contracts, architecture, data, and integration points. The report noted a familiar trap: teams add AI and expect speed, but the bottleneck moves to dependencies, architecture reviews, and product decisions.

Result without process change: same delivery speed, more frustration. To fix it, re-center governance around inputs and interfaces, not just code output.

  • Adopt explicit component contracts (APIs, schemas, error models) before coding starts.
  • Store decisions in lightweight ADRs and schedule brief architecture checkpoints.
  • Define golden datasets and fixtures so agents optimize for the right outcomes.
  • Gate all agent changes behind tests in CI. No green tests, no merge (see the sketch after this list).
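
To make the first and last bullets concrete, here is a minimal sketch using Pydantic; the OrderCreated model and its fields are illustrative assumptions, not a prescribed schema. In practice the contract lives in its own module and the test in your suite.

```python
# Illustrative contract plus CI gate; all names here are assumptions.
import pytest
from pydantic import BaseModel, Field, ValidationError


class OrderCreated(BaseModel):
    """Explicit component contract, agreed before any code is generated."""

    order_id: str = Field(min_length=1)
    amount_cents: int = Field(ge=0)
    currency: str = Field(pattern=r"^[A-Z]{3}$")


def test_contract_rejects_negative_amounts():
    # CI gate: agents may change internals, but breaking the contract fails CI.
    with pytest.raises(ValidationError):
        OrderCreated(order_id="o-1", amount_cents=-5, currency="USD")
```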

Standardize or let things drift?

Multiple agents will happily introduce conflicting patterns. Some variation is fine, but pick your lanes. Standardize the parts that compound across teams; allow freedom where it's local.

  • Standardize: API conventions, logging, error handling, security baselines, observability.
  • Allow variation: internal module structure and small-scale implementation details.
  • Provide templates/scaffolds so agents generate code that aligns with your defaults (one such scaffold is sketched below).
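
As one illustration of the scaffold idea (the error model and logger conventions here are assumptions, not a standard): a small template module agents start from, so generated handlers inherit the house conventions for errors and logging while internals stay free to vary.

```python
# scaffold/service_template.py -- starting point handed to agents; conventions assumed.
import logging

# Standardized: one logger per service, structured fields, no print().
logger = logging.getLogger("svc")


class ServiceError(Exception):
    """Standardized error model: machine-readable code plus human message."""

    def __init__(self, code: str, message: str) -> None:
        self.code = code
        super().__init__(message)


def handle(request: dict) -> dict:
    # Allowed variation: internal logic. Standardized: error and log shape.
    try:
        ...  # agent fills in the implementation here
        return {"status": "ok"}
    except ServiceError as err:
        logger.warning("request failed", extra={"error_code": err.code})
        return {"status": "error", "code": err.code}
```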

Team dynamics: seniors supervise, juniors ship faster with tools

Experienced engineers tend to be better at supervising agents because they see system-level tradeoffs. Many juniors pick up AI tools quicker because they're not anchored to old habits. Pair them.

  • Give seniors ownership of architecture, contracts, and test strategy.
  • Let juniors drive agent prompts, generation loops, and red-green-refactor cycles.
  • Use checklists for reviews: contract compliance, test relevance, observability, and security.

Security is dangerously behind

The pattern is familiar: "we'll fix security later." With AI, "later" comes with interest. Established tools and org structures are cracking under the throughput of AI-assisted work. Close the gap now.

  • Shift-left security: run SAST/DAST and secret scanning on every agent PR.
  • Threat-model prompts, tools, and data flows, not just the runtime service.
  • Enforce dependency and SBOM checks; block known-bad packages by policy-as-code.
  • Require provenance and signed builds (e.g., SLSA levels) for agent-generated code.
  • Add security test suites (negative tests, fuzzing) to your TDD base; map to OWASP ASVS. A minimal sketch follows this list.
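
For the last bullet, here is a minimal sketch of a negative security test plus property-based fuzzing with Hypothesis; render_comment and its escaping contract are hypothetical, for illustration only.

```python
# test_security.py -- negative tests and fuzzing in the TDD base (ASVS-style checks).
# render_comment is a hypothetical HTML-escaping function used for illustration.
from hypothesis import given, strategies as st

from views import render_comment


def test_script_tags_are_escaped():
    # Negative test: a classic XSS payload must never survive rendering.
    assert "<script>" not in render_comment("<script>alert(1)</script>")


@given(st.text())
def test_no_input_yields_raw_script_tags(payload):
    # Property-based fuzzing: for arbitrary input, output contains no raw tag.
    assert "<script" not in render_comment(payload).lower()
```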

The trust problem: AI is non-deterministic

People are feeling an identity shift in their work, and trust is still an open problem. You won't get perfect determinism, but you can box the uncertainty.

  • Stabilize generation: fix model versions, pin prompts, control temperature, and snapshot artifacts.
  • Build evaluation harnesses with golden tests and performance thresholds (see the sketch after this list).
  • Roll out with feature flags and canaries, watch telemetry, and rehearse fast rollback.
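
A sketch of the golden-test idea, assuming placeholder paths and a placeholder generate() call: snapshot a known-good output and fail the build when generated output drifts.

```python
# test_golden.py -- evaluation harness with a snapshotted golden artifact.
# generate() is a placeholder: swap in your pinned-model, pinned-prompt call.
import json
from pathlib import Path

GOLDEN = Path("golden/summary.json")  # snapshot committed alongside the code


def generate(prompt: str) -> dict:
    raise NotImplementedError("call your pinned model version here")


def test_output_matches_golden_snapshot():
    golden = json.loads(GOLDEN.read_text())
    actual = generate(golden["prompt"])
    # Exact match gates drift; relax to a scored threshold for fuzzier outputs.
    assert actual == golden["output"]
```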

Do we need a new manifesto?

Short answer from Fowler: too early. People are still trying things. Instead of a grand statement, double down on working practices: tests first, clear interfaces, and fast feedback loops.

A 30-day plan to make this real

  • Week 1: Pick one service. Write high-value tests first (happy paths, critical edge cases). Create a minimal style guide and CI gate on tests.
  • Week 2: Introduce an AI coding agent behind those tests. Add ADRs for any design choices. Start dependency and container scanning.
  • Week 3: Add negative and property-based tests. Do a focused threat model. Turn on secrets scanning and provenance checks.
  • Week 4: Ship a canary guarded by feature flags (a minimal flag sketch follows the plan). Track lead time, change failure rate, and mean time to restore. Run a retro and set your standardization boundaries.
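
For the Week 4 canary, a minimal sketch of a deterministic percentage-rollout flag; the function names and the 5% default are assumptions, not a recommendation.

```python
# flags.py -- deterministic percentage rollout for a canary, illustrative only.
import hashlib


def canary_enabled(user_id: str, rollout_pct: int = 5) -> bool:
    # Hash the user id so the same user stays in or out across requests.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_pct


def handle_request(user_id: str) -> str:
    if canary_enabled(user_id):
        return "new agent-generated path"  # watched by telemetry
    return "stable path"  # rollback is a config change: set rollout_pct to 0
```

Hashing the user id keeps assignment stable across requests, so telemetry compares like with like and rollback never strands users mid-experience.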
