Definition of done - why it’s not done yet

Most Agile teams have a Definition of Done. Very few actually use one that works. So what is DoD in practice? It is supposed to be the shared quality standard that every Product Backlog item must meet before it can be considered a finished Increment — the team's non-negotiable quality contract. In reality, most teams treat their DoD as a forgotten checklist stapled to a wiki page, vague enough to mean anything and enforced loosely enough to mean nothing.

The result is predictable: "done" items that are not really done, technical debt compounding sprint after sprint, and release cycles filled with last-minute surprises. According to the State of Agile Report, 63% of organizations struggle with quality delivery — a number that has climbed 12 points in recent years. A weak Definition of Done is one of the biggest, and most overlooked, reasons why.

This article breaks down what a strong definition of done actually looks like, why most teams get it wrong, and how AI-accelerated delivery demands a stricter, more explicit DoD than ever before.

What is the Definition of Done in Agile?

The Definition of Done (DoD) is a formal description of the state of the Increment when it meets the quality measures required for the product. In the Scrum Guide 2020, it is the commitment attached to the Increment, in the same way that the Sprint Goal is the commitment for the Sprint Backlog and the Product Goal is the commitment for the Product Backlog. The Scrum Team creates it unless it is already an organizational standard, and the Developers are required to conform to it.

The Scrum Guide is explicit: if a Product Backlog item does not meet the Definition of Done, it cannot be released or even presented at the Sprint Review. It goes back to the Product Backlog for future consideration.

In simpler terms, the DoD is the team's quality contract. It defines the minimum standard every piece of work must reach before it can be called complete. It is not about what the feature does — that is what acceptance criteria cover. It is about whether the work meets the team's agreed quality bar for things like code quality, testing, documentation, and deployment readiness.

Definition of Done vs. acceptance criteria

One of the most common sources of confusion in Scrum teams is the difference between the Definition of Done and acceptance criteria. Here is the simplest way to think about it:

  • Definition of Done applies to all work. It is horizontal — the same quality checklist applies to every Product Backlog item regardless of what it does.

  • Acceptance criteria apply to specific user stories or features. They are vertical — they define what a particular piece of work must do to satisfy the customer or Product Owner.

For example, "all code must pass peer review and automated regression tests" belongs in the DoD. "The user can reset their password via email" belongs in acceptance criteria for a specific story.

A strong Agile team needs both. The DoD ensures consistent quality. Acceptance criteria ensure each feature delivers the right value. Problems start when teams blur the two — or worse, rely on acceptance criteria alone and have no real DoD at all.

Why most teams' Definition of Done is broken

If you have coached or led more than a handful of Agile teams, you have seen the pattern. A team creates their DoD during an early retrospective or workshop, writes it on a whiteboard or wiki, and then gradually stops referencing it. Within a few sprints, "done" becomes whatever the team can get past the Sprint Review without anyone objecting.

Here are the most common reasons this happens.

The DoD is too vague

Statements like "code is tested" or "documentation is updated" sound reasonable. But tested how? Unit tests? Integration tests? Performance tests? Updated where? In the API docs? The user guide? The internal wiki? Vague criteria are impossible to enforce consistently because everyone interprets them differently. A DoD that Scrum teams can actually use must be specific and verifiable.

The DoD never evolves

Many teams create their Definition of Done once and never revisit it. But team capabilities change. Infrastructure changes. The product matures. What was an appropriate DoD for a three-person startup building an MVP is not appropriate for a 40-person engineering org with enterprise customers. The DoD should be reviewed and updated regularly — ideally during retrospectives — as the team's ability to deliver quality increases.

Nobody enforces it

A DoD that is not enforced is just decoration. When deadlines tighten, teams under pressure start cutting corners: skipping code reviews, deferring test coverage, pushing items to "done" with known issues. If the Scrum Master does not hold the line on the DoD, it degrades fast. And once it degrades, the team loses the transparency the DoD was supposed to create.

It is disconnected from organizational standards

If the organization has its own quality standards — security requirements, compliance checks, accessibility criteria — these need to be reflected in every team's DoD. The Scrum Guide states that if the DoD for an Increment is part of the standards of the organization, all Scrum Teams must follow it as a minimum. Teams that create a DoD in isolation often miss these standards entirely, which creates friction and rework at release time.

Definition of Done examples by team type

One reason teams struggle with their DoD is that most guides offer generic examples. A strong definition of done looks very different depending on what your team actually builds. Here are concrete examples for different contexts.

Software development team DoD

  1. Code is peer-reviewed and approved by at least one other developer

  2. All unit tests pass with minimum 80% coverage on new code

  3. Integration tests pass in the staging environment

  4. No critical or high-severity bugs remain open

  5. API documentation is updated for any changed endpoints

  6. Feature is deployed to the staging environment and smoke-tested

  7. Performance does not degrade beyond established baselines

  8. Security scan completed with no new critical vulnerabilities

Data or machine learning team DoD

  1. Model is trained and validated against the agreed test dataset

  2. Model performance meets or exceeds agreed metrics (accuracy, precision, recall)

  3. Data pipeline changes are tested with sample and edge-case data

  4. Model drift monitoring is configured for production

  5. Documentation covers training data sources, feature engineering, and model parameters

  6. Bias and fairness checks completed

  7. Model is deployed to staging and outputs verified against expected behavior
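Item 2 in the list above is the easiest one to turn into a pass-or-fail check. The following is a minimal sketch, assuming the team has agreed on a minimum value per metric; the `MetricGate` type, the `failed_gates` function, and the threshold values are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MetricGate:
    """One agreed metric and the minimum value the team accepts (hypothetical)."""
    name: str
    minimum: float


def failed_gates(measured: Dict[str, float], gates: List[MetricGate]) -> List[str]:
    """Return the names of gates that are missing or below their minimum."""
    failures = []
    for gate in gates:
        value = measured.get(gate.name)
        # A metric that was never measured counts as a failure, not a pass.
        if value is None or value < gate.minimum:
            failures.append(gate.name)
    return failures


# Illustrative thresholds only; each team would agree on its own.
GATES = [
    MetricGate("accuracy", 0.90),
    MetricGate("precision", 0.85),
    MetricGate("recall", 0.80),
]

measured = {"accuracy": 0.93, "precision": 0.88, "recall": 0.76}
print(failed_gates(measured, GATES))  # ['recall']
```

Treating an unmeasured metric as a failure is a deliberate choice here: it keeps "we forgot to measure it" from silently counting as done.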

Design or UX team DoD

  1. Design meets the requirements defined in the user story

  2. Design reviewed and approved by Product Owner

  3. All design assets exported in required formats

  4. Responsive behavior defined for all target breakpoints

  5. Accessibility review completed (WCAG 2.1 AA minimum)

  6. Design system components updated if new patterns were introduced

  7. Developer handoff documentation is complete

These examples are not exhaustive — each team should adapt their DoD to their specific product, tech stack, and organizational context. The important principle is specificity. Every item on the list should be something a team member can objectively verify.

How AI-accelerated delivery changes the Definition of Done

The rise of AI-assisted development is quietly breaking the traditional Definition of Done. When AI tools can generate code, write tests, produce documentation, and suggest refactors in minutes, delivery speed increases dramatically. But speed without quality controls is just faster technical debt.

Here is the problem: most existing DoD checklists were designed for a world where a developer manually writes, tests, and documents every change. In an AI-accelerated workflow, the volume and pace of output are fundamentally different, and the DoD must adapt.

New DoD criteria for AI-augmented teams

Teams integrating AI into their workflows need to add explicit quality gates that address AI-specific risks:

  • AI-generated code review. Code produced by AI tools (Copilot, Cursor, Claude, etc.) must receive the same — or more rigorous — peer review as human-written code. AI-generated code can introduce subtle bugs, security vulnerabilities, or anti-patterns that look correct at first glance.

  • Test validation for AI outputs. Tests written or suggested by AI should be reviewed for meaningful coverage. AI tools sometimes generate tests that pass but do not actually validate the right behavior — they test the implementation rather than the intent.

  • Intellectual property and licensing checks. AI-generated code can inadvertently reproduce licensed or copyrighted code from training data. Teams working in regulated or IP-sensitive environments should include a licensing review step.

  • Prompt and context documentation. When AI tools are used to generate significant portions of code or architecture, documenting the prompts, context, and rationale helps future developers understand the "why" behind the code, not just the "what."

  • Drift detection for AI-powered features. For teams building products that use AI models, the DoD should include monitoring for model drift — because an AI feature that was "done" at deployment can degrade over time as data patterns shift.
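To make the drift criterion concrete, here is a rough sketch of one way a team might measure drift on a single numeric input feature, using the population stability index (PSI). The choice of PSI, the function name, the bin count, and the 0.25 alert level (a commonly cited rule of thumb, not something this article prescribes) are all assumptions for illustration.

```python
import math
from typing import List


def population_stability_index(expected: List[float], actual: List[float], bins: int = 10) -> float:
    """PSI between a baseline sample and a recent sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate baseline: all values identical

    def proportions(values: List[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [i / 100 for i in range(100)]   # feature values seen at deployment
recent = [v + 0.5 for v in baseline]       # the same feature after a shift

ALERT_LEVEL = 0.25  # assumed rule-of-thumb threshold
print(population_stability_index(baseline, baseline) < ALERT_LEVEL)  # True
print(population_stability_index(baseline, recent) > ALERT_LEVEL)    # True
```

A DoD item built on something like this would read "drift metric and alert level are configured for each monitored feature", which is verifiable in a way that "drift is monitored" is not.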

As one Agile Alliance article noted, teams developing AI products need to redefine "done" from output to outcome evidence — proving not just that the model works, but that it works correctly and does not learn the wrong things.

Why your DoD needs to be stricter, not looser

There is a tempting argument that AI makes delivery so fast that teams can afford to be less rigorous — ship fast, fix later. This is a trap.

When delivery speed doubles, a weak DoD lets roughly twice as much unfinished work through in the same time. Every incomplete item, every unreviewed piece of AI-generated code, and every skipped test compounds faster. The teams that are successfully integrating AI into Agile workflows are not loosening their DoD. They are making it more explicit and more automated.

The best approach is to automate as many DoD checks as possible. Static analysis, security scanning, test coverage thresholds, accessibility audits — these should run automatically in CI/CD pipelines so the DoD is enforced by tooling rather than relying on human discipline alone. AI can even help here: tools can flag when a pull request does not meet DoD criteria before it reaches a reviewer.
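As a sketch of what "enforced by tooling" can mean, the snippet below runs a set of named checks and turns the result into a process exit code, which is how most CI systems decide whether a step passed. The `run_dod_gate` function and the example checks are hypothetical stand-ins; in a real pipeline each check would read the report produced by the team's own analysis, scanning, or coverage tool.

```python
from typing import Callable, Dict

# Each check returns True on pass. These are placeholders, not real tool integrations.
Check = Callable[[], bool]


def run_dod_gate(checks: Dict[str, Check]) -> int:
    """Run every check, print one line per result, and return an exit code (0 = all passed)."""
    failed = []
    for name, check in checks.items():
        passed = check()
        print(f"[{'PASS' if passed else 'FAIL'}] {name}")
        if not passed:
            failed.append(name)
    return 1 if failed else 0


EXAMPLE_CHECKS = {
    "static analysis reports no errors": lambda: True,
    "security scan reports no new critical findings": lambda: True,
    "test coverage meets threshold": lambda: False,
    "accessibility audit passes": lambda: True,
}

exit_code = run_dod_gate(EXAMPLE_CHECKS)
# A CI job would finish with sys.exit(exit_code) so that a failed check blocks the merge.
```

Running every check instead of stopping at the first failure gives the author the full list of gaps in one pass.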

How to build a Definition of Done that actually works

If your team's current DoD is not delivering the quality you need, here is a practical framework for rebuilding it.

Step 1: audit your current state

Before writing a new DoD, look at what is actually happening. Pull the last 10 items your team marked as "done" and ask:

  • Did any of these come back as bugs or rework within two sprints?

  • Were any of these released with known quality gaps?

  • Did any skip a step that everyone assumed was being done?

This audit reveals the gap between your written DoD and your actual DoD. That gap is where quality problems live.
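The audit can be as simple as a spreadsheet, but a small script makes the gap easy to recompute each sprint. This is a minimal sketch assuming the three audit questions are answered by hand for each item; the `DoneItem` fields, the item keys, and the sample answers are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DoneItem:
    """One item the team marked as done, with the three audit answers (hypothetical shape)."""
    key: str
    came_back_within_two_sprints: bool = False
    released_with_known_gaps: bool = False
    skipped_assumed_step: bool = False

    def has_gap(self) -> bool:
        return (self.came_back_within_two_sprints
                or self.released_with_known_gaps
                or self.skipped_assumed_step)


def audit(items: List[DoneItem]) -> dict:
    """Summarize how many 'done' items failed at least one audit question."""
    flagged = [item.key for item in items if item.has_gap()]
    rate = len(flagged) / len(items) if items else 0.0
    return {"reviewed": len(items), "flagged": flagged, "gap_rate": rate}


# Made-up answers for the last 10 items.
last_ten = [DoneItem(f"ITEM-{n}") for n in range(1, 11)]
last_ten[1].came_back_within_two_sprints = True
last_ten[4].released_with_known_gaps = True
last_ten[4].skipped_assumed_step = True
last_ten[8].skipped_assumed_step = True

report = audit(last_ten)
print(report["flagged"], report["gap_rate"])  # ['ITEM-2', 'ITEM-5', 'ITEM-9'] 0.3
```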

Step 2: make every item specific and verifiable

Replace vague statements with specific, pass-or-fail criteria. Instead of "code is tested," write "all new code has unit tests with a minimum of 80% branch coverage, and all existing regression tests pass." Instead of "documentation is updated," write "API changelog is updated and user-facing help articles are revised if the feature changes existing behavior."
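One useful test of whether a criterion is specific enough is whether it can be written as a function that returns true or false. The sketch below does that for the rewritten testing criterion; the function name and parameters are hypothetical, and the inputs are assumed to come from whatever coverage and test reports the team already produces.

```python
def meets_test_criterion(
    covered_branches: int,
    total_branches: int,
    regression_failures: int,
    minimum_coverage: float = 0.80,
) -> bool:
    """True when new code reaches the branch-coverage minimum and no regression test fails."""
    if regression_failures > 0:
        return False
    if total_branches == 0:
        # Assumption: a change with no new branches has nothing to cover.
        return True
    return covered_branches / total_branches >= minimum_coverage


print(meets_test_criterion(40, 50, 0))  # True  (exactly 80%)
print(meets_test_criterion(39, 50, 0))  # False (78%)
print(meets_test_criterion(50, 50, 2))  # False (regression failures)
```

"Code is tested" cannot be expressed this way without first deciding what it means, which is exactly the conversation the team needs to have.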

Step 3: layer it by scope

Consider having DoD criteria at multiple levels:

  • Story-level DoD: The minimum quality every individual Product Backlog item must meet

  • Sprint-level DoD: Additional criteria the Sprint Increment must meet as a whole (e.g., full regression suite passes, release notes drafted)

  • Release-level DoD: Criteria for production readiness (e.g., load testing completed, rollback plan documented, stakeholder sign-off)

This layered approach prevents the DoD from becoming an overwhelming mega-list while still ensuring quality at every stage.
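The layering can be sketched as a small data structure. This version assumes the levels are cumulative, so a release must also satisfy the sprint-level and story-level criteria; that is one reasonable reading of the approach, not the only one. The `Scope` enum and `criteria_for` function are hypothetical, and the story-level entries are borrowed from the software team example earlier in this article.

```python
from enum import IntEnum
from typing import Dict, List


class Scope(IntEnum):
    STORY = 1
    SPRINT = 2
    RELEASE = 3


DOD: Dict[Scope, List[str]] = {
    Scope.STORY: [
        "Code is peer-reviewed and approved by at least one other developer",
        "All unit tests pass with minimum 80% coverage on new code",
    ],
    Scope.SPRINT: [
        "Full regression suite passes",
        "Release notes drafted",
    ],
    Scope.RELEASE: [
        "Load testing completed",
        "Rollback plan documented",
        "Stakeholder sign-off",
    ],
}


def criteria_for(scope: Scope) -> List[str]:
    """Return the criteria for a scope plus those of every narrower scope (assumed cumulative)."""
    return [criterion for level in Scope if level <= scope for criterion in DOD[level]]


print(len(criteria_for(Scope.STORY)))    # 2
print(len(criteria_for(Scope.RELEASE)))  # 7
```

Keeping each level short is what stops the combined list from turning into the mega-list the layering is meant to avoid.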

Step 4: automate what you can

Every DoD criterion that can be enforced by tooling should be. Automated checks are more consistent than human checks, and they scale as delivery speed increases — which is especially critical for teams using AI-assisted development.

Step 5: review and evolve it regularly

The DoD should be a standing topic in retrospectives. Ask the team: "Is our Definition of Done still protecting us from quality issues? What should we add? What is no longer relevant?" A DoD that evolves with the team's maturity is far more effective than a static document.

The role of the Scrum Master in protecting the DoD

The Scrum Master is the primary guardian of the Definition of Done. This does not mean writing the DoD for the team: the Scrum Team creates it, and the Developers are accountable for adhering to it. But it does mean:

  • Coaching the team on why the DoD matters and how to improve it

  • Holding the line when pressure mounts to skip criteria for the sake of hitting a Sprint Goal

  • Facilitating conversations during retrospectives about DoD effectiveness

  • Working with the organization to ensure organizational standards are reflected in the DoD

One of the most impactful things a Scrum Master can do is simply ask, in every Sprint Review: "Does this Increment fully meet our Definition of Done?" That single question creates the transparency and accountability the DoD is designed to provide.

For Scrum Masters navigating both traditional Agile challenges and the new demands of AI-accelerated teams, AgileRestart's training programs — an Agile training and implementation framework designed for the age of AI — provide hands-on coaching on modernizing practices like the DoD for a world where humans and AI work side by side.

Common mistakes to avoid with your DoD

Even well-intentioned teams fall into recurring traps with their Definition of Done:

  • Making it too long. A 30-item checklist creates fatigue and leads to items being rubber-stamped rather than genuinely verified. Keep it focused on the criteria that genuinely protect quality.

  • Treating it as optional. The DoD is not a suggestion. If work does not meet the DoD, it is not done — period. Making exceptions "just this once" is how quality erosion begins.

  • Copying another team's DoD. What works for a fintech startup will not work for an enterprise healthcare team. The DoD must be tailored to your team's product, tech stack, and organizational context.

  • Ignoring it during planning. The DoD should inform Sprint Planning. If the team does not have capacity to meet the DoD for everything in the Sprint Backlog, the Sprint Backlog is too big. A realistic Sprint plan accounts for the time required to meet quality standards, not just feature development.

The Definition of Done is your team's quality floor

The Definition of Done is not a bureaucratic hurdle. It is the single most important tool your team has for delivering consistent, trustworthy Increments. When the DoD is clear, specific, enforced, and evolving, teams spend less time on rework, have fewer release surprises, and build genuine confidence in what "done" actually means.

In 2026, the stakes are higher than ever. AI is accelerating delivery speed across the industry, which means teams without a strong DoD will accumulate quality problems faster than ever before. The organizations that thrive will be the ones that treat their DoD as a living, actively maintained quality standard — not a dusty artifact from an onboarding workshop.

If your Agile transformation has stalled, your teams struggle with inconsistent quality, or you are figuring out how to integrate AI into your workflows without sacrificing standards, this is exactly what AgileRestart's training programs and AI-readiness assessments are built to solve. A stronger Definition of Done is not just a better checklist — it is the foundation for delivery you can trust.
