Spikes in Agile: when and how to use them effectively

Over 60% of Agile teams say their sprints routinely include stories they cannot credibly estimate, and the most common culprit is the unknown unknown: a library nobody has evaluated, an API no one has read, or an integration no one has prototyped. That is exactly what spikes in agile are built to solve — time-boxed research activities that convert fog into facts before a team commits to delivery. Done well, a spike turns a scary, unplannable story into a cleanly sized backlog item. Done poorly, it becomes an open-ended science project that burns a whole sprint with nothing to show for it. This guide covers when spikes help, when they hurt, how to write spike stories that produce usable outputs, and how AI coding assistants are quietly reshaping which spikes still earn their keep in 2026.

What is a spike in agile?

A spike in agile is a time-boxed research, prototyping, or investigation activity used to reduce uncertainty before a team commits to a complex user story. The practice originated in Extreme Programming (XP) and is now common among teams working with Scrum, Kanban, and scaled frameworks such as SAFe and LeSS. The output of a spike is always knowledge (an estimate, a decision, a throwaway prototype, or a written recommendation), never a shippable feature.

The idea traces back to Kent Beck's original XP writings in the late 1990s, where a "spike solution" was a very simple program used to explore potential answers to a hard problem. Today, spikes are commonly tracked as their own item type in tools such as Jira and Azure DevOps, and SAFe formally treats them as exploration-type Enabler Stories, alongside enablers for architecture, infrastructure, and compliance work.

Two things make spikes different from ordinary user stories:

  • The goal is learning, not delivery. A spike's definition of done is answering a question, not passing acceptance tests written by a product owner.

  • The time box is non-negotiable. A spike without a hard limit is just unestimated research — the exact thing agile was invented to avoid.

Spike vs. user story at a glance

Teams routinely mix up the two. A user story describes value from the perspective of an end user ("As a clinician, I want…"). A spike exists to let the team deliver that user story safely. One produces shippable software; the other produces information that makes shippable software possible.

When should you run a spike in agile?

Run a spike when at least one of these four signals is present:

  1. The team cannot estimate the story with any reasonable confidence, even after refinement.

  2. The story has more than one plausible technical approach and the choice has long-term consequences (architecture, licensing costs, security posture).

  3. A third-party library, API, data source, or vendor has not yet been validated against the actual constraints of the product.

  4. The user need itself is still fuzzy — assumptions are stacking on assumptions.

If none of these apply, ordinary backlog refinement conversation is almost always cheaper and faster than a spike. Resist the urge to spike your way out of discomfort; a 30-minute whiteboard conversation beats a two-day investigation nearly every time.

Red flags that you are using spikes wrongly

  • Spikes appear in every sprint, for every epic.

  • The output of a spike is "more spikes."

  • Spike tickets sit in progress for 5+ sprints and get re-scoped.

  • Spikes are being used to hide work the team does not want to estimate.

  • The team is spiking because leadership demands certainty that agile cannot provide.

A useful diagnostic from hands-on coaching: if a team's ratio of spikes to delivery stories climbs above roughly 1:8 and stays there, the real problem is usually architectural debt, unclear product strategy, or a dysfunctional discovery process — not a need for more spikes.
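The ratio check is simple arithmetic, and a minimal Python sketch makes it easy to run against your own sprint history. The sprint counts below are invented for illustration, and the three-sprint window used to decide that the ratio "stays there" is an assumption, not part of the coaching heuristic itself.

```python
from dataclasses import dataclass

# Heuristic threshold from the text: roughly 1 spike per 8 delivery stories.
SPIKE_RATIO_THRESHOLD = 1 / 8


@dataclass
class SprintCounts:
    name: str
    spikes: int
    delivery_stories: int

    @property
    def spike_ratio(self) -> float:
        # A sprint with spikes but no delivery stories counts as an infinite ratio.
        if self.delivery_stories == 0:
            return float("inf") if self.spikes else 0.0
        return self.spikes / self.delivery_stories


def ratio_is_stuck_high(sprints: list[SprintCounts], window: int = 3) -> bool:
    """Return True if each of the last `window` sprints exceeds the threshold."""
    if len(sprints) < window:
        return False
    return all(s.spike_ratio > SPIKE_RATIO_THRESHOLD for s in sprints[-window:])


# Hypothetical sprint history, oldest first.
history = [
    SprintCounts("Sprint 21", spikes=1, delivery_stories=12),
    SprintCounts("Sprint 22", spikes=2, delivery_stories=10),
    SprintCounts("Sprint 23", spikes=3, delivery_stories=9),
    SprintCounts("Sprint 24", spikes=2, delivery_stories=8),
]

for sprint in history:
    print(f"{sprint.name}: {sprint.spike_ratio:.2f}")
print("Ratio stuck above 1:8:", ratio_is_stuck_high(history))
```

A single high sprint does not trip the check; only a sustained run does, which matches the point that an occasional spike-heavy sprint is normal while a persistent pattern signals a deeper problem.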

Technical spike vs. functional spike: the two main types

Most teams think only in terms of "technical spikes," but Agile practitioners commonly distinguish at least four types. The two you will encounter daily are the technical spike and the functional spike.

Technical spike: "how do we build it?"

A technical spike answers implementation questions. Examples of questions a technical spike exists to settle:

  • Will our current event bus handle 10× the current message volume without rework?

  • Which OAuth library in our stack best supports PKCE for mobile?

  • Is this new vector database compatible with our existing ORM and ops tooling?

  • What is the throughput of approach A versus approach B under production-like load?

A technical spike often produces a throwaway prototype plus a one-page recommendation. The prototype is deliberately minimal: just enough code to prove or disprove the approach, and never meant for production.

Functional spike: "what should we build?"

A functional spike is customer-facing. It exists when the team or the product owner cannot yet confidently describe the behavior users need. Classic functional spike activities:

  • Clickable prototypes tested with five users.

  • Wizard-of-Oz experiments that simulate an unbuilt feature.

  • Competitive teardowns of similar flows.

  • Concierge MVPs that deliver the outcome manually before automating.

Functional spikes are where product discovery and engineering research blur. They belong to the product owner as much as the engineers.

Architectural and risk spikes

The Scaled Agile Framework (SAFe) itself names the technical and functional forms; teams working at scale often distinguish two more flavors. An architectural spike evaluates cross-cutting design options for an epic (for example, "monolith extraction path for the billing service"). A risk spike targets a specific fear: security, compliance, or regulatory uncertainty that would be disastrous to discover mid-implementation. In regulated sectors such as banking or healthcare, risk spikes are often the most valuable type of spike a team can run.

How to write a spike story that actually produces results

The single biggest driver of spike success is a well-written spike story. A sloppy spike story — "Investigate GraphQL" — is how two-week research tangents are born.

Use this template:

  • Title: Spike: the decision to be made, in one clean sentence.

  • Reason: The user story or epic the spike unblocks, and the concrete question the team cannot answer today.

  • Time box: Hours or days, never more than one sprint.

  • Owner: One named person accountable for producing the output, usually paired with a second reviewer.

  • Exit criteria: The specific deliverable that ends the spike: an estimate, a decision document, a prototype with metrics, or a working proof of concept.

  • Definition of done: A checklist of outputs the team agrees are sufficient to close the ticket.

A good spike story in practice

Spike: Choose a vector database for semantic search in the help center

  • Reason: Unblocks epic HELP-211. The team cannot estimate HELP-213 through HELP-217 until we know whether pgvector, Pinecone, or Weaviate meets our latency target at 2M embeddings.

  • Time box: 3 days.

  • Owner: Priya; reviewer Marcus.

  • Exit criteria: One-page recommendation with p95 latency and cost at 2M embeddings for each option, a decision, and three sized follow-on stories.

  • Definition of done: Recommendation reviewed by the tech lead; follow-on stories in the backlog with estimates; throwaway prototype deleted or moved to a scratch repo.

Contrast that with the bad version: "Spike: Research vector databases." The bad version has no exit criteria, no time box, no owner, and no clear successor stories. It is a near-guaranteed source of scope creep.
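One way to keep the template honest is to encode it as a small data structure and check each spike story against it before it enters a sprint. The sketch below is illustrative only: the `SpikeStory` class, the `template_gaps` helper, and the 10-working-day sprint length are assumptions, not features of Jira or any other tracker.

```python
from dataclasses import dataclass, field


@dataclass
class SpikeStory:
    title: str
    reason: str = ""
    time_box_days: float = 0.0
    owner: str = ""
    reviewer: str = ""
    exit_criteria: str = ""
    definition_of_done: list[str] = field(default_factory=list)


def template_gaps(story: SpikeStory, sprint_length_days: int = 10) -> list[str]:
    """List the template fields a spike story is missing or violating."""
    gaps = []
    if not story.reason:
        gaps.append("no reason")
    if story.time_box_days <= 0:
        gaps.append("no time box")
    elif story.time_box_days > sprint_length_days:
        gaps.append("time box longer than one sprint")
    if not story.owner:
        gaps.append("no owner")
    if not story.exit_criteria:
        gaps.append("no exit criteria")
    if not story.definition_of_done:
        gaps.append("no definition of done")
    return gaps


good = SpikeStory(
    title="Spike: Choose a vector database for semantic search in the help center",
    reason="Unblocks epic HELP-211; HELP-213 through HELP-217 cannot be estimated yet.",
    time_box_days=3,
    owner="Priya",
    reviewer="Marcus",
    exit_criteria="One-page recommendation with p95 latency and cost at 2M embeddings.",
    definition_of_done=[
        "Recommendation reviewed by the tech lead",
        "Follow-on stories in the backlog with estimates",
        "Throwaway prototype deleted or moved to a scratch repo",
    ],
)
bad = SpikeStory(title="Spike: Research vector databases")

print(template_gaps(good))
print(template_gaps(bad))
```

The good story comes back with no gaps, while the bad one fails every check. Whether such a check lives in a script, a ticket template, or a refinement checklist matters less than running it before the spike starts.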

Should you estimate spikes?

This is one of the most contested questions in Scrum. Two defensible approaches exist:

  • Do not assign story points. Spikes are about learning, not value delivery. Counting them in velocity dilutes the metric. Instead, track hours or a fixed time box.

  • Assign small story points (usually 1–3) as a capacity proxy. Some teams do this to preserve capacity planning accuracy.

The pragmatic rule of thumb that repeatedly emerges in Scrum.org community discussion is simple: never use spikes to inflate velocity, and never run a spike that takes more than a single sprint.

Common mistakes that turn spikes into black holes

After years of coaching engineering teams through Agile transformations and fixes, the same spike-related anti-patterns appear over and over.

  1. No exit criteria. The team sets a time box but never defines what "done" looks like. Inevitably the spike ends with "we need more time."

  2. The spike becomes the feature. The prototype code ends up in production. Weeks later, the team is debugging shortcuts that were never meant to ship.

  3. Parallel spikes that compete. Three engineers each run their own version of the same research. You waste capacity and end up with three incompatible opinions.

  4. Spikes used to avoid estimation. Fearful teams tag every uncomfortable story as a spike to avoid committing. Over time, the backlog is all spikes and no delivery.

  5. Spikes without stakeholders. The product owner never sees the output, so the spike's conclusions drift back into the team's private knowledge and fail to influence the roadmap.

  6. No written artifact. The "finding" lives only in a Slack thread. Six weeks later, the decision is re-litigated because nobody can remember it.

The cheapest fix for all six: require that every spike ends with a short written decision record — an Architecture Decision Record (ADR) or a one-page Notion doc — pinned to the user story it unblocks.

Are spikes still relevant when AI can prototype in hours?

This is the most important shift happening in Agile practice right now. AI coding assistants — Cursor, Claude Code, GitHub Copilot, Devin, and the newer AI agents built into Jira and Azure DevOps — have dramatically changed the economics of research work. Some spikes that used to take two days now take two hours. A few genuinely disappear.

What AI kills

  • Pure "how does this library work" spikes. An engineer with an AI pair can read a library, write a toy integration, and benchmark it against an alternative in a single afternoon. Spikes that exist only to let a human skim documentation are increasingly wasteful.

  • Boilerplate prototypes. AI agents produce credible walking skeletons for standard integrations (REST, OAuth, CRUD, queue consumers) fast enough that the time box for a classic technical spike is shrinking from days to hours.

  • Throwaway comparison code. Running the same feature against three libraries used to take a sprint. AI assistants generate each variant in minutes; the bottleneck is now evaluating the output, not producing it.

What AI reinforces

  • Risk and architectural spikes. AI cannot tell you whether an approach will pass your ISO 27001 audit, survive your traffic peaks, or integrate with your legacy IAM system. The human judgment that drives risk and architectural spikes becomes more valuable, not less.

  • Functional spikes. Real user research has not gotten any faster. The discovery work behind a good functional spike — framing problems, running interviews, interpreting qualitative data — is still deeply human.

  • Evaluation discipline. When AI can generate three prototypes by lunchtime, the rate-limiter becomes your ability to choose well. That is exactly what a good spike exit criterion forces.

The practical implication: in 2026, teams should plan fewer, shorter, more focused spikes — and should explicitly use AI to compress the execution phase while keeping the framing, evaluation, and decision phases firmly under human control. This is exactly the kind of modernization FixAgile, an Agile training and implementation framework designed for the age of AI, helps teams embed. Many teams continue running pre-2020 spike patterns because nobody has taught them how to restructure the work around AI-accelerated prototyping.

Spikes in Scrum, Kanban, and SAFe: how the practice differs

Spikes show up across frameworks, but the rules of engagement vary.

Spikes in Scrum

Scrum itself does not define a spike — the Scrum Guide never uses the word. In practice, teams add spikes to the Product Backlog during refinement and pull them into a sprint like any other Product Backlog Item, with the constraint that they are fully contained within one sprint. Scrum.org guidance leans toward not assigning story points to spike stories; if you must, use a tiny number and exclude them from velocity trending.

Spikes in Kanban

Kanban teams handle spikes with less ceremony. A spike is usually its own class of service with a WIP limit (often one) and a defined maximum cycle time. Because Kanban teams do not commit to sprints, the practical time box becomes the policy that says "no spike stays open for more than 5 working days" — or whatever limit your team's flow analytics justify.
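A maximum-age policy like this is easy to check mechanically. The following is a rough sketch that counts Monday-to-Friday days and ignores public holidays; the function names and dates are hypothetical, and the 5-day limit is simply the example figure from the policy above.

```python
from datetime import date, timedelta

MAX_SPIKE_AGE_WORKING_DAYS = 5  # example policy limit; tune to your flow data


def working_days_open(started: date, today: date) -> int:
    """Count Monday-to-Friday days after `started`, up to and including `today`."""
    days = 0
    current = started
    while current < today:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days += 1
    return days


def breaches_policy(started: date, today: date) -> bool:
    """Return True once a spike has been open longer than the policy allows."""
    return working_days_open(started, today) > MAX_SPIKE_AGE_WORKING_DAYS


# Hypothetical spike started on Monday 2 March 2026.
started = date(2026, 3, 2)
print(breaches_policy(started, date(2026, 3, 9)))   # exactly 5 working days open
print(breaches_policy(started, date(2026, 3, 10)))  # 6 working days open
```

Run daily against the board, a check like this surfaces aging spikes before they drift, which is the job a sprint boundary does for Scrum teams.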

Spikes in SAFe

SAFe treats spikes as formal Enabler Stories that belong to the ART Backlog (called the Program Backlog before SAFe 6.0). They are planned during PI Planning, visible on the ART Planning Board (formerly the Program Board), and explicitly estimated so that Agile Release Trains can reason about capacity at scale. A mature SAFe implementation will dedicate a small, predictable percentage of each PI, typically 10–20%, to enablers, of which spikes are one kind.

How to run a spike end-to-end: a practical playbook

Here is the pattern that high-performing teams converge on, refined through years of coaching across regulated and non-regulated industries.

  1. Identify the decision. Name the single decision the spike will unblock. If you cannot write it in one sentence, you are not ready to spike.

  2. Write the spike story. Use the template above. Include an owner, a reviewer, a time box, and concrete exit criteria.

  3. Prep the workspace. Create a branch, a scratch repo, or a sandbox environment so any prototype code never accidentally ships.

  4. Timebox aggressively. Start with the smallest plausible time box. Only extend the time box once, and only with a written justification.

  5. Work in the open. Post daily updates in the team channel — a two-line "here is what I found today" is enough. This prevents surprise scope creep.

  6. Produce the written artifact. Close the spike with an Architecture Decision Record or a one-page recommendation. Link it directly from the user stories the spike unblocks.

  7. Refine the successor stories. In the next refinement session, turn the spike output into estimable user stories. If you cannot, the spike did not finish.

  8. Retrospect the spike. Once per quarter, review your last dozen spikes. Which hit their time box? Which produced decisions that stuck? This is the feedback loop most teams skip.

A well-run spike should feel almost boring: a clean question, a fast investigation, a clear answer, and a tidy handoff back into regular delivery.

Frequently asked questions about spikes in agile

Can a spike span multiple sprints?

No. If a single spike needs more than one sprint, it is too big. Break it into a sequence of smaller spikes, each with its own decision. Long spikes almost always mask a scope problem or a decision nobody wants to make.

Should spikes be demoed at sprint review?

Yes, when the output matters to stakeholders. A three-minute summary of findings and the resulting decision is a great sprint review slot — especially for functional spikes that shape the product direction.

Who writes spike stories?

Typically the engineer who will run the spike, with the product owner and tech lead reviewing. This is different from user stories, which are owned by the product owner. Spike authorship is collaborative because the question itself is usually technical.

How do I convince my manager that spikes are not "wasted" work?

Reframe spikes as cost-of-delay reduction. A two-day spike that saves the team from choosing the wrong vector database is dramatically cheaper than the six weeks required to rip out and replace that database after it fails in production.
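The argument lands better with numbers. Here is a back-of-the-envelope sketch using the two-day and six-week figures from the answer above; the probabilities of picking the wrong database are pure assumptions that you would replace with your own estimate, and costs are counted in working days for one engineer.

```python
SPIKE_COST_DAYS = 2        # two-day spike, from the example in the text
REWORK_COST_DAYS = 6 * 5   # six weeks of rework, assuming 5-day working weeks


def expected_days_saved(p_wrong_without_spike: float,
                        p_wrong_with_spike: float = 0.0) -> float:
    """Expected working days saved by running the spike first.

    Both probabilities are assumptions supplied by the team, not measured values.
    """
    expected_rework_without = p_wrong_without_spike * REWORK_COST_DAYS
    expected_rework_with = p_wrong_with_spike * REWORK_COST_DAYS
    return expected_rework_without - (SPIKE_COST_DAYS + expected_rework_with)


def break_even_probability() -> float:
    """Chance of a wrong choice above which a fully reliable spike pays for itself."""
    return SPIKE_COST_DAYS / REWORK_COST_DAYS


# Assumed 25% chance of choosing the wrong database without a spike.
print(expected_days_saved(0.25))
print(round(break_even_probability(), 3))
```

Under these assumptions the spike pays for itself whenever the chance of a wrong choice exceeds about 1 in 15, which is a low bar for any decision the team genuinely cannot estimate.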

Turn spikes into a competitive advantage, not a time sink

Spikes are one of Agile's most underused leverage points. Used deliberately, they turn terrifying unknowns into routine delivery; used sloppily, they become an excuse for never finishing anything. The teams that win in 2026 will be the ones that run fewer, shorter, better-framed spikes in agile — and that use AI coding assistants to compress execution while protecting the human judgment at the heart of risk, architectural, and functional research.

If your team's spikes routinely blow past their time boxes, stack up as unfinished tickets, or fail to translate into confident delivery estimates, the pattern is almost never fixed by telling people to "try harder." It is fixed by redesigning the practice — exit criteria, written artifacts, AI-assisted prototyping, and the refinement rituals around them. This is exactly the kind of hands-on modernization FixAgile's training programs are built to deliver. If your Agile transformation has stalled, your ceremonies have slipped into theater, or your teams are struggling to integrate AI into day-to-day work, starting with how you run spikes is one of the fastest ways to feel the difference.
