Collecting Good Baseline Data - When and Why?

A baseline is the data you collect before an intervention - for example, recording how many times a student calls out during independent work across five sessions before introducing a differential reinforcement procedure.

Early in my career, I was working with a team that was struggling. The behaviour was frequent, the staff were stressed, and I had a pretty clear picture of what was driving it. I had a few strategies I was confident would help - not a full behaviour plan, but enough to make the next few days more manageable while I finished the assessment.

So I gave them the strategies.

My supervisor told me I shouldn’t have. It would contaminate the baseline data.

That answer bothered me. It still does.

What a Baseline Is Actually For

A baseline exists to answer a specific question: was this behaviour already changing before we intervened?

If a behaviour was already decreasing on its own, we can’t claim our intervention caused the reduction. If it was stable or increasing, and then we intervened and it dropped, that’s meaningful. The baseline is the reference point that makes the comparison possible.

That’s a legitimate scientific purpose. Baselines matter.
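To make the "was it already changing?" question concrete, here is a minimal sketch that fits a straight line to per-session counts and reports the direction of the baseline trend. The function names, the example counts, and the tolerance of 0.5 responses per session are all illustrative assumptions, not clinical standards. A real decision would rest on visual analysis and judgment, not a single cutoff.

```python
def slope(values):
    """Least-squares slope of values against session index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two sessions")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def describe_baseline(counts, tolerance=0.5):
    """Label the baseline trend. `tolerance` is an arbitrary, assumed cutoff
    (change in count per session) below which the trend is called stable."""
    s = slope(counts)
    if s < -tolerance:
        return "decreasing"
    if s > tolerance:
        return "increasing"
    return "stable"


# Hypothetical call-out counts across five baseline sessions.
steady_sessions = [12, 11, 13, 12, 12]
already_dropping = [20, 17, 15, 12, 10]

print(describe_baseline(steady_sessions))
print(describe_baseline(already_dropping))
```

If the second pattern showed up before any intervention, a later drop could not be credited to the intervention alone.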

But in practice, they get treated as something more than that - as a kind of procedural purity, a waiting period you have to earn before you’re allowed to do anything. And that’s where it goes wrong.

The Ethics of Waiting

Our field talks a lot about timely, individualized, and least-restrictive support. We have ethics codes that speak to client welfare. We’re supposed to be acting in the best interests of the people we serve.

So when we have a strategy that might help - even marginally - and we withhold it because we want clean data, we’re making a trade-off that should at least be named explicitly.

We’re accepting some amount of ongoing difficulty for the people in that environment - the client, the team - in exchange for a cleaner data picture.

Sometimes that trade-off makes sense. If the behaviour is low-frequency and low-risk, a proper baseline is worth the wait. If the assessment genuinely requires uncontaminated observation, that has to be weighed carefully.

But “it’ll contaminate the baseline” can also function as a reflexive answer - a protocol answer rather than a reasoned one. And when a team is struggling right now, the protocol answer isn’t always good enough.

The Measurement Problem Behind the Baseline Problem

Here’s what I think is actually happening in a lot of these conversations: the insistence on long, stable baselines is compensating for a blunt measurement tool.

If your data system gives you a percentage or a count per session, and you’re trying to determine whether behaviour changed meaningfully after an intervention, you need enough pre-intervention data points to feel confident in the trend. A handful of variable sessions won’t tell you much. So you collect more. And more. And call it science.

But if your measurement captures rate of change over time - celeration, in Precision Teaching terms - the picture looks different. You can see, directly on the chart, what the trend was doing before your phase change line and what it did after. The before-and-after story is in the data, not inferred from a comparison to a long baseline window.

This is part of what makes the Standard Celeration Chart a more honest tool than a simple line graph. It doesn’t just show you where a learner is. It shows you how fast they’re moving and in which direction. A phase change mid-chart doesn’t ruin the story - it’s part of the story.

Better measurement doesn’t make baselines irrelevant. It makes the question less urgent, less fraught, and less likely to become a reason to delay support.
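For readers who want to see what celeration looks like as a number, here is a rough sketch. It assumes celeration is taken as the multiplicative change in frequency per week, estimated from a least-squares line through the base-10 logarithms of the daily frequencies. The function names and the data are hypothetical, and the sketch does not handle zero counts, which need a separate convention on a real chart.

```python
import math


def celeration_per_week(days, frequencies):
    """Weekly multiplicative change in frequency.

    Fits a least-squares line to log10(frequency) against day number,
    then converts the per-day slope into a factor per 7 days.
    Assumes every frequency is positive (zero counts are not handled).
    """
    if len(days) != len(frequencies) or len(days) < 2:
        raise ValueError("need matching lists with at least two data points")
    if any(f <= 0 for f in frequencies):
        raise ValueError("frequencies must be positive to take a logarithm")
    logs = [math.log10(f) for f in frequencies]
    mean_d = sum(days) / len(days)
    mean_l = sum(logs) / len(logs)
    num = sum((d - mean_d) * (l - mean_l) for d, l in zip(days, logs))
    den = sum((d - mean_d) ** 2 for d in days)
    if den == 0:
        raise ValueError("days must not all be the same")
    return 10 ** (7 * num / den)


def label(celeration):
    """Format as a times or divide-by factor (ASCII 'x' and '/' stand in
    for the multiply and divide signs usually written on the chart)."""
    if celeration >= 1:
        return f"x{celeration:.1f} per week"
    return f"/{1 / celeration:.1f} per week"


# Hypothetical frequencies (responses per minute) on either side of a phase change.
before_days, before_freq = [0, 1, 2, 3, 4], [2.0, 2.0, 2.0, 2.0, 2.0]
after_days, after_freq = [7, 14, 21], [2.0, 1.0, 0.5]

print(label(celeration_per_week(before_days, before_freq)))
print(label(celeration_per_week(after_days, after_freq)))
```

The point of the two calls is that each phase gets its own trend value, so the comparison is between rates of change and not between two averages.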

A More Useful Frame

The baseline question is really three questions that often get collapsed into one:

1. Do we understand what’s maintaining the behaviour? This is an assessment question. You need enough observation to form a hypothesis. That might require a baseline period, or it might not - if the function is clear from context, from informant data, from a brief observation, you don’t always need weeks of formal data collection before you have a working hypothesis.

2. Will we be able to read what’s happening as we go? This is a measurement question, but it’s not about pre-intervention data. Behaviour change isn’t a before/after snapshot - it’s a continuous process. If your intervention is working, you’ll see a trend in the intended direction (down for a behaviour you’re reducing, up for a skill you’re building) that develops and sustains over time. If it isn’t, the trend will tell you that too. The question isn’t whether you have enough baseline points to prove something later. It’s whether your measurement system is sensitive enough to read what’s actually happening right now, session by session.

3. Are there supports we can offer right now, before the plan is finalized? This is an ethics and clinical judgment question. The answer isn’t always yes, but it’s not always no either. A struggling team, a high-frequency behaviour, a strategy you’re confident in - that’s a context where waiting for baseline purity can be a harder call to defend than it appears.

Treating these as three separate questions - rather than one procedural answer - is where better practice starts.

The Bottom Line

Baselines are tools. They answer a specific question about whether behaviour was already changing before you stepped in. That’s worth knowing.

But they’re not sacred. They’re not an ethical shield. And they’re not a substitute for clinical judgment about when someone needs support.

The right time to intervene is informed by what you know, what you can measure, and what the people in front of you need right now. Sometimes that’s after a clean baseline. Sometimes it’s not.

A baseline is a tool. Use it like one.