Friends don’t let friends misuse NAEP data

At some point in the next few weeks, the results from the 2015 administration of the National Assessment of Educational Progress (NAEP) will be released. I can all but guarantee you that the results will be misused and abused in ways that scream misNAEPery. My warning in advance is twofold. First, do not misuse these results yourself. Second, do not share or promote the misuse of these results by others who happen to agree with your policy predilections. This warning applies of course to academics, but also to policy advocates and, perhaps most importantly of all, to education journalists.

Here are some common types of misused or unhelpful NAEP analyses to look out for and avoid. I think this is pretty comprehensive, but let me know in the comments or on Twitter if I’ve forgotten anything.

  • Pre-post comparisons involving the whole nation or a handful of individual states to claim causal evidence for particular policies. This approach is used by both proponents and opponents of current reforms (including, sadly, our very own outgoing Secretary of Education). Simply put, while it’s possible to approach causal inference using NAEP data, that’s not accomplished by taking pre-post differences in a couple of states and calling it a day. You need to have sophisticated designs that look at changes in trends and levels and that attempt to poke as many holes as possible in their results before claiming a causal effect.
  • Cherry-picked analyses that focus only on certain subjects or grades rather than presenting the complete picture across subjects and grades. This is most often employed by folks with ideological agendas (using 12th grade data, typically), but it’s also used by prominent presidential candidates who want to argue their reforms worked. Simply put, if you’re going to present only some subjects and grades and not others, you need to offer a compelling rationale for why.
  • Correlational results that look at levels of NAEP scores and particular policies (e.g., states that have unions have higher NAEP scores, states that score better on some reformy charter school index have lower NAEP scores). It should be obvious why correlations of test score levels are not indicative of any kind of causal effect, given the tremendous demographic and structural differences across states that can’t be controlled in these naïve analyses.
  • Analyses that simply point to low proficiency levels on NAEP (spoiler alert: in every subject and grade, the results will show that many kids are not proficient) to say that we’re a disaster zone and a) the whole system needs to be blown up or b) our recent policies clearly aren’t working.
  • (Edit, suggested by Ed Fuller) Analyses that primarily rely on percentages of students at various performance levels, instead of using the scale scores, which are readily available and provide much more information.
  • More generally, “research” that doesn’t even attempt to account for things like demographic changes in states over time (hint: these data are readily available, and analyses that account for demographic changes will almost certainly show more positive results than those that do not).
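The demographic point in that last bullet is easy to see with a toy calculation. The numbers below are invented for illustration (they are not real NAEP results): if every subgroup’s average score rises between waves but enrollment shifts toward lower-scoring subgroups, the unadjusted state average can actually fall, even though every group improved.

```python
# Hypothetical illustration with made-up numbers, not real NAEP data:
# each subgroup gains 3 scale-score points between waves, but a shift
# in subgroup shares makes the naive state-average change negative.

def weighted_mean(shares, means):
    """Average of subgroup means, weighted by subgroup shares."""
    return sum(s * m for s, m in zip(shares, means))

# wave 1: two subgroups' shares and mean scale scores
shares_2013 = [0.7, 0.3]
means_2013 = [260.0, 230.0]

# wave 2: both subgroups improve by 3 points, but shares shift
shares_2015 = [0.5, 0.5]
means_2015 = [263.0, 233.0]

# naive pre-post difference in the overall average
naive_change = (weighted_mean(shares_2015, means_2015)
                - weighted_mean(shares_2013, means_2013))

# composition-adjusted change: hold subgroup shares at the 2013 mix
adjusted_change = (weighted_mean(shares_2013, means_2015)
                   - weighted_mean(shares_2013, means_2013))

print(round(naive_change, 1))     # -3.0: looks like a decline
print(round(adjusted_change, 1))  # +3.0: every subgroup actually gained
```

Holding subgroup shares fixed at the baseline mix is the simplest form of composition adjustment, and it is the sense in which analyses that account for demographic change will almost certainly look more positive than those that don’t.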

Having ruled out all of your favorite kinds of NAEP-related fun, what kind of NAEP reporting and analysis would I say is appropriate immediately after the results come out?

  • Descriptive summaries of trends in state average NAEP scores, not just across two NAEP waves but across multiple waves, grades, and subjects. These might be used to generate hypotheses for future investigation but should not (ever (no really, never)) be used naively to claim some policies work and others don’t.
  • Analyses that look at trends for different subgroups and the narrowing or closing of gaps (while noting that some of the category definitions change over time).
  • Analyses that specifically point out that it’s probably too early to examine the impact of particular policies we’d like to evaluate and that even if we could, it’s more complicated than taking 2015 scores and subtracting 2013 scores and calling it a day.

The long and the short of it is that any stories that come out in the weeks after NAEP scores are released should be, at best, tentative and hypothesis-generating (as opposed to definitive and causal effect-claiming). And smart people should know better than to promote inappropriate uses of these data, because folks have been writing about this kind of misuse for quite a while now.

Rather, the kind of NAEP analysis that we should be promoting is the kind that’s carefully done, that’s vetted by researchers, and that’s designed in a way that brings us much closer to the causal inferences we all want to make. It’s my hope that our work in the C-SAIL center will be of this type. But you can bet our results won’t be out the day the NAEP scores hit. That kind of thoughtful research designed to inform rather than mislead takes more than a day to put together (but hopefully not so much time that the results cannot inform subsequent policy decisions). It’s a delicate balance, for sure. But everyone’s goal, first and foremost, should be to get the answer right.

Not playing around on play

This weekend’s hot opinion piece was the New York Times’ “Let the Kids Learn through Play,” by David Kohn. This piece set up the (fairly tired) play vs. academics dichotomy, citing a panoply of researchers and advocates who believe that kindergarten has suddenly become more academic (and Common Core is at least partly to blame).

There will undoubtedly be many takedowns of this piece. An early favorite is Sherman Dorn’s, which notes the ahistorical nature of Kohn’s argument. Another critique I noticed going around the Twittersphere centered on the fact that there’s far more variation in kindergarten instruction among classrooms than there is between time periods (almost undoubtedly true, though I don’t have a link handy).

Early childhood is not my area, so I can’t get too deep on this one, but I did have a few observations.

  1. I think the evidence is reasonably clear at this point that kindergarten is becoming more “academic.” Daphna Bassok has shown this using nationally representative data, and I have found it in my own analyses as well. This means both that kids are spending a greater proportion of their time on academic subjects, and also that instruction within subjects is becoming more concentrated on more “traditional” approaches (e.g., whole class, advanced content) and less concentrated on more student-directed approaches.
  2. Any time you read an op-ed and you think “if they just flipped the valence on all these quotes, I bet they could find equally prominent researchers who’d support them,” you know you don’t have an especially strong argument. To put it mildly, my read of this literature is that it is far more contested than is described here. For instance, Mimi Engel and colleagues have several studies demonstrating that some of the advanced instructional content that comes under fire in Kohn’s piece and from the anti-academic-kindergarten crowd is the content most associated with greater student learning and longer-term success. Now, that doesn’t mean there might not be some tradeoffs (though I’d like to see those demonstrated before I’m willing to acknowledge them), but the literature is clearly not as one-sided as was portrayed here (and it may even be that the bulk of the quality evidence falls on the other side of this argument).
  3. As Sherman points out, this is also a ridiculous false dichotomy that is quite unhelpful. I don’t think anyone envisions kindergarten classes where students are mindless drones, drilling their basic addition facts all day long. Rather, many believe, and I think evidence suggests, that kindergarten students can handle academic content and that early development of academic skills can have long-lasting effects. Daphna perhaps put it best in an EdWeek commentary (you should read the whole thing if you haven’t already):

Our own research shows that children get more out of kindergarten when teachers expose them to new and challenging academic content. We are not arguing that most kindergartners need more exposure to academic content. At the same time, exposure to academic content should not be viewed as inherently at odds with young children’s healthy development.

I think this is exactly the right view, and one that was missed in the Times over the weekend.

[1] Save that link! I’m sure if you change the policy under question you can apply the text almost verbatim to most education op-eds.