Testing the tests

It’s been radio silence here for a couple weeks. Sorry about that. The truth is, all of my projects are coming to a head between now and mid-September, and there’s just not been any time. Today I thought I’d give a quick update about one of the projects I haven’t mentioned much, and which I think will probably make a big splash later this year.

Obviously, I’m a fan of standards. I think we have good evidence that students’ opportunities to learn are all too often driven more by where they live or which teacher they happened to draw than by what we know (or at least think we know) about what they need to know in order to be successful in life. These opportunity-to-learn gaps exist not just across states, but within states, within districts, within schools, and very likely within classrooms. I also think as a practical matter that it’s remarkably inefficient for there to be 3 million unique interpretations of what’s important for kids to know. So I think standards are an important starting point for solving some of these problems.

Of course, I also think curriculum matters, and that’s why I’m studying textbooks and talking with teachers and district leaders about how they’re thinking about curriculum in this brave new world brought to us by the interwebs. Curricula bring standards to life and help concretize what can otherwise be sometimes frustratingly abstract language in standards. My hope is that students have equal access to good curriculum materials that offer faithful interpretations of the standards–whether that is the case remains to be seen.

The third leg of this little instructional tripod is the tests. The tests are intended to reinforce the content messages of the standards and to give teachers and parents accurate feedback about students’ performance. They’re also often used to make decisions about schools, (slightly less often) teachers, and (much less often) students. Now, we’ve known for quite a while that our tests weren’t that good. They’ve been built on the cheap to test low-level skills, using primarily (or exclusively) multiple-choice items. And they haven’t offered useful feedback to anyone because results have generally arrived too late. The result is that the tests we’ve had have undermined the standards, rather than supporting them, leading to more reductionist responses from educators.

There are promising signs that the tests are looking better. The federal government pumped a large amount of money into the consortia, and both PARCC and SBAC brought on the best of the best to help them build better tests. At the same time, other experienced players have gotten into the game, such as ACT. These tests are competing against each other, and they’re also competing against the best of the old state tests, such as Massachusetts’s MCAS. And they’re doing it for little to no more money than the mediocre state tests we had for years. While everything I’ve heard and read suggests these tests are indeed a pretty substantial step forward from what they’re replacing, states are in full retreat mode, tossing them aside left and right. To date, while there have been promising hints about the new tests, there just hasn’t been the kind of deep analysis of them you might have hoped for if your goal was ensuring the best evidence about the quality of the tests got into the hands of policymakers.

It’s in this context that I’m working with the Thomas B. Fordham Institute and HumRRO on a study evaluating these four assessments–PARCC, SBAC, ACT Aspire, and MCAS. We’re bringing together expert reviewers (educators and content experts) from around the country starting tomorrow for an intensive review of these tests’ content (their actual forms!), documentation, transparency, and accessibility. Later evaluations will examine their technical psychometric evidence. Our methodology was developed by the Center for Assessment and reviewed extensively by the project teams and by experts in measurement and assessment over the last year. It is based on the CCSSO’s Criteria for High Quality Assessments.

I’m so excited for this study starting tomorrow, not just because I’ll get my hands on real student test forms in a way that few folks have been fortunate enough to do, but also because it’ll be the first study to directly compare these tests to each other and against the research-based framework for what a good test should look like. I also happen to think the report, whatever it finds, is going to be useful to policymakers in states nationwide.

In the end, I’m sure that none of these tests will come out looking perfect. It’s my hope that they’ll all be strong along most dimensions so that we can say to states “these would be good choices if your goal was giving students a fair test that adequately covered the standards and gave teachers the right kinds of instructional messages.” If they end up looking no better than what we had before, it will further erode the already tenuous support the standards have among educators and the public, and it will likely do serious damage to the hope that a standards-based reform can really improve opportunity to learn for our kids.

