My visit to Success Academies

On Wednesday I had the pleasure of visiting Success Academy Harlem 1 and hearing from Eva Moskowitz and the SA staff about their model. I’m not going to venture into the thorny stuff about SA here. What I will say is that their results on state tests are clearly impressive, and that I doubt that they’re fully (or even largely) explained by the practices that cause controversy (and luckily we’ll soon have excellent empirical evidence to answer that question).

Instead, what I’m going to talk about briefly are the fascinating details I saw and heard about curriculum and instruction in SA schools. Of course, right now it is impossible to know what’s driving their performance, but these are some of the things I think are likely to contribute. [EDIT: I’d forgotten Charles Sahm wrote many of these same things in a post this summer. His is more detailed and based on more visits than mine. Read it!]

What I saw in my tour of about a half-dozen classrooms at SA 1:

  • The first thing I noticed was the intense focus on student discourse and explanation. In each classroom, students are constantly pressed to explain their reasoning, and other students respond constructively and thoughtfully to the arguments of their peers. This “pressing for mastery” is one of the key elements of the SA vision of excellence, as I later learned.
  • Students are incredibly organized and on-task. They sit quietly while others are speaking and then, when prompted by the teacher to begin discussion in pairs, they immediately turn and address the question at hand. I saw virtually no goofing off or inattention in the classes I observed, including in a Pre-K classroom. To facilitate this structure and organization, teachers used lots of timers; everything was timed, starting and stopping in exactly the amount of time the teacher indicated.
  • The actual math content I observed being taught was clearly on-grade according to Common Core. In a third grade classroom I saw students working on conceptual explanations of fraction equivalence for simple cases (2/3 = 4/6); this comes right out of the third grade standards. I later learned that there is a strong focus on both problem-solving ability and automaticity in SA classrooms.
  • We were walking around with the school’s principal, and it was clear that she spends a great deal of her time moving in and out of classrooms observing. More than a passive observer, she interjected with pedagogical suggestions for the teacher in almost every class we visited. The teachers all seemed used to this kind of advice, and they implemented it immediately.

What I heard from Eva and her staff about curriculum and instruction in SA schools:

  • The curricula they use are all created in-house. They evaluated a bunch of textbooks in each subject and found them all wanting, so they created their own materials.
  • The math materials are influenced by TERC and Contexts for Learning. They do not use off-the-shelf math textbooks because they find them all highly redundant (something I’ve found in the context of instruction as well); the apparent assumption from publishers is that kids won’t get it the first time (this was described as signifying publishers’ “low expectations”).
  • The ELA materials are based on close reading and analysis, and have been since the first SA school opened in 2006. The goals I heard were for students to 1) love literature and want to read, and 2) be able to understand what they’re reading. These goals are accomplished by a good deal of guided close reading instruction, child-chosen books (every classroom had a beautiful and well stocked library), and daily writing and revising in class. There seemed to be a clear and strong opposition to “skills-based” reading instruction.
  • The only off-the-shelf material they use in ELA and mathematics is Success for All’s phonics curriculum, which is used in grades K and 1.
  • Every kid in elementary grades gets inquiry-based science instruction every day. They have dedicated science teachers for this. They also get art, sports, and chess in the elementary grades.
  • The curriculum is uniform across the schools in the network. Every teacher teaches the same content on the same day. The lessons are not scripted, however. The curricula are revised at the network level every year.
  • A typical lesson is 10 minutes of introduction with students on the floor, some of which will be teacher lecture and some of which will be discussion; 30 minutes of students working individually or with partners; and 10 minutes of wrap up and additional discourse. The goal for the whole day is less than 80 minutes of direct instruction.
  • Teachers get tons of training, and the training is largely oriented toward curriculum and instruction. They also get 2 periods of common planning time with other grade-level teachers per day, and an afternoon to work together on planning and training.
  • The new New York state math test was much derided as too easy and not actually indicating readiness for success in high school and beyond.
  • There is not nearly as much of a testing and data-driven culture as I expected in this kind of school. Testing seems to legitimately be a means to an end, and I didn’t get the sense that lots of instructional time was used up in testing. Rather, judgments about student readiness seemed to be largely qualitative.
  • The only tracking that currently happens in network schools is in mathematics starting in middle school, where there are two tracks (regular and advanced).

So, that’s what I saw and what I heard. From a C&I standpoint, the things that really stood out to me were a) the organization, which made things flow smoothly and diminished distractions, b) the common content across classrooms (created by network staff and teachers), coupled with time to plan and share results, c) the involvement of the school leader in constantly observing instruction, and d) the frankly much more “progressive” and “student-led” approach to instruction than I had envisioned.

It was a fascinating experience that I hope others can have.

Hufflin Muffin (or the craziness of textbook data)

Sorry for the absence; it’s been a crazy month.

The main work keeping me away from here continues to be my research on school districts’ textbook adoptions. Recently, we had a nice breakthrough in Texas, where we discovered that the state keeps track of districts’ textbook purchases through disbursement and requisition reports. Great news! A couple FOIAs later and we were in business.

But like everything in the textbook research business, each step forward leads to two steps back.

The disbursement dataset we got from the state contains information on the publishers of the textbooks purchased by each Texas school district. Not the titles themselves, but the publishers. You’d think that a dataset like this might be standardized in some way. After all, there are only so many publishers out there. And if you’re going to the trouble of collecting the data as a state, you might want to have the data be easily usable (either by you or by researchers).

Well, you’d be wrong. Very, very wrong. My student, Shauna Campbell, has been cleaning up these data. Below I have copied the list of different entries in the “publishers” variable that we believe correspond to Houghton Mifflin Harcourt. There are 313 as of today’s counting, and we expect this number to go up. I’ve bolded some of our personal favorite spellings.

What is the point of collecting data like this if it’s going to be so messy as to be almost unusable?

·       Hooughton Mifflin
·       Houfhron Mifflin Hartcourt
·       Houfhton Mifflin Harcourt
·       Houg
·       HOUG
·       hough
·       Hough Mifflin harcourt
·       Houghjton Mifflin Harcourt
·       Houghlin Mifflin Harcourt
·       Houghlton Mifflin Harcourt
·       Houghnton Mifflin Harcourt
·       Houghon Mifflin
·       Houghon Mifflin Harcourt
·       Houghrton Mifflin Harcourt
·       Hought Mifflin Hardcourt
·       Hought on Mifflin Hacourt
·       Houghtion Mifflin Harcourt
·       Houghtlon Mifflin Harcourt
·       Hought-Mifflin
·       Houghtn Mifflin
·       Houghtn Mifflin Harcourt
·       Houghtn Miffling Harcourt
·       Houghtno Mifflin Harcourt
·       HOUGHTO MIFFLIN HARCOURT
·       Houghto Mifflin Harcourt
·       Houghtob Mifflin
·       HOUGHTOM MIFFLIN
·       Houghtom Mifflin
·       Houghtom Mifflin Harcourt
·       HOUGHTOM MIFFLIN HARCOURT
·       Houghtomn Mifflin Harcourt
·       Houghton
·       HOUGHTON MIFFLIN
·       Houghton Mifflin
·       Houghton Mifflin Harcourt
·       Houghton MIfflin Harcourt
·       Houghton Mifflin Hardcort
·       Houghton & Mifflin
·       Houghton Harcourt Mifflin
·       Houghton Hiffin Harcourt
·       Houghton Hifflin
·       Houghton Hifflin Harcourt
·       Houghton Lifflin
·       Houghton McDougal
·       Houghton Mfflin Harcourt
·       Houghton Mfifflin
·       Houghton Miffen Harcourt
·       Houghton Miffflin Harcourt
·       Houghton Miffiin Harcourt
·       Houghton Miffilin
·       Houghton Miffilin Harcourt
·       Houghton Miffiln Harcourt
·       Houghton Miffin
·       Houghton Miffin Harcourt
·       HOUGHTON MIFFIN HARCOURT
·       Houghton Miffin Harcourt/Saxon Publis..
·       Houghton Mifflan
·       Houghton Mifflan Harcourt
·       Houghton Mifflein Harcourt
·       Houghton Miffliin Harcourt
·       HOUGHTON MIFFLIIN HARCOURT
·       Houghton Miffliln
·       Houghton Miffliln Harcouirt
·       Houghton Miffliln Harcourt
·       HOUGHTON MIFFLIM
·       houghton mifflin
·       Houghton MIfflin
·       houghton Mifflin
·       Houghton mifflin
·       HOughton Mifflin
·       Houghton- Mifflin
·       Houghton Mifflin – Grade 1
·       Houghton Mifflin – Grade 2
·       Houghton Mifflin – Grade 3
·       Houghton Mifflin – Grade 4
·       Houghton MIfflin – Grade 4
·       Houghton Mifflin – Grade 5
·       Houghton Mifflin – Grade 6
·       Houghton MIfflin – Grade 6
·       Houghton Mifflin – Grade 7
·       Houghton Mifflin – Grade 8
·       Houghton Mifflin – Grade K
·       Houghton Mifflin Harcourt
·       HOUGHTON MIFFLIN HARCOURT
·       Houghton Mifflin / Great Source
·       Houghton Mifflin and Harcourt
·       Houghton Mifflin Co
·       Houghton Mifflin Co.
·       Houghton Mifflin College Dic
·       Houghton Mifflin College Div
·       Houghton Mifflin Company
·       houghton mifflin company
·       HOUGHTON MIFFLIN COMPANY
·       Houghton Mifflin from Follett
·       Houghton Mifflin Geneva, IL 60134
·       Houghton Mifflin Grade 8
·       Houghton Mifflin Grt Souce ED Grp
·       Houghton Mifflin Hacourt
·       Houghton Mifflin Haecourt/Holt McDougal
·       Houghton Mifflin Haracourt
·       HOUGHTON MIFFLIN HARCCOURT
·       Houghton Mifflin Harccourt
·       Houghton Mifflin Harcocurt
·       Houghton Mifflin Harcort
·       HOUGHTON MIFFLIN HARCORT
·       Houghton Mifflin Harcount
·       Houghton Mifflin Harcour
·       Houghton Mifflin Harcourft
·       Houghton mifflin Harcourt
·       HOughton Mifflin Harcourt
·       Houghton Mifflin harcourt
·       Houghton mifflin harcourt
·       houghton Mifflin Harcourt
·       houghton Mifflin harcourt
·       Houghton Mifflin HArcourt
·       HOUGHTON Mifflin Harcourt
·       houghton mifflin Harcourt
·       Houghton Mifflin Harcourt – 9791300126
·       Houghton Mifflin Harcourt – 9791300173
·       Houghton Mifflin Harcourt – 9791300184
·       Houghton Mifflin Harcourt – Great Sou..
·       Houghton Mifflin Harcourt — Saxon
·       Houghton Mifflin Harcourt (Saxon)
·       Houghton Mifflin Harcourt (Steck Vaug..
·       Houghton Mifflin Harcourt (TEXTBOOK W..
·       Houghton Mifflin Harcourt / Holt McDo..
·       Houghton Mifflin Harcourt / Rigby
·       Houghton Mifflin Harcourt 9205 S. Par..
·       Houghton Mifflin Harcourt Achieve Pub..
·       Houghton Mifflin Harcourt Co
·       Houghton Mifflin Harcourt Co.
·       Houghton Mifflin Harcourt Depository
·       Houghton Mifflin Harcourt Great Source
·       Houghton Mifflin Harcourt -Holt
·       Houghton Mifflin Harcourt- Holt McDou..
·       Houghton Mifflin Harcourt- Holt McDoug
·       Houghton Mifflin Harcourt Holt McDoug..
·       Houghton Mifflin Harcourt Holt McDougal
·       Houghton Mifflin Harcourt Mifflin Har..
·       HOUGHTON MIFFLIN HARCOURT PUBLISHING
·       Houghton Mifflin Harcourt Publishing
·       Houghton Mifflin Harcourt Publishing ..
·       HOUGHTON MIFFLIN HARCOURT PUBLISHING ..
·       Houghton Mifflin Harcourt Publishing ..
·       HOUGHTON MIFFLIN HARCOURT PUBLISHING CO
·       Houghton Mifflin Harcourt Publishing Co
·       Houghton Mifflin Harcourt Rigby
·       Houghton Mifflin Harcourt Riverside
·       Houghton Mifflin Harcourt Saxon
·       Houghton Mifflin Harcourt School Publ..
·       Houghton Mifflin Harcourt Texas
·       Houghton Mifflin Harcourt/ Holt McDou..
·       Houghton Mifflin Harcourt/Great Source
·       Houghton Mifflin Harcourt/Holt
·       Houghton Mifflin Harcourt/Holt McDougal
·       Houghton Mifflin Harcourt/Saxon Publi..
·       Houghton Mifflin Harcourt-9791300126
·       Houghton Mifflin Harcourte
·       HOUGHTON MIFFLIN HARCOURTE
·       Houghton Mifflin HarcourtH
·       Houghton Mifflin HarcourtMH
·       Houghton Mifflin Harcourtq
·       Houghton Mifflin Harcourt-SHIPPING Co..
·       Houghton Mifflin Harcout
·       HOUGHTON MIFFLIN HARCOUT
·       Houghton Mifflin Harcpurt
·       Houghton Mifflin Harcuort
·       Houghton Mifflin Harcurt
·       Houghton Mifflin Hardcourt
·       houghton mifflin hardcourt
·       houghton Mifflin Hardcourt
·       Houghton Mifflin Harocurt
·       Houghton Mifflin Harourt
·       Houghton Mifflin Harrcourt
·       Houghton Mifflin Hart Court
·       Houghton Mifflin Hartcourt
·       HOUGHTON MIFFLIN HARTCOURT
·       Houghton Mifflin Hartcourt Brace
·       Houghton Mifflin Holt Physics
·       Houghton Mifflin Holt Seventh Math
·       Houghton Mifflin Holt Sixth
·       Houghton Mifflin Holt Biology
·       Houghton Mifflin Holt Eighth
·       Houghton Mifflin Holt Eighth Math
·       Houghton Mifflin Holt Fifth
·       Houghton Mifflin Holt First
·       Houghton Mifflin Holt Fourth
·       Houghton Mifflin Holt IPC
·       Houghton Mifflin Holt Kindergarten
·       Houghton Mifflin Holt McDougal
·       Houghton Mifflin Holt Modern Chemistry
·       Houghton Mifflin Holt Second
·       Houghton Mifflin Holt Seventh
·       Houghton Mifflin Holt Sixth Math
·       Houghton Mifflin Holt Third
·       Houghton Mifflin Publishing
·       HOUGHTON MIFFLIN PUBLISHING COMPANY
·       Houghton Mifflin Publishing Company
·       Houghton Mifflin School
·       Houghton Mifflin Science
·       Houghton Mifflin, Eds.
·       Houghton Mifflin, Indianapolis, IN 46..
·       Houghton Mifflin/Harcourt
·       Houghton Mifflin/Harcout
·       Houghton Mifflin/Holt McDougal
·       Houghton Mifflin/McDougal
·       Houghton Miffline
·       Houghton Miffline Harcourt
·       Houghton Miffline/Harcourt
·       Houghton Miffling Harcourt
·       Houghton MIffling Harcourt
·       Houghton Mifflin-Great Source
·       Houghton Mifflin-Great Source Rigby
·       Houghton MifflinHarcourt
·       HOUGHTON MIFFLINHARCOURT
·       Houghton MifflinHArcourt
·       houghton MifflinHarcourt
·       Houghton Mifflin-Harcourt
·       Houghton Mifflin-Holt McDougal
·       Houghton Mifflini Harcourt
·       Houghton Mifflinn Harcourt
·       Houghton Mifflin-using overage
·       Houghton Miffllin
·       Houghton Miffllin Harcourt
·       Houghton Miffln
·       Houghton Miffln Harcourt
·       Houghton Miflfin Harcourt
·       Houghton Miflin
·       Houghton Miflin Harcourt
·       HOUGHTON MIFLIN HARCOURT
·       Houghton Mifllin Harcourt
·       Houghton Migglin Harcourt
·       Houghton Miifflin Harcourt
·       Houghton Millflin
·       Houghton Mimfflin Harcourt
·       Houghton Misslin
·       Houghton Mofflin
·       Houghton Mufflin
·       Houghton Mufflin Company
·       Houghton Mufflin Harcourt
·       Houghton, Mifflin Harcort
·       Houghton, Mifflin HArcort
·       Houghton, Mifflin Harcourt
·       Houghton, Mifflin, and Harcourt
·       Houghton, Mifflin, Harcourt
·       Houghton-Miffliin Harcourt
·       HOUGHTON-MIFFLIN
·       Houghton-MIfflin
·       Houghton-Mifflin Co.
·       Houghton-Mifflin Company
·       Houghton-Mifflin Great Source Rigby
·       HoughtonMifflin Harcourt
·       Houghton-Mifflin Harcourt Senderos
·       Houghton-Mifflin/Great Source Rigby
·       Houghton-Mifflin/Harcourt
·       Houghton-Miffline Harcourt
·       HoughtonMifflinHarcourt
·       Houghton-Mifflin-Harcourt
·       Houghton-Mifflin-Harcourt;BMI;Lakesho..
·       HoughtonMifflinHarcourt-Holt McDougal
·       HoughtonMifflinHarcourt-Holt-McDougal
·       Houghton-Miflin
·       HOUGHTON-MUFFLIN
·       Houghtopn MIfflin Harcourt
·       HoughtotnMifflin
·       Houghtton Miffin Harcourt
·       Houghtton Mifflin Harcourt
·       Houghyon Mifflin Harcourt
·       Hougnton Mifflin Haracourt
·       hougnton Mifflin Harcourt
·       Hougthon MIfflin
·       Hougthon Mifflin Harcourt
·       HOUGTHON MIFFLIN HARCOURT
·       Hougthon Mifflin harcourt
·       hougthon mifflin harcourt
·       Hougthon Mifflin Haroourt
·       Hougton
·       Hougton Mifflin
·       Hougton Mifflin Harcourrt
·       HOUGTON MIFFLIN HARCOURT
·       Hougton-Mifflin
·       Hougtton Mifflin Harcourt
·       Houhgton Mifflin Harcourt
·       Houhgton MIfflin Harcourt
·       Houhton Mifflin
·       Houhton Mifflin Harcourt
·       Houhton Mifflin Harcourt Saxon
·       Houlghton Mifflin Harcourt
·       Houlgthton Mifflin Harcourt
·       Houlgton Mifflin Harcourt
·       Hourghton Mifflin Harcourt
·       Hourhton Mifflin Harcourt
·       Houston Mifflin
·       Houtgton Mifflin Harcourt
·       Houthgon Mifflin Harcourt
·       Houthton Mifflin Harcourt
·       houthton mifflin harcourt
·       Houton-Mifflin Company
·       Hpoughton Mifflin Harcourt
·       Hpughton Mifflin Harcourt
·       Hughton Mifflin Harcourt
·       Huoghton Mifflin Harcourt
·       Harcourt Mifflin Harcourt
·       Harcourt/Houghton Mifflin
·       Saxon (Houghton Mifflin Harcourt)
·       Saxon / Houghton Mifflin 9205 S. Park..
·       Saxon HMH
·       Saxon Houghton Mifflin
·       Saxon Houghton Mifflin Harcourt
·       Saxon/Houghton Mifflin 9502 Sl Park C..
·       Saxon/Houghton Mifflin Harcourt
·       Saxon-H.M.H. refer to D000030293
·       Saxon-Houghton Mifflin
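Reconciling a mess like this has to happen programmatically before the data are usable at all. As a rough sketch of what a first pass might look like (the helper names, the 0.75 threshold, and the matching strategy here are my own illustration, not the cleaning procedure we actually used), simple normalization plus fuzzy string matching catches most of the variants above:

```python
from difflib import SequenceMatcher

CANONICAL = "houghton mifflin harcourt"

def normalize(name: str) -> str:
    # Lowercase, turn punctuation into spaces, and collapse whitespace.
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in name.lower())
    return " ".join(cleaned.split())

def looks_like_hmh(raw: str, threshold: float = 0.75) -> bool:
    name = normalize(raw)
    # Catch truncations ("Houg") and imprint-only entries ("Houghton Mifflin").
    if name.startswith("houg") or "mifflin" in name:
        return True
    # Fall back to a fuzzy similarity ratio for heavier misspellings.
    return SequenceMatcher(None, name, CANONICAL).ratio() >= threshold

variants = ["Houghyon Mifflin Harcourt", "HOUGHTON-MUFFLIN",
            "Hought on Mifflin Hacourt", "Pearson Education"]
print([looks_like_hmh(v) for v in variants])  # [True, True, True, False]
```

Even then, abbreviated entries like “Saxon HMH” would need their own rules, which is the real argument for standardizing these data at collection time rather than leaving the cleanup to researchers.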

This study is based upon work supported by the National Science Foundation under Grant No. 1445654 and the Smith Richardson Foundation. Any opinions, findings, and conclusions or recommendations expressed in this study are those of the author(s) and do not necessarily reflect the views of the funders.

My quick thoughts on NAEP

By now you’ve heard the headlines–NAEP scores are down almost across the board, with the exception of 4th grade reading. Many states saw drops, some of them large. Here are my thoughts.

  1. These results are quite disappointing and shouldn’t be sugar-coated. Especially in mathematics, where we’ve seen literally two decades of uninterrupted progress, it’s (frankly) shocking to see declines like this. We’ve become almost accustomed to the slow-but-steady increase in NAEP scores, and this release should shake that complacency. That said, we should not forget the two decades of progress when thinking about this two-year dip (nor should we forget that we still have yawning opportunity and achievement gaps and vast swaths of the population unprepared for success in college or career).
  2. Some folks are out with screeds blaming the results on particular policies that they don’t like (test-based accountability, unions, charters, Arne Duncan, etc.). Regardless of what the actual results were, they’d have made these same points. So these folks should be ignored in favor of actual research.[1] In general, people offering misNAEPery can be only one of two types: (1) people who don’t know any better, or (2) people who know better but are shameless/irresponsible. Generally I would say anyone affiliated with a university who is blaming these results on particular policies at this point is highly likely to be in the latter camp.
  3. To a large extent, what actually caused these results (Common Core? Implementation? Teacher evaluation? Waivers? The economy? Opt-out? Something else?) is irrelevant in the court of public opinion. Perception is what matters. And the perception, fueled by charlatans and naifs, will be that Common Core is to blame. I wouldn’t be surprised if these results led to renewed repeal efforts for both the standards and the assessments in a number of states, even if there is, as yet, no evidence that these policies are harmful.

Overall, it’s a sad turn of events. And what makes it all the more sad is the knowledge that the results will be used in all kinds of perverse ways to score cheap political points and make policy decisions that may or may not help kids. We can’t do anything about the scores at this point. But we can do something about the misuse of the results. So, let’s.


[1] For instance, here are a few summaries I’ve written on testing and accountability, and here’s a nice review chapter. These all conclude, rightly, that accountability has produced meaningful positive effects on student outcomes.

Do the content and quality of state tests matter?

Over at Ahead of the Heard, Chad Aldeman has written about the recent Mathematica study, which found that PARCC and MCAS were equally predictive of early college success. He essentially argues that if all tests are equally predictive, states should just choose the cheapest bargain-basement test, content and quality be damned. He offers a list of reasons, which you’re welcome to read.

As you’d guess, I disagree with this argument. I’ll offer a list of reasons of my own here.

  1. The most obvious point is that we have reasonable evidence that testing drives instructional responses to standards. Thus, if the tests used to measure and hold folks/schools accountable are lousy and contain poor quality tasks, we’ll get poor quality instruction as well. This is why many folks are thinking these days that better tests should include tasks that are much closer to the kinds of things we want kids to actually be doing. In that case, “teaching to the test” becomes “good teaching.” May be a pipe dream, but that’s something I commonly hear.
  2. A second fairly obvious point is that switching to a completely unaligned test would end any possible notion that the tests could provide feedback to teachers about what they should be doing differently/better. Certainly we can all argue that current test results are provided too late to be useful–though smart testing vendors ought to be working on this issue as hard as possible–but if the test is in no way related to what teachers are supposed to be teaching, it’s definitely useless to them as a formative measure.
  3. Chad’s analysis seems to prioritize predictive validity (how well results from the test predict other desired outcomes) over all the other types of validity evidence. It’s not clear to me why we should prefer predictive validity (especially when we already have evidence that GPAs do better at that than most standardized tests, though SAT/ACT adds a little) over, say, content-related validity. Don’t we first and foremost want the test to be a good measure of what students were supposed to have learned in the grade? More generally, I think it makes more sense to have different tests for different purposes, rather than piling all the purposes into a single test.
  4. Certainly if the tests are going to have stakes attached to them, the courts require a certain level of content validity (or what they’ve called instructional validity). See Debra P. v. Turlington. If a kid’s going to be held accountable, they need to have had the opportunity to learn what was on the test. If the test is the SAT, that’s probably not going to happen.

Anyway, take a look at the Mathematica report (you should anyway!) and Chad’s post and let me know what you think.

Testing my patience

PBS is out with a truly awful report on testing/opt out/Common Core. You can watch it here and read one takedown here.

I’m not going to do a full takedown, but I’ll highlight a few points that weren’t made by Will Ragland.

  1. Hagopian says testing is a multi-billion dollar industry. That’s true but overwrought and misleading. We have 50 million kids in school–spend $20 a kid per year and you’re at a billion. Yes, we spend billions on evaluating how well kids are learning. That’s far less than 1% of our total education dollars, in order to offer some evaluation of how our system is doing. Seems like a perfectly reasonable amount to me (if anything, it’s too little, and our limited spending on assessment has resulted in some of the poor quality tests we’ve seen over the years). Saving that <<1% wouldn’t really do anything to reduce class sizes or boost teacher salaries or whatever else Hagopian would like us to do, even if we cut testing expenses to 0.
  2. There’s an almost farcically absurd analogy that testing proponents think a kid with hypothermia just needs to have his temperature taken over and over again, whereas teachers just know to wrap the kid in a blanket. First of all, given horrendous outcomes for many kids, it seems like at least a handful of educators (or, perhaps more accurately, the system as a whole) have neglected their blanketing duties more often than we’d care to note. Second, these test data are used in dozens of ways to help support and improve schools, especially in states that have waivers (which, admittedly, Washington does not).
  3. Complaining about a test-and-punish philosophy in Washington State is pretty laughable, since there’s no exit exam for kids [CORRECTION: there appear to be some new exit exam requirements being rolled out in the state, though students did not opt out of those exams; apologies that I did not catch this earlier; I was referring to old data], no high-stakes teacher evaluation, and less accountability for schools than there was during the NCLB era (though parents did get a letter about their school’s performance …). Who, exactly, is being punished, and how?
  4. Finally, the report lumps together Common Core with all kinds of things that are not related to Common Core, such as the 100+ standardized test argument and the MAP test. Common Core says literally nothing at all about testing, and it certainly doesn’t have anything to do with a district-level benchmark test.

It shouldn’t be asking that much for a respected news organization to get very basic details correct about major education policies that have existed for 4+ years. Instead, we get misleading, unbalanced nonsense that will contribute to the tremendous levels of misinformation we see among voters about education policy.

Friends don’t let friends misuse NAEP data

At some point in the next few weeks, the results from the 2015 administration of the National Assessment of Educational Progress (NAEP) will be released. I can all but guarantee you that the results will be misused and abused in ways that scream misNAEPery. My warning in advance is twofold. First, do not misuse these results yourself. Second, do not share or promote the misuse of these results by others who happen to agree with your policy predilections. This warning applies of course to academics, but also to policy advocates and, perhaps most importantly of all, to education journalists.

Here are some common types of misused or unhelpful NAEP analyses to look out for and avoid. I think this is pretty comprehensive, but let me know in the comments or on Twitter if I’ve forgotten anything.

  • Pre-post comparisons involving the whole nation or a handful of individual states to claim causal evidence for particular policies. This approach is used by both proponents and opponents of current reforms (including, sadly, our very own outgoing Secretary of Education). Simply put, while it’s possible to approach causal inference using NAEP data, that’s not accomplished by taking pre-post differences in a couple of states and calling it a day. You need to have sophisticated designs that look at changes in trends and levels and that attempt to poke as many holes as possible in their results before claiming a causal effect.
  • Cherry-picked analyses that focus only on certain subjects or grades rather than presenting the complete picture across subjects and grades. This is most often employed by folks with ideological agendas (using 12th grade data, typically), but it’s also used by prominent presidential candidates who want to argue their reforms worked. Simply put, if you’re going to present only some subjects and grades and not others, you need to offer a compelling rationale for why.
  • Correlational results that look at levels of NAEP scores and particular policies (e.g., states that have unions have higher NAEP scores, states that score better on some reformy charter school index have lower NAEP scores). It should be obvious why correlations of test score levels are not indicative of any kinds of causal effects given the tremendous demographic and structural differences across states that can’t be controlled in these naïve analyses.
  • Analyses that simply point to low proficiency levels on NAEP (spoiler alert: the results will show many kids are not proficient in all subjects and grades) to say that we’re a disaster zone and a) the whole system needs to be blown up or b) our recent policies clearly aren’t working.
  • (Edit, suggested by Ed Fuller) Analyses that primarily rely on percentages of students at various performance levels, instead of using the scale scores, which are readily available and provide much more information.
  • More generally, “research” that doesn’t even attempt to account for things like demographic changes in states over time (hint: these data are readily available, and analyses that account for demographic changes will almost certainly show more positive results than those that do not).

Having ruled out all of your favorite kinds of NAEP-related fun, what kind of NAEP reporting and analysis would I say is appropriate immediately after the results come out?

  • Descriptive summaries of trends in state average NAEP scores, not just across two NAEP waves but across multiple waves, grades, and subjects. These might be used to generate hypotheses for future investigation but should not (ever (no really, never)) be used naively to claim some policies work and others don’t.
  • Analyses that look at trends for different subgroups and the narrowing or closing of gaps (while noting that some of the category definitions change over time).
  • Analyses that specifically point out that it’s probably too early to examine the impact of particular policies we’d like to evaluate and that even if we could, it’s more complicated than taking 2015 scores and subtracting 2013 scores and calling it a day.

The long and the short of it is that any stories that come out in the weeks after NAEP scores are released should be, at best, tentative and hypothesis-generating (as opposed to definitive and causal effect-claiming). And smart people should know better than to promote inappropriate uses of these data, because folks have been writing about this kind of misuse for quite a while now.

Rather, the kind of NAEP analysis that we should be promoting is the kind that’s carefully done, that’s vetted by researchers, and that’s designed in a way that brings us much closer to the causal inferences we all want to make. It’s my hope that our work in the C-SAIL center will be of this type. But you can bet our results won’t be out the day the NAEP scores hit. That kind of thoughtful research designed to inform rather than mislead takes more than a day to put together (but hopefully not so much time that the results cannot inform subsequent policy decisions). It’s a delicate balance, for sure. But everyone’s goal, first and foremost, should be to get the answer right.

Any way you slice it, PDK’s results on Common Core are odd

I’ve written previously about recent polling on Common Core, noting that PDK/Gallup’s recent poll result on that topic is way out of whack with what other polls have found. One common argument you hear to explain this result is that PDK has a different wording than other polls. I always found this argument a little suspect, because I doubted that such hugely disparate results could be explained by the PDK wording (which, to me, seems relatively neutral).

In the 2015 PACE/USC Rossier poll, we designed questions to test the impact of Common Core poll wording on respondents’ views toward the standards. Specifically, we split our sample of 2400 four ways, randomly assigning each group one of four Common Core questions.

  1. To what extent do you approve or disapprove of the Common Core State Standards? (neutral wording)
  2. To what extent do you support or oppose having teachers in your community use the Common Core State Standards to guide what they teach? (PDK)
  3. As you may know, over the past few years states have been deciding whether or not to implement the Common Core State Standards, which are national standards for reading, writing, and math. In the states that have these standards, they will be used to hold public schools accountable for their performance. To what extent do you support or oppose the use of the Common Core Standards in California? (Education Next)
  4. A version of a PACE/USC Rossier legacy question that provides a pro- and an anti- CCSS explanation and asks respondents to pick one.

This design allows us to explicitly compare the results from wordings used in multiple national polls, and it also allows us to compare California-specific results to national figures. So, what did we learn?

First, we learned that the wordings used by Education Next and PDK did indeed affect the results they obtained. Using the Education Next wording, we saw support leading opposition 52/29. In contrast, using both the neutral wording (26/31) and the PDK wording (24/27), we saw support trailing opposition [1]. Clearly, how you word the question affects what results you get.

But second, we saw that the PDK results almost certainly cannot be entirely explained by question wording. To see how we reached this conclusion, consider the difference between the support we observed using the Education Next question and the results they saw: 52/29 vs. 49/35. Those results are quite close–just a few points difference on both support and opposition–and the difference is likely attributable to the fact that California voters are more liberal than national averages and the state has seen less Common Core controversy than some others.

In contrast, our results using the PDK wording are wildly different from the results PDK reported: 24/27 vs. 24/54. Those results are substantially different in two main ways. First, many more people offered a response to this question on the PDK poll than on our poll, suggesting more people feel informed enough to opine in their sample (probably marginal people who know quite little about the topic). Second, while the proportion supporting is the same, the proportion opposing is twice as high (!) in the PDK poll sample.
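A quick back-of-the-envelope check shows sampling error alone can't plausibly produce a gap that size. Our PDK-wording arm was roughly 2400/4 = 600 respondents; PDK's sample size is an assumption here (I use ~1,000 for illustration, not a figure from their documentation):

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """Two-proportion z statistic under a pooled null hypothesis."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Opposition: 54% in PDK's poll (assumed n ~ 1,000) vs. 27% in our
# PDK-wording arm (n ~ 600). The z statistic comes out above 10,
# far beyond the ~2 threshold for conventional significance.
z = two_prop_z(0.54, 1000, 0.27, 600)
print(f"z = {z:.1f}")
```

Under any reasonable assumption about PDK's sample size, a 27-point gap in opposition between two samples this large is wildly inconsistent with chance, which is why I look to explanations beyond wording.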

How could it be that our results differed from EdNext’s by just a few points but differed from PDK’s by 27 points? I think these results suggest that question wording alone cannot fully explain these differences. So what are the possible explanations? I see two most likely:

First, it’s possible there’s something wrong with PDK’s sample or pollster. Though Gallup has a strong national reputation, they’ve been criticized in the past by some notable polling experts. It could be that those problems are occurring here, too.

Second, it’s possible that something about the ordering of questions is affecting support on the PDK poll. In particular, PDK asked 9 questions about standardized tests before they got to the Common Core question (at least, to the extent that I can discern their ordering from their released documentation). In contrast, we asked neutral right track/wrong track questions about the governor, the president, and California schools, and Education Next asked about support for schools, topics covered in schools, and school spending. Perhaps that ordering had something to do with the results.

Either way, I think these results add further support to the conclusion that PDK’s results (certainly on Common Core, but probably in general) shouldn’t be taken as gospel. Quite the contrary; they’re an outlier, and their results should be treated as such until they demonstrate findings more in line with what we know about public opinion.


[1] I wasn’t expecting PDK to be as close to the neutral result as they were.