New evidence that textbooks matter

It’s been six months since I’ve written here. My apologies. In the meantime I’ve written a few pieces elsewhere, such as:

  • Here and here on the problems of “percent proficient” as a measure of school performance. The feds seem to have listened to our open letter, as they are allowing states to use performance indices (and perhaps some transformation of scale scores, though there seems to be disagreement on this point) in school accountability.
  • Here and here on public opinion on education policy and an agenda for the incoming administration (admittedly, written when I thought the incoming administration would be somewhat different than the one that’s shaping up).
  • Here describing just how “common” Common Core states’ standards are.
  • Here discussing challenges with state testing and a path forward.

The main project on which I continue to work, however, is the textbook research. We are out with our first working paper (a version of which was just recently accepted for publication in AERA Open), and a corresponding brief through Brookings’ Evidence Speaks series (to which I am now a contributor).

You should check out the brief and the paper, but the short version of the findings is that we once again identify one textbook, Houghton Mifflin California Math, as producing larger achievement gains than the other most commonly adopted textbooks in California during the period 2008-2013. These gains are in the range of 0.05 to 0.10 standard deviations, and they persist across multiple grades and years (ours is the longest study we are aware of on this topic). The gains may seem modest, but it is important to remember that they accrue to all students in these grades. Thus, for a policy that targets only low-achieving students to achieve the same total effect on achievement, its impact on those students would have to be much larger. And of course, as we’ve written elsewhere, the marginal cost of choosing this particular textbook over any other is close to zero (though we could not actually find price lists for the books under study, we are confident this is the case).

We are excited to have the paper out there after years (literally) of work just pulling the data together. I also presented the results in Sacramento and am optimistic that states may start to listen to the steadily growing drumbeat on the importance of collecting and analyzing data on textbook adoptions.




A letter to the U.S. Department of Education (final signatory list)

This is the final version of the letter, which I submitted today.


July 22, 2016


The Honorable John King

Secretary, U.S. Department of Education

400 Maryland Avenue, SW

Washington, D.C. 20202


Dear Mr. Secretary:

The Every Student Succeeds Act (ESSA) marks a great opportunity for states to advance accountability systems beyond those from the No Child Left Behind (NCLB) era. The Act (Section 1111(c)(4)(B)(i)(I)) requires states to use an indicator of academic achievement that “measures proficiency on the statewide assessments in reading/language arts and mathematics.” The proposed rulemaking (§ 200.14) would clarify this statutory provision to say that the academic achievement indicator must “equally measure grade-level proficiency on the reading/language arts and mathematics assessments.”

We write this letter to argue that the Department of Education should not mandate the use of proficiency rates as a metric of school performance under ESSA. That is, states should not be limited to measuring academic achievement using performance metrics that focus only on the proportion of students who are grade-level proficient; rather, they should be encouraged, or at a minimum allowed, to use performance metrics that account for student achievement at all levels, provided the state defines what performance level represents grade-level proficiency on its reading/language arts and mathematics assessments.

Moving beyond proficiency rates as the sole or primary measure of school performance has many advantages. For example, a narrow focus on proficiency rates incentivizes schools to focus on those students near the proficiency cut score, while an approach that takes into account all levels of performance incentivizes a focus on all students. Furthermore, measuring performance using the full range of achievement provides additional and useful information for parents, practitioners, researchers, and policymakers for the purposes of decisionmaking and accountability, including more accurate information about the differences among schools.

Reporting performance in terms of the percentage above proficient is problematic in several important ways. Percent proficient:

  1. Incentivizes schools to focus only on students around the proficiency cutoff rather than all students in a school (Booher-Jennings, 2005; Neal & Schanzenbach, 2010). This can divert resources from students at lower or higher points in the achievement distribution, some of whom may need as much support as, or more than, students just around the proficiency cut score (Schwartz, Hamilton, Stecher, & Steele, 2011). This has been shown to influence which students in a state benefit (i.e., experience gains in their academic achievement) from accountability regulations (Neal & Schanzenbach, 2010).
  2. Encourages teachers to focus on bringing students to a minimum level of proficiency rather than continuing to advance student learning to higher levels of performance beyond proficiency.
  3. Is not a reliable measure of school performance. For example, percent proficient is an inappropriate measure of progress over time because changes in proficiency rates are unstable and measured with error (Ho, 2008; Linn, 2003; Kane & Staiger, 2002). The percent proficient is also dependent upon the state-determined cut score for proficiency on annual assessments (Ho, 2008), which varies from state to state and over time. Percent proficient further depends on details of the testing program that should not matter, such as the composition of the items on the state test or the method used to set performance standards. These problems are compounded in small schools or in small subgroups.
  4. Is a very poor measure of performance gaps between subgroups, because percent proficient is affected by how the proficiency cut score on the state assessments is chosen (Ho, 2008; Holland, 2002). Indeed, prior research suggests that using percent proficient can even reverse the sign of changes in achievement gaps over time relative to what a more accurate method would show (Linn, 2007).
  5. Penalizes schools that serve larger proportions of low-achieving students (Kober & Riddle, 2012), as schools are not given credit for improvements in performance other than the move from not proficient to proficient.

We suggest two practices for measuring achievement that lessen or avoid these problems. Importantly, some of these practices were utilized by states under ESEA Flexibility Waivers and are improvements over NCLB practices (Polikoff, McEachin, Wrabel, & Duque, 2014).

Average Scale Scores

The best approach for measuring student achievement levels for accountability purposes under ESSA is to use average scale scores. Rather than presenting performance as the proportion of students who have met the minimum-proficiency cut score, states could present the average (mean) score of students within the school and the average performance of each subgroup of students. If the Department believes percent proficient is also important for reporting purposes, these values could be reported alongside the average scale scores.

The use of mean scores places the focus on improving the academic achievement of all students within a school, not just those whose performance is around the state proficiency cut score (Center on Education Policy, 2011). Such a practice also increases the amount of variation in school performance measures each year, providing improved differentiation between schools that may have otherwise similar proficiency rates. In fact, Ho (2008) argues that if a single rating is going to be used for reporting on performance, it should be a measure of average performance, because such measures incorporate the value of every score (student) into the calculation and the average can be used for more advanced analyses. The measurement of gaps between key demographic groups of students, a key goal of ESSA, is dramatically improved with the use of average scores rather than the proportion of proficient students (Holland, 2002; Linn, 2007).
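
To make the differentiation point concrete, here is a minimal sketch with hypothetical scale scores (the cut score and all numbers are invented for illustration, not drawn from any state assessment):

```python
# Hypothetical scale scores for two schools; assume a proficiency cut score of 300.
# Both schools have the same proficiency rate, but their mean scores differ sharply.
CUT_SCORE = 300

school_a = [250, 280, 295, 310, 330, 350]  # low scorers cluster near the cut
school_b = [150, 180, 295, 305, 310, 315]  # low scorers are far below the cut

def percent_proficient(scores, cut=CUT_SCORE):
    """Percent of students at or above the proficiency cut score."""
    return 100 * sum(s >= cut for s in scores) / len(scores)

def mean_score(scores):
    """Average scale score, which reflects every student's performance."""
    return sum(scores) / len(scores)

# Identical proficiency rates (50% each)...
print(percent_proficient(school_a), percent_proficient(school_b))  # 50.0 50.0

# ...but very different average performance.
print(mean_score(school_a))  # 302.5
print(round(mean_score(school_b), 1))  # 259.2
```

The mean distinguishes the two schools because every score enters the calculation, whereas the proficiency rate discards all information about how far students sit from the cut.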

Proficiency Indexes

If average scale scores cannot be used, a weaker alternative that is still superior to percent proficient would be to allow states to use proficiency indexes. Schools under this policy would be allocated points based on multiple levels of performance. For example, a state could identify four levels of performance on annual assessments: Well Below Proficient, Below Proficient, Proficient, and Advanced Proficient. Schools receive no credit for students Well Below Proficient, partial credit for students who are Below Proficient, full credit for students reaching Proficiency, and additional credit for students reaching Advanced Proficiency. Here we present an example using School A and School B.

Proficiency Index Example
                          -------- School A --------   -------- School B --------
Proficiency Category      Points/    # of     Index    Points/    # of     Index
                          Student  Students  Points    Student  Students  Points
Well Below Proficient       0.0       27       0.0       0.0       18       0.0
Below Proficient            0.5       18       9.0       0.5       27      13.5
Proficient                  1.0       33      33.0       1.0       26      26.0
Advanced Proficient         1.5       22      33.0       1.5       29      43.5
Total                                100      75.0                100      83.0

NCLB Proficiency Rate:  School A: 55%   School B: 55%
ESSA Proficiency Index: School A: 75    School B: 83

Under NCLB proficiency rate regulations, both School A and School B would have received a 55% proficiency rate score. Using a proficiency index, the performance of these schools would no longer be identical: a state would be able to compare the two schools while meaningfully differentiating the annual performance of School A from that of School B. The hypothetical case presented here is not the only way a proficiency index can be used. Massachusetts is one example of a state that has used a proficiency index for the purposes of identifying low-performing schools and gaps between subgroups of students (see ESEA Flexibility Request: Massachusetts, page 32). These indexes are understandable for practitioners, family members, and administrators while also providing additional information about the performance of students who are not grade-level proficient.
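
The index in the example is simply a weighted count of students. A short sketch, using the weights and enrollment counts from the table above:

```python
# Points awarded per student at each performance level, as in the example above.
POINTS = {
    "Well Below Proficient": 0.0,
    "Below Proficient": 0.5,
    "Proficient": 1.0,
    "Advanced Proficient": 1.5,
}

def proficiency_index(counts):
    """Sum of (points per student) x (number of students) across levels."""
    return sum(POINTS[level] * n for level, n in counts.items())

def proficiency_rate(counts):
    """Percent of students at Proficient or above (the NCLB-style rate)."""
    proficient = counts["Proficient"] + counts["Advanced Proficient"]
    return 100 * proficient / sum(counts.values())

school_a = {"Well Below Proficient": 27, "Below Proficient": 18,
            "Proficient": 33, "Advanced Proficient": 22}
school_b = {"Well Below Proficient": 18, "Below Proficient": 27,
            "Proficient": 26, "Advanced Proficient": 29}

# Same NCLB rate (55%), but different ESSA indexes (75 vs. 83).
print(proficiency_rate(school_a), proficiency_index(school_a))  # 55.0 75.0
print(proficiency_rate(school_b), proficiency_index(school_b))  # 55.0 83.0
```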

The benefit of using such an index, relative to using the proportion of proficient students in a school, is that it incentivizes a focus on all students, not just those around an assessment’s proficiency cut score (Linn, Baker, & Betebenner, 2002). Moreover, schools with large proportions of students well below the proficiency cut score are given credit for moving students to higher levels of performance even if those students remain below the cut score (Linn, 2003). The use of a proficiency index, or otherwise providing schools credit for students at different points in the achievement distribution, improves the construct validity of the accountability measures over the NCLB proficiency rate measures (Polikoff et al., 2014). In other words, the inferences made about schools (e.g., low-performing or bottom 5%) using the proposed measures are more appropriate than those made using proficiency rates alone.

What We Recommend

Given the findings cited above, we believe the Department of Education should revise its regulations to one of two positions:

  • Explicitly endorsing or encouraging states to use one of the two above-mentioned approaches as an alternative to proficiency rates as the primary measure of school performance. Of the two, average scale scores are the superior method.
  • Failing that, clarifying that the law is neutral about the use of proficiency rates versus one of the two above-mentioned alternatives to proficiency rates as the primary measure of school performance.

With the preponderance of evidence showing that schools and teachers respond to incentives embedded in accountability systems, we believe option 1 is the best choice. This option leaves states the authority to measure school performance as they see fit but encourages them to incorporate what we have learned through research about the most accurate and appropriate ways to measure school performance levels.

Our Recommendation is Consistent with ESSA

Section 1111(c)(4)(A) of ESEA, as amended by ESSA, requires each state to establish long-term goals:

“(i) for all students and separately for each subgroup of students in the State—

(I) for, at a minimum, improved—

(aa) academic achievement, as measured by proficiency on the annual assessments required under subsection (b)(2)(B)(v)(I);”

And Section 1111(c)(4)(B) of ESEA requires the State accountability system to have indicators that are used to differentiate all public schools in the State, including—(i) “academic achievement—(I) as measured by proficiency on the annual assessments required [under other provisions of ESSA].”

Our suggested approach is supportable under these provisions based on the following analysis. The above-quoted provisions in the law that mandate long-term goals and indicators of student achievement based on proficiency on annual assessments do not prescribe how a state specifically uses the concept of proficient performance on the state assessments. The statute does not prescribe that “proficiency” be interpreted to compel differentiation of schools based exclusively on “proficiency rates.” Proficiency is commonly taken to mean “knowledge” or “skill” (Merriam-Webster defines it as “advancement in knowledge or skill” or “the quality or state of being proficient,” where “proficient” is defined as “well advanced in an art, occupation, or branch of knowledge”). Under either of these definitions, an aggregate performance measure such as the two options described above would clearly qualify as a measure of proficiency. Both of the above-mentioned options provide more information about the average proficiency level of a school than an aggregate proficiency rate does. Moreover, they address far more effectively than proficiency rates the core purposes of ESSA, including incentivizing more effective efforts to educate all children and providing broad discretion to states in designing their accountability systems.

We would be happy to provide more information on these recommendations at your convenience.


Morgan Polikoff, Ph.D., Associate Professor of Education, USC Rossier School of Education


Educational Researchers and Experts

Alice Huguet, Ph.D., Postdoctoral Fellow, School of Education and Social Policy, Northwestern University

Andrew Ho, Ph.D., Professor of Education, Harvard Graduate School of Education

Andrew Saultz, Ph.D., Assistant Professor, Miami University (Ohio)

Andrew Schaper, Ph.D., Senior Associate, Basis Policy Research

Anna Egalite, Ph.D., Assistant Professor of Education, North Carolina State University

Arie van der Ploeg, Ph.D., retired Principal Researcher, American Institutes for Research

Cara Jackson, Ph.D., Assistant Director of Research & Evaluation, Urban Teachers

Christopher A. Candelaria, Ph.D., Assistant Professor of Public Policy and Education, Vanderbilt University

Cory Koedel, Ph.D., Associate Professor of Economics and Public Policy, University of Missouri

Dan Goldhaber, Ph. D., Director, Center for Education Data & Research, University of Washington Bothell

Danielle Dennis, Ph.D., Associate Professor of Literacy Studies, University of South Florida

Daniel Koretz, Ph.D., Henry Lee Shattuck Professor of Education, Harvard Graduate School of Education

David Hersh, Ph.D. Candidate, Rutgers University Bloustein School of Planning and Public Policy

David M. Rochman, Research and Program Analyst, Moose Analytics

Edward J. Fuller, Ph.D., Associate Professor of Education Policy, The Pennsylvania State University

Eric A. Houck, Associate Professor of Educational Leadership and Policy, University of North Carolina at Chapel Hill

Eric Parsons, Ph.D., Assistant Research Professor, University of Missouri

Erin O’Hara, former Assistant Commissioner for Data & Research, Tennessee Department of Education

Ethan Hutt, Ph.D., Assistant Professor of Education, University of Maryland College Park

Eva Baker, Ed.D., Distinguished Research Professor, UCLA Graduate School of Education and Information Studies, Director, Center for Research on Evaluation, Standards, and Student Testing, Past President, American Educational Research Association

Greg Palardy, Ph.D., Associate Professor, University of California, Riverside

Heather J. Hough, Ph.D., Executive Director, CORE-PACE Research Partnership

Jason A. Grissom, Ph.D., Associate Professor of Public Policy and Education, Vanderbilt University

Jeffrey Nellhaus, Ed.M., Chief of Assessment, Parcc Inc., former Deputy Commissioner, Massachusetts Department of Elementary and Secondary Education

Jeffrey W. Snyder, Ph.D., Assistant Professor, Cleveland State University

Jennifer Vranek, Founding Partner, Education First

John A. Epstein, Ed.D., Education Associate Mathematics, Delaware Department of Education

John Q. Easton, Ph.D., Vice President, Programs, Spencer Foundation, former Director, Institute of Education Sciences

John Ritzler, Ph.D., Executive Director, Research & Evaluation Services, South Bend Community School Corporation

Jonathan Plucker, Ph.D., Julian C. Stanley Professor of Talent Development, Johns Hopkins University

Joshua Cowen, Ph.D., Associate Professor of Education Policy, Michigan State University

Katherine Glenn-Applegate, Ph.D., Assistant Professor of Education, Ohio Wesleyan University

Linda Darling-Hammond, Ed.D., President, Learning Policy Institute, Charles E. Ducommun Professor of Education Emeritus, Stanford University, Past President, American Educational Research Association

Lindsay Bell Weixler, Ph.D., Senior Research Fellow, Education Research Alliance for New Orleans

Madeline Mavrogordato, Ph.D., Assistant Professor, K-12 Educational Administration, Michigan State University

Martin R. West, Ph.D., Associate Professor, Harvard Graduate School of Education

Matt Chingos, Ph.D., Senior Fellow, Urban Institute

Matthew Di Carlo, Ph.D., Senior Fellow, Albert Shanker Institute

Matthew Duque, Ph.D., Data Strategist, Baltimore County Public Schools

Matthew A. Kraft, Ed.D., Assistant Professor of Education and Economics, Brown University

Michael H. Little, Royster Fellow and Doctoral Student, University of North Carolina at Chapel Hill

Michael Hansen, Ph.D., Senior Fellow and Director, Brown Center on Education Policy, Brookings Institution

Michael J. Petrilli, President, Thomas B. Fordham Institute

Nathan Trenholm, Director of Accountability and Research, Clark County (NV) School District

Tiên Lê, Doctoral Fellow, USC Rossier School of Education

Raegen T. Miller, Ed.D., Research Fellow, Georgetown University

Russell Brown, Ph.D., Chief Accountability Officer, Baltimore County Public Schools

Russell Clement, Ph.D., Research Specialist, Broward County Public Schools

Sarah Reckhow, Ph.D., Assistant Professor of Political Science, Michigan State University

Sean P. “Jack” Buckley, Ph.D., Senior Vice President, Research, The College Board, former Commissioner of the National Center for Education Statistics

Sherman Dorn, Ph.D., Professor, Mary Lou Fulton Teachers College, Arizona State University

Stephani L. Wrabel, Ph.D., USC Rossier School of Education

Thomas Toch, Georgetown University

Tom Loveless, Ph.D., Non-resident Senior Fellow, Brookings Institution


K-12 Educators

Alexander McNaughton, History Teacher, YES Prep Charter School, Houston, TX

Andrea Wood Reynolds, District Testing Coordinator, Northside ISD, TX

Angela Atkinson Duina, Ed.D., Title I School Improvement Coordinator, Portland Public Schools, ME

Ashley Baquero, J.D., English/Language Arts Teacher, Durham, NC

Brett Coffman, Ed.S., Assistant Principal, Liberty High School, MO

Callie Lowenstein, Bilingual Teacher, Washington Heights Expeditionary Learning School, NY

Candace Burckhardt, Special Education Coordinator, Indigo Education

Daniel Gohl, Chief Academic Officer, Broward County Public Schools, FL

Danielle Blue, M.Ed., Director of Preschool Programming, South Kingstown Parks and Recreation, RI

Jacquline D. Price, M.Ed., County School Superintendent, La Paz County, AZ

Jennifer Taubenheim, Elementary Special Education Teacher, Idaho Falls, ID

Jillian Haring, Staff Assistant, Broward County Public Schools, FL

Juan Gomez, Middle School Math Instructional Coach Carmel High School, Carmel, CA

Mahnaz R. Charania, Ph.D., GA

Mary F. Johnson, MLS, Ed.D., Retired school librarian

MaryEllen Falvey, M.Ed, NBCT, Office of Academics, Broward County Public Schools, FL

Meredith Heikes, 6th grade STEM teacher, Quincy School District, WA

Mike Musialowski, M.S., Math/Science Teacher, Taos, NM

Misty Pier, Special Education Teacher, Eagle Mountain Saginaw ISD, TX

Nell L. Forgacs, Ed.M., Educator, MA

Oscar Garcia, Social Studies Teacher, El Paso Academy East, TX

Patricia K. Hadley, Elementary School Teacher, Retired, Twin Falls, ID

Samantha Arce, Elementary Teacher, Phoenix, AZ

Theodore A. Hadley, High School/Middle School Teacher, Retired, Twin Falls, ID
Tim Larrabee, M.Ed., MAT, Upper Elementary Teacher, American International School of Utah

Troy Frystak, 5/6 Teacher, Springwater Environmental Sciences School, OR


Other Interested Parties

Arnold F. Shober, Ph.D., Associate Professor of Government, Lawrence University

Celine Coggins, Ph. D., Founder and CEO, Teach Plus

David Weingartner, Co-Chair Minneapolis Public Schools 2020 Advisory Committee

Joanne Weiss, former chief of staff to U.S. Secretary of Education Arne Duncan

Justin Reich, EdD, Executive Director, Teaching Systems Lab, MIT

Karl Rectanus, CEO, Lea(R)n, Inc.

Kenneth R. DeNisco, Ph.D., Associate Professor, Physics & Astronomy, Harrisburg Area Community College

Kimberly L. Glass, Ph.D., Pediatric Neuropsychologist, The Stixrud Group

Mark Otter, COO, VIF International Education

Patrick Dunn, Ph.D., Biomedical Research Curator, Northrop Grumman TS

Robert Rothman, Education Writer, Washington, DC

Steven Gorman, Ph.D., Program Manager, Academy for Lifelong Learning, LSC-Montgomery

Torrance Robinson, CEO, trovvit


Booher-Jennings, J. (2005). Below the bubble: “Educational triage” and the Texas accountability system. American Educational Research Journal, 42(1), 231–268.

Center on Education Policy. (2011, May 3). An open letter from the Center on Education Policy to the SMARTER Balanced Assessment Consortium and the Partnership for Assessment of Readiness for College and Career. Retrieved from

Ho, A. D. (2008). The problem with “proficiency”: Limitations of statistics and policy under No Child Left Behind. Educational Researcher, 37(6), 351–360.

Holland, P. W. (2002). Two measures of change in the gaps between the CDFs of test-score distributions. Journal of Educational and Behavioral Statistics, 27(1), 3–17.

Kane, T. J., & Staiger, D. O. (2002). The promise and pitfalls of using imprecise school accountability measures. Journal of Economic Perspectives, 16(4), 91–114.

Kober, N., & Riddle, W. (2012). Accountability issues to watch under NCLB waivers. Washington, DC: Center on Education Policy.

Linn, R. L. (2003). Accountability: Responsibility and reasonable expectations. Educational Researcher, 32(7), 3–13.

Linn, R. L. (2007). Educational accountability systems. Paper presented at the CRESST Conference: The Future of Test-Based Educational Accountability.

Linn, R. L., Baker, E. L., & Betebenner, D. W. (2002). Accountability systems: Implications of requirements of the No Child Left Behind Act of 2001. Educational Researcher, 31(6), 3–16.

Neal, D., & Schanzenbach, D. W. (2010). Left behind by design: Proficiency counts and test-based accountability. Review of Economics and Statistics, 92, 263–283.

Ng, H. L., & Koretz, D. (2015). Sensitivity of school-performance ratings to scaling decisions. Applied Measurement in Education, 28(4), 330–349.

Polikoff, M. S., McEachin, A., Wrabel, S. L., & Duque, M. (2014). The waive of the future? School accountability in the waiver era. Educational Researcher, 43(1), 45–54.

Schwartz, H. L., Hamilton, L. S., Stecher, B. M., & Steele, J. L. (2011). Expanded measures of school performance. Santa Monica, CA: The RAND Corporation.






More evidence that the test matters

Well, it’s been two months since my last post. In those two months, a lot has happened. I’ve continued digging into the textbook adoption data (this was covered on an EdWeek blog and I also wrote about it for Brookings). Fordham also released their study of the content and quality of next-generation assessments, on which I was a co-author (see my parting thoughts here). Finally, just last week I was granted tenure at USC. So I’ve been busy and haven’t written here as much as I should.

Today I’m writing about a new article of mine that’s just coming out in Educational Assessment (if you want a copy, shoot me an email). This is the last article I’ll write using the Measures of Effective Teaching data (I previously wrote here and here using these data). This paper asks a very simple question: looking across the states in the MET sample, is there evidence that the correlations of observational and student survey measures with teacher value-added vary systematically? In other words, are the tests used in these states differentially sensitive to these measures of instructional quality?

This is an important question for many reasons. Most obviously, we are using both value-added scores and instructional quality measures (observations, surveys) for an increasingly wide array of decisions, both high- and low-stakes. For any kind of decision we want to make, we want to be able to confidently say that the assessments used for value-added are sensitive to the kinds of instructional practices we think of as being “high quality.” Otherwise, for instance, it is hard to imagine how teachers could be expected to improve their value-added through professional development opportunities (i.e., if no observed instructional measures predict value-added, how can we expect teachers to improve their value added?). The work is also important because, to the extent that we see a great deal of variation across states/tests in sensitivity to instruction, it may necessitate greater attention to the assessments themselves in both research and policy [1]. As I argue in the paper, the MET data are very well suited to this kind of analysis, because there were no stakes (and thus limited potential for gaming).

The methods for investigating the question are straightforward: basically, I correlate or regress value-added estimates from the MET study on teacher observation scores and student survey scores, separately by state. Where I find limited or no evidence of relationships, I dig in further by doing things like pulling out outliers, exploring nonlinear relationships, and estimating relationships at the subscale or grade level.
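
A rough sketch of that per-state analysis, with invented numbers and illustrative column names (the actual MET variables differ, and the data are restricted):

```python
import pandas as pd

# Hypothetical teacher-level data; the values and variable names are
# illustrative only, not drawn from the restricted MET files.
df = pd.DataFrame({
    "state": ["1", "1", "1", "1", "2", "2", "2", "2"],
    "vam": [0.10, -0.05, 0.20, 0.02, 0.00, 0.15, -0.10, 0.05],
    "obs_score": [3.1, 2.8, 3.5, 3.0, 2.9, 3.2, 2.7, 3.0],
    "survey_score": [4.0, 3.6, 4.2, 3.9, 3.8, 4.1, 3.5, 3.9],
})

# Correlate value-added with each instructional-quality measure,
# separately by state -- the basic analysis described above.
for state, group in df.groupby("state"):
    r_obs = group["vam"].corr(group["obs_score"])
    r_survey = group["vam"].corr(group["survey_score"])
    print(f"state {state}: r(vam, obs) = {r_obs:.2f}, "
          f"r(vam, survey) = {r_survey:.2f}")
```

In the paper the analogous regressions also condition on district and section characteristics; this sketch shows only the bare per-state correlation step.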

What I find, and how that should be interpreted, probably depends on where you sit. I do find at least some correlations of value-added with observations and student surveys in each state and subject. However, there is a good deal of state-to-state variation. For instance, in some states, student surveys correlate with value-added as high as 0.28 [2], while in other states those correlations are negative (though not significantly different from zero).

Analyzing results at the subscale level–where observational and survey scores are probably most likely to be useful–does not help. Perhaps because subscales are much less reliable than total scores, there are very few statistically significant correlations of subscales with VAM scores, and these too differ by state. If this pattern were to hold in new teacher evaluation systems being implemented in the states, it would raise perplexing questions about what kinds of instruction these value-added scores were sensitive to.

Perhaps the worst offender in my data is state 4 in English language arts (I cannot name states due to data restrictions). For this state, there are no total score correlations of student surveys or any of the observational measures with teacher value-added. There is one statistically significant correlation at a single grade level, and there is also one statistically significant correlation for a single subscale on one observational instrument. But otherwise, the state ELA tests in this state seem to be totally insensitive to instructional quality as measured by the Framework for Teaching, the CLASS, and the ELA-specific PLATO (not to mention the Tripod student survey). Certainly it’s possible these tests could be sensitive to some other measures not included in MET, but it’s not obvious to me what those would be (nor is it obvious that new state systems will be implemented as carefully as MET was).

I conclude with extended implications for research and practice. I think this kind of work raises a number of questions, such as:

  1. What is it about the content of these tests that makes some sensitive and others not?
  2. What kind of instruction do we want our tests to be sensitive to?
  3. How sensitive is “sensitive enough?” That is, how big a correlation do we want or need between value-added and instructional measures?
  4. If we want to provide useful feedback to teachers, we need reliable subscores on observational measures. How can we achieve that?

I enjoyed writing this article, and it may well be the paper that took me longest from conception to submission. I hope you find it useful and that it raises additional questions about teacher evaluation moving forward. And I welcome your reactions (though I’m done with MET data, so if you want more analysis, I’m not your man)!


[1] The oversimplified but not-too-far-off summary of most value-added research is that it is almost completely agnostic to the test that’s used to calculate the VAM.

[2] I did not correct the correlations for measurement error, in contrast to the main MET reports.

An awful lot of districts don’t know what textbooks are used in their schools

That’s one of many takeaways of my textbook research so far. I guess to many people this is no surprise, but it seems crazy to me. Knowledge of what is going on inside schools strikes me as the most basic function of the district office. And yet I would estimate around 10% of the districts that have responded to my FOIA requests have said they have no documents listing the textbooks in use, and probably another 30-50% clearly had to create such a document to satisfy my request [1]. Instead, I get a lot of letters like this:

Thank you for using the [district name] FOIA Center.

The FOIA office has been advised by the appropriate departments that the records you seek are not kept in the normal course of business. That is, a full and complete list of all mathematics and science textbooks currently in use by grade and the year the textbook was first used. As written, this request is categorical and unduly burdensome in nature and would require extensive resources to both search for information, which would most likely require a manual school by school search, and analysis to determine the other data points you are seeking. For these reasons, [district] is denying this request pursuant to [state statute] and invites you to narrow your request to manageable proportions. If [district] does not receive a revised request from you within five (5) business days of this response, this request will be closed.

Apparently to many folks this kind of arrangement is just fine–school sites should be able to decide all this stuff themselves. I can buy the argument that schools should have autonomy over curriculum materials (though I doubt that’s very efficient or good for kids), but even if you believe that’s the case, shouldn’t the district at least track how their money is being spent?

This is one of the research questions that’s emerged over time as I’ve gone through this textbook project, and it’s something I’ll investigate just as soon as I finish this round of FOIAs. My hypothesis? I suspect Ilana Horn is right about the consequences of this kind of non-leadership by districts.

I hope we’re wrong, but I doubt it.


[1] Districts don’t actually have to do this under the letter of FOIA law. So I very much appreciate the efforts.

This study is based upon work supported by the National Science Foundation under Grant No. 1445654 and the Smith Richardson Foundation. Any opinions, findings, and conclusions or recommendations expressed in this study are those of the author(s) and do not necessarily reflect the views of the funders.

Recruiting teachers!


I’m looking to recruit a few teachers (9, specifically) to participate in a study to test survey measures of teachers’ instruction for use in a large national study of standards implementation.
Teachers who participate in the work will be asked to do three things:
  1. Complete a bi-weekly (every other week) log survey describing their instruction in either mathematics or ELA over the course of the spring semester in a target class. The first log will be in either the last week of January or the first week of February. All logs will be online. We expect the first log will take 45 minutes to an hour, but that teachers will get more efficient at completing the logs as they go through the study.
  2. Complete a year-end online survey asking questions similar to those on the bi-weekly log. This will be done at the end of the school year, likely in June.
  3. Turn in 2 weeks’ worth of (blank) assignments and assessments that are given to students in the target class at any point during the spring semester.
For participating in this activity, we will provide each teacher with a $325 e-gift card. Please note that all work should be done outside teachers’ regular work hours.
The only eligibility requirements are:
  1. Teach a mathematics or ELA class in grades K-12.
  2. Work in a public (including charter) or private school that does not require separate research approval.
If you are interested in participating, please email me at my last name at Please share widely!

The Reports of Common Core’s Death Have Been Greatly Exaggerated

This evening, I happened upon an article from the Associated Press noting that West Virginia’s State Board of Education had repealed Common Core in the state. (Note: Common Core had already been renamed the K-12 Next Generation Standards there.) The new standards, called the West Virginia College- and Career-Readiness Standards, are available here (ELA and mathematics). Another article presents the major changes as follows (with my snark in parentheses):

· Simplify the presentation of standards for teachers and parents (I guess sequential numbering is more simplified…?)

· Increase prevalence of problem-solving skills with a connection to college, careers and life-needed skills

· Align standards for more grade level appropriateness for all standards at all grade levels (No clue what this refers to… maybe the insertion of “with prompting and support” in a few K ELA standards?)

· Include clarifying examples within each standard to make them more relevant to learning (Most already had examples. A few standards do now have additional “instructional notes”)

· Include an introduction of foundational skills in ELA and mathematics to ensure mastery of content in future grade levels

· Include handwriting in grades K-4, and explicit mention of cursive writing instruction in grades 2-3 (Handwriting is great! Mandatory cursive remains an absurd policy.)

· Include an explicit mention for students to learn multiplication (times) tables by the end of grade 3

· Add standards specific to Calculus with the expectation of Calculus being available to all students (Yeah, no one is taking calculus in high school since Common Core.)

Increased emphasis on handwriting is indeed an addition, as are cursive and calculus; these are changes other states have made too. Multiplication tables, however, are not an addition. I don’t know where the myth came from that Common Core doesn’t require multiplication facts by the end of 3rd grade; it plainly does: “By the end of Grade 3, know from memory all products of two one-digit numbers.”

But if you actually go and read the new standards, they are almost verbatim the same as Common Core in most cases. In 3rd grade math, aside from the addition of “speed” to the requirement for fluency with times tables, the standards are Common Core (with two exceptions that I saw: West Virginia’s new standards sometimes add clarifying instructional notes, and West Virginia’s new standards replace the words “for example” with “e.g.”). Oh, and the standards have been renumbered, thus making crosswalks with textbooks or websites more complicated.

I know, as someone who actually likes Common Core and wants it to stick around, that I probably shouldn’t even be writing about this. I should probably sit quietly while the state attempts to pull a fast one on its populace. But this is *so* dumb that I felt obliged to say something.

It’s *so* dumb to waste even one cent of taxpayer money on Common Core commissions in state after state, each resulting in virtually identical standards to the much-loathed Common Core.

It’s *so* dumb to keep the same standards but renumber them, making things needlessly more complicated for teachers and providing absolutely no benefit.

It’s *so* dumb to rename the standards twice but leave the content unchanged, all in an attempt to fool the hysterical masses.

It’s *so* dumb to report on these kinds of changes as if they are “repeals” when they are nothing of the sort.

Rather than doing dumb things like these, here are some suggestions for how these situations might be better handled (admittedly these are probably naïve, because I’m blessed not to have to deal with crazy people for my livelihood):

  • If your citizens believe nonsensical things about Common Core that aren’t true, you should correct their misunderstandings. You should not feed those nonsensical beliefs for political gain [1].
  • If you think the standards are good enough to keep almost verbatim, then defend the standards rather than running from them. 
  • If you don’t think the standards are good enough to keep, then don’t keep them! Get smart people together and do a legitimate rewrite.
  • If you leave the standards open for public comment for months and virtually none of the comments you receive are based on any discernible evidence, the standards are probably pretty good.
  • If your citizens are so gullible that they will fall for such transparently obvious ploys, you’ve got problems with the gullibility of your citizenry (which might be mitigated with better standards and instruction).

So, West Virginia’s kids will still be learning Common Core standards come 2016. They’ll also be learning cursive (and a few of them will be learning calculus, which they would have anyway, because obviously). And what the citizens of the state will be learning–or would be, if they paid any attention to what’s happening–is that their government would rather lie to them for the sake of appeasement than defend its policies as what’s best for the state. For many reasons, that’s the wrong kind of lesson to be teaching.


[1] The first part of this sentence applies mostly to the right. The second part applies to both extremes of the political spectrum.

My visit to Success Academies

On Wednesday I had the pleasure of visiting Success Academy Harlem 1 and hearing from Eva Moskowitz and the SA staff about their model. I’m not going to venture into the thorny stuff about SA here. What I will say is that their results on state tests are clearly impressive, and that I doubt that they’re fully (or even largely) explained by the practices that cause controversy (and luckily we’ll soon have excellent empirical evidence to answer that question).

Instead, what I’m going to talk about briefly are the fascinating details I saw and heard about curriculum and instruction in SA schools. Of course, right now it is impossible to know what’s driving their performance, but these are some of the things I think are likely to contribute. [EDIT: I’d forgotten Charles Sahm wrote many of these same things in a post this summer. His is more detailed and based on more visits than mine. Read it!]

What I saw in my tour of about a half-dozen classrooms at SA 1:

  • The first thing I observed in each classroom is the intense focus on student discourse and explanation. In each classroom, students are constantly pressed to explain their reasoning, and other students respond constructively and thoughtfully to the arguments of their peers. This “pressing for mastery” is one of the key elements of the SA vision of excellence, as I later learned.
  • Students are incredibly organized and on-task. They sit quietly while others are speaking and then, when prompted by the teachers to begin discussion in pairs, they immediately turn and address the question at hand. I saw virtually no goofing off or inattention in the classes I observed, including in a Pre-K classroom. To facilitate this structure and organization, teachers used lots of timers: everything was timed, starting and stopping in exactly the amount of time the teacher indicated.
  • The actual math content I observed being taught was clearly on-grade according to Common Core. In a third grade classroom I saw students working on conceptual explanations of fraction equivalence for simple cases (2/3 = 4/6); this comes right out of the third grade standards. I later learned that there is a strong focus on both problem-solving ability and automaticity in SA classrooms.
  • We were walking around with the school’s principal, and it was clear that she spends a great deal of her time moving in and out of classrooms observing. More than a passive observer, she interjected with pedagogical suggestions for the teacher in almost every class we visited. The teachers all seemed used to this kind of advice, and they implemented it immediately.

What I heard from Eva and her staff about curriculum and instruction in SA schools:

  • The curricula they use are all created in-house. They evaluated a bunch of textbooks in each subject and found them all wanting, so they created their own materials.
  • The math materials are influenced by TERC and Contexts for Learning. They do not use off-the-shelf math textbooks because they find them all highly redundant (something I’ve found in the context of instruction as well), the apparent assumption from publishers being that kids won’t get it the first time (this was described as signifying publishers’ “low expectations”).
  • The ELA materials are based on close reading and analysis, and have been since the first SA school opened in 2006. The goals I heard were for students to 1) love literature and want to read, and 2) be able to understand what they’re reading. These goals are accomplished by a good deal of guided close reading instruction, child-chosen books (every classroom had a beautiful and well stocked library), and daily writing and revising in class. There seemed to be a clear and strong opposition to “skills-based” reading instruction.
  • The only off-the-shelf material that they use in ELA and mathematics is Success for All’s phonics curriculum, which is used in grades K and 1.
  • Every kid in elementary grades gets inquiry-based science instruction every day. They have dedicated science teachers for this. They also get art, sports, and chess in the elementary grades.
  • The curriculum is uniform across the schools in the network. Every teacher teaches the same content on the same day. The lessons are not scripted, however. The curricula are revised at the network level every year.
  • A typical lesson is 10 minutes of introduction with students on the floor, some of which will be teacher lecture and some of which will be discussion; 30 minutes of students working individually or with partners; and 10 minutes of wrap up and additional discourse. The goal for the whole day is less than 80 minutes of direct instruction.
  • Teachers get tons of training, and the training is largely oriented toward curriculum and instruction. They also get 2 periods of common planning time with other grade-level teachers per day, and an afternoon to work together on planning and training.
  • The new New York state math test was much derided as too easy and not actually indicating readiness for success in high school and beyond.
  • There is not nearly as much of a testing and data-driven culture as I expected in this kind of school. Testing seems to legitimately be a means to an end, and I didn’t get the sense that lots of instructional time was used up in testing. Rather, judgments about student readiness seemed to be largely qualitative.
  • The only tracking that currently happens in network schools is in mathematics starting in middle school, where there are two tracks (regular and advanced).

So, that’s what I saw and what I heard. From a C&I standpoint, the things that really stood out to me were a) the organization, which made things flow smoothly and diminished distractions, b) the common content across classrooms (created by network staff and teachers), coupled with time to plan and share results, c) the involvement of the school leader in constantly observing instruction, and d) the, frankly, much more “progressive” and “student-led” approach to instruction than I envisioned.

It was a fascinating experience that I hope others can have.

Hufflin Muffin (or the craziness of textbook data)

Sorry for the absence; it’s been a crazy month.

The main work keeping me away from here continues to be my research on school districts’ textbook adoptions. Recently, we had a nice breakthrough in Texas, where we discovered that the state keeps track of districts’ textbook purchases through disbursement and requisition reports. Great news! A couple FOIAs later and we were in business.

But like everything in the textbook research business, each step forward leads to two steps back.

The disbursement dataset we got from the state contains information on the publishers of the textbooks purchased by each Texas school district. Not the titles themselves, but the publishers. You’d think that a dataset like this might be standardized in some way. After all, there are only so many publishers out there. And if you’re going to the trouble of collecting the data as a state, you might want to have the data be easily usable (either by you or by researchers).

Well, you’d be wrong. Very, very wrong. My student, Shauna Campbell, has been cleaning up these data. Below I have copied the list of distinct entries in the “publishers” variable that we believe correspond to Houghton Mifflin Harcourt. There are 313 as of today’s counting, and we expect this number to go up. Some of our personal favorite spellings are in there.

What is the point of collecting data like this if it’s going to be so messy as to be almost unusable?

·       Hooughton Mifflin
·       Houfhron Mifflin Hartcourt
·       Houfhton Mifflin Harcourt
·       Houg
·       HOUG
·       hough
·       Hough Mifflin harcourt
·       Houghjton Mifflin Harcourt
·       Houghlin Mifflin Harcourt
·       Houghlton Mifflin Harcourt
·       Houghnton Mifflin Harcourt
·       Houghon Mifflin
·       Houghon Mifflin Harcourt
·       Houghrton Mifflin Harcourt
·       Hought Mifflin Hardcourt
·       Hought on Mifflin Hacourt
·       Houghtion Mifflin Harcourt
·       Houghtlon Mifflin Harcourt
·       Hought-Mifflin
·       Houghtn Mifflin
·       Houghtn Mifflin Harcourt
·       Houghtn Miffling Harcourt
·       Houghtno Mifflin Harcourt
·       Houghto Mifflin Harcourt
·       Houghtob Mifflin
·       Houghtom Mifflin
·       Houghtom Mifflin Harcourt
·       Houghtomn Mifflin Harcourt
·       Houghton
·       Houghton Mifflin
·       Houghton Mifflin Harcourt
·       Houghton MIfflin Harcourt
·       Houghton Mifflin Hardcort
·       Houghton & Mifflin
·       Houghton Harcourt Mifflin
·       Houghton Hiffin Harcourt
·       Houghton Hifflin
·       Houghton Hifflin Harcourt
·       Houghton Lifflin
·       Houghton McDougal
·       Houghton Mfflin Harcourt
·       Houghton Mfifflin
·       Houghton Miffen Harcourt
·       Houghton Miffflin Harcourt
·       Houghton Miffiin Harcourt
·       Houghton Miffilin
·       Houghton Miffilin Harcourt
·       Houghton Miffiln Harcourt
·       Houghton Miffin
·       Houghton Miffin Harcourt
·       Houghton Miffin Harcourt/Saxon Publis..
·       Houghton Mifflan
·       Houghton Mifflan Harcourt
·       Houghton Mifflein Harcourt
·       Houghton Miffliin Harcourt
·       Houghton Miffliln
·       Houghton Miffliln Harcouirt
·       Houghton Miffliln Harcourt
·       houghton mifflin
·       Houghton MIfflin
·       houghton Mifflin
·       Houghton mifflin
·       HOughton Mifflin
·       Houghton- Mifflin
·       Houghton Mifflin – Grade 1
·       Houghton Mifflin – Grade 2
·       Houghton Mifflin – Grade 3
·       Houghton Mifflin – Grade 4
·       Houghton MIfflin – Grade 4
·       Houghton Mifflin – Grade 5
·       Houghton Mifflin – Grade 6
·       Houghton MIfflin – Grade 6
·       Houghton Mifflin – Grade 7
·       Houghton Mifflin – Grade 8
·       Houghton Mifflin – Grade K
·       Houghton Mifflin Harcourt
·       Houghton Mifflin / Great Source
·       Houghton Mifflin and Harcourt
·       Houghton Mifflin Co
·       Houghton Mifflin Co.
·       Houghton Mifflin College Dic
·       Houghton Mifflin College Div
·       Houghton Mifflin Company
·       houghton mifflin company
·       Houghton Mifflin from Follett
·       Houghton Mifflin Geneva, IL 60134
·       Houghton Mifflin Grade 8
·       Houghton Mifflin Grt Souce ED Grp
·       Houghton Mifflin Hacourt
·       Houghton Mifflin Haecourt/Holt McDougal
·       Houghton Mifflin Haracourt
·       Houghton Mifflin Harccourt
·       Houghton Mifflin Harcocurt
·       Houghton Mifflin Harcort
·       Houghton Mifflin Harcount
·       Houghton Mifflin Harcour
·       Houghton Mifflin Harcourft
·       Houghton mifflin Harcourt
·       HOughton Mifflin Harcourt
·       Houghton Mifflin harcourt
·       Houghton mifflin harcourt
·       houghton Mifflin Harcourt
·       houghton Mifflin harcourt
·       Houghton Mifflin HArcourt
·       HOUGHTON Mifflin Harcourt
·       houghton mifflin Harcourt
·       Houghton Mifflin Harcourt – 9791300126
·       Houghton Mifflin Harcourt – 9791300173
·       Houghton Mifflin Harcourt – 9791300184
·       Houghton Mifflin Harcourt – Great Sou..
·       Houghton Mifflin Harcourt — Saxon
·       Houghton Mifflin Harcourt (Saxon)
·       Houghton Mifflin Harcourt (Steck Vaug..
·       Houghton Mifflin Harcourt (TEXTBOOK W..
·       Houghton Mifflin Harcourt / Holt McDo..
·       Houghton Mifflin Harcourt / Rigby
·       Houghton Mifflin Harcourt 9205 S. Par..
·       Houghton Mifflin Harcourt Achieve Pub..
·       Houghton Mifflin Harcourt Co
·       Houghton Mifflin Harcourt Co.
·       Houghton Mifflin Harcourt Depository
·       Houghton Mifflin Harcourt Great Source
·       Houghton Mifflin Harcourt -Holt
·       Houghton Mifflin Harcourt- Holt McDou..
·       Houghton Mifflin Harcourt- Holt McDoug
·       Houghton Mifflin Harcourt Holt McDoug..
·       Houghton Mifflin Harcourt Holt McDougal
·       Houghton Mifflin Harcourt Mifflin Har..
·       Houghton Mifflin Harcourt Publishing
·       Houghton Mifflin Harcourt Publishing ..
·       Houghton Mifflin Harcourt Publishing ..
·       Houghton Mifflin Harcourt Publishing Co
·       Houghton Mifflin Harcourt Rigby
·       Houghton Mifflin Harcourt Riverside
·       Houghton Mifflin Harcourt Saxon
·       Houghton Mifflin Harcourt School Publ..
·       Houghton Mifflin Harcourt Texas
·       Houghton Mifflin Harcourt/ Holt McDou..
·       Houghton Mifflin Harcourt/Great Source
·       Houghton Mifflin Harcourt/Holt
·       Houghton Mifflin Harcourt/Holt McDougal
·       Houghton Mifflin Harcourt/Saxon Publi..
·       Houghton Mifflin Harcourt-9791300126
·       Houghton Mifflin Harcourte
·       Houghton Mifflin HarcourtH
·       Houghton Mifflin HarcourtMH
·       Houghton Mifflin Harcourtq
·       Houghton Mifflin Harcourt-SHIPPING Co..
·       Houghton Mifflin Harcout
·       Houghton Mifflin Harcpurt
·       Houghton Mifflin Harcuort
·       Houghton Mifflin Harcurt
·       Houghton Mifflin Hardcourt
·       houghton mifflin hardcourt
·       houghton Mifflin Hardcourt
·       Houghton Mifflin Harocurt
·       Houghton Mifflin Harourt
·       Houghton Mifflin Harrcourt
·       Houghton Mifflin Hart Court
·       Houghton Mifflin Hartcourt
·       Houghton Mifflin Hartcourt Brace
·       Houghton Mifflin Holt Physics
·       Houghton Mifflin Holt Seventh Math
·       Houghton Mifflin Holt Sixth
·       Houghton Mifflin Holt Biology
·       Houghton Mifflin Holt Eighth
·       Houghton Mifflin Holt Eighth Math
·       Houghton Mifflin Holt Fifth
·       Houghton Mifflin Holt First
·       Houghton Mifflin Holt Fourth
·       Houghton Mifflin Holt IPC
·       Houghton Mifflin Holt Kindergarten
·       Houghton Mifflin Holt McDougal
·       Houghton Mifflin Holt Modern Chemistry
·       Houghton Mifflin Holt Second
·       Houghton Mifflin Holt Seventh
·       Houghton Mifflin Holt Sixth Math
·       Houghton Mifflin Holt Third
·       Houghton Mifflin Publishing
·       Houghton Mifflin Publishing Company
·       Houghton Mifflin School
·       Houghton Mifflin Science
·       Houghton Mifflin, Eds.
·       Houghton Mifflin, Indianapolis, IN 46..
·       Houghton Mifflin/Harcourt
·       Houghton Mifflin/Harcout
·       Houghton Mifflin/Holt McDougal
·       Houghton Mifflin/McDougal
·       Houghton Miffline
·       Houghton Miffline Harcourt
·       Houghton Miffline/Harcourt
·       Houghton Miffling Harcourt
·       Houghton MIffling Harcourt
·       Houghton Mifflin-Great Source
·       Houghton Mifflin-Great Source Rigby
·       Houghton MifflinHarcourt
·       Houghton MifflinHArcourt
·       houghton MifflinHarcourt
·       Houghton Mifflin-Harcourt
·       Houghton Mifflin-Holt McDougal
·       Houghton Mifflini Harcourt
·       Houghton Mifflinn Harcourt
·       Houghton Mifflin-using overage
·       Houghton Miffllin
·       Houghton Miffllin Harcourt
·       Houghton Miffln
·       Houghton Miffln Harcourt
·       Houghton Miflfin Harcourt
·       Houghton Miflin
·       Houghton Miflin Harcourt
·       Houghton Mifllin Harcourt
·       Houghton Migglin Harcourt
·       Houghton Miifflin Harcourt
·       Houghton Millflin
·       Houghton Mimfflin Harcourt
·       Houghton Misslin
·       Houghton Mofflin
·       Houghton Mufflin
·       Houghton Mufflin Company
·       Houghton Mufflin Harcourt
·       Houghton, Mifflin Harcort
·       Houghton, Mifflin HArcort
·       Houghton, Mifflin Harcourt
·       Houghton, Mifflin, and Harcourt
·       Houghton, Mifflin, Harcourt
·       Houghton-Miffliin Harcourt
·       Houghton-MIfflin
·       Houghton-Mifflin Co.
·       Houghton-Mifflin Company
·       Houghton-Mifflin Great Source Rigby
·       HoughtonMifflin Harcourt
·       Houghton-Mifflin Harcourt Senderos
·       Houghton-Mifflin/Great Source Rigby
·       Houghton-Mifflin/Harcourt
·       Houghton-Miffline Harcourt
·       HoughtonMifflinHarcourt
·       Houghton-Mifflin-Harcourt
·       Houghton-Mifflin-Harcourt;BMI;Lakesho..
·       HoughtonMifflinHarcourt-Holt McDougal
·       HoughtonMifflinHarcourt-Holt-McDougal
·       Houghton-Miflin
·       Houghtopn MIfflin Harcourt
·       HoughtotnMifflin
·       Houghtton Miffin Harcourt
·       Houghtton Mifflin Harcourt
·       Houghyon Mifflin Harcourt
·       Hougnton Mifflin Haracourt
·       hougnton Mifflin Harcourt
·       Hougthon MIfflin
·       Hougthon Mifflin Harcourt
·       Hougthon Mifflin harcourt
·       hougthon mifflin harcourt
·       Hougthon Mifflin Haroourt
·       Hougton
·       Hougton Mifflin
·       Hougton Mifflin Harcourrt
·       Hougton-Mifflin
·       Hougtton Mifflin Harcourt
·       Houhgton Mifflin Harcourt
·       Houhgton MIfflin Harcourt
·       Houhton Mifflin
·       Houhton Mifflin Harcourt
·       Houhton Mifflin Harcourt Saxon
·       Houlghton Mifflin Harcourt
·       Houlgthton Mifflin Harcourt
·       Houlgton Mifflin Harcourt
·       Hourghton Mifflin Harcourt
·       Hourhton Mifflin Harcourt
·       Houston Mifflin
·       Houtgton Mifflin Harcourt
·       Houthgon Mifflin Harcourt
·       Houthton Mifflin Harcourt
·       houthton mifflin harcourt
·       Houton-Mifflin Company
·       Hpoughton Mifflin Harcourt
·       Hpughton Mifflin Harcourt
·       Hughton Mifflin Harcourt
·       Huoghton Mifflin Harcourt
·       Harcourt Mifflin Harcourt
·       Harcourt/Houghton Mifflin
·       Saxon (Houghton Mifflin Harcourt)
·       Saxon / Houghton Mifflin 9205 S. Park..
·       Saxon HMH
·       Saxon Houghton Mifflin
·       Saxon Houghton Mifflin Harcourt
·       Saxon/Houghton Mifflin 9502 Sl Park C..
·       Saxon/Houghton Mifflin Harcourt
·       Saxon-H.M.H. refer to D000030293
·       Saxon-Houghton Mifflin
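
One way to tame a mess like this is to normalize each entry and fuzzy-match it against a canonical publisher name. Below is a minimal sketch using only Python’s standard library; the `looks_like_hmh` helper, the 0.8 threshold, and the truncation trick are illustrative choices for this post, not the procedure we actually used.

```python
import re
from difflib import SequenceMatcher

# Canonical normalized form we are matching against.
CANONICAL = "houghton mifflin harcourt"

def normalize(name: str) -> str:
    """Lowercase, replace punctuation/digits with spaces, collapse whitespace."""
    name = re.sub(r"[^a-z ]", " ", name.lower())
    return re.sub(r"\s+", " ", name).strip()

def looks_like_hmh(raw: str, threshold: float = 0.8) -> bool:
    """Flag an entry as a likely Houghton Mifflin Harcourt variant.

    Compares the normalized entry, and its truncation to the canonical
    string's length (to handle suffixes like "Publishing Co"), against
    the canonical name. The 0.8 threshold is an illustrative guess.
    """
    clean = normalize(raw)
    candidates = [clean, clean[: len(CANONICAL)]]
    return any(
        SequenceMatcher(None, c, CANONICAL).ratio() >= threshold
        for c in candidates
    )

# A few entries from the list above, plus a non-HMH publisher.
for entry in ["Houghtopn MIfflin Harcourt",
              "Houghton Mifflin Harcourt Publishing Co",
              "Houghton-Mifflin-Harcourt",
              "Pearson Education"]:
    print(f"{entry!r}: {looks_like_hmh(entry)}")
```

Even so, a pass like this only gets you most of the way: matches near the threshold need hand review, and entries like “Saxon HMH” still require a manual mapping.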

This study is based upon work supported by the National Science Foundation under Grant No. 1445654 and the Smith Richardson Foundation. Any opinions, findings, and conclusions or recommendations expressed in this study are those of the author(s) and do not necessarily reflect the views of the funders.

My quick thoughts on NAEP

By now you’ve heard the headlines–NAEP scores are down almost across the board, with the exception of 4th grade reading. Many states saw drops, some of them large. Here are my thoughts.

  1. These results are quite disappointing and shouldn’t be sugar-coated. Especially in mathematics, where we’ve seen literally two decades of uninterrupted progress, it’s (frankly) shocking to see declines like this. We’ve come to expect the slow-but-steady increase in NAEP scores, and this release should shake that complacency. That said, we should not forget the two decades of progress when thinking about this two-year dip (nor should we forget that we still have yawning opportunity and achievement gaps and vast swaths of the population unprepared for success in college or career).
  2. Some folks are out with screeds blaming the results on particular policies they don’t like (test-based accountability, unions, charters, Arne Duncan, etc.). Regardless of what the actual results were, they’d have made these same points, so these folks should be ignored in favor of actual research.[1] In general, people offering misNAEPery can be of only two types: (1) people who don’t know any better, or (2) people who know better but are shameless/irresponsible. Generally, anyone affiliated with a university who is blaming these results on particular policies at this point is highly likely to be in the latter camp.
  3. To a large extent, what actually caused these results (Common Core? Implementation? Teacher evaluation? Waivers? The economy? Opt-out? Something else?) is irrelevant in the court of public opinion. Perception is what matters. And the perception, fueled by charlatans and naifs, will be that Common Core is to blame. I wouldn’t be surprised if these results led to renewed repeal efforts for both the standards and the assessments in a number of states, even if there is, as yet, no evidence that these policies are harmful.

Overall, it’s a sad turn of events. And what makes it all the more sad is the knowledge that the results will be used in all kinds of perverse ways to score cheap political points and make policy decisions that may or may not help kids. We can’t do anything about the scores at this point. But we can do something about the misuse of the results. So, let’s.

[1] For instance, here are a few summaries I’ve written on testing and accountability, and here is a nice review chapter. These all conclude, rightly, that accountability has produced meaningful positive effects on student outcomes.

Do the content and quality of state tests matter?

Over at Ahead of the Heard, Chad Aldeman has written about the recent Mathematica study, which found that PARCC and MCAS were equally predictive of early college success. He essentially argues that if all tests are equally predictive, states should just choose the cheapest bargain-basement test, content and quality be damned. He offers a list of reasons, which you’re welcome to read.

As you’d guess, I disagree with this argument. I’ll offer a list of reasons of my own here.

  1. The most obvious point is that we have reasonable evidence that testing drives instructional responses to standards. Thus, if the tests used to measure and hold folks/schools accountable are lousy and contain poor-quality tasks, we’ll get poor-quality instruction as well. This is why many folks are thinking these days that better tests should include tasks that are much closer to the kinds of things we want kids to actually be doing; in that case, “teaching to the test” becomes “good teaching.” That may be a pipe dream, but it’s something I commonly hear.
  2. A second fairly obvious point is that switching to a completely unaligned test would end any possible notion that the tests could provide feedback to teachers about what they should be doing differently/better. Certainly we can all argue that current test results are provided too late to be useful–though smart testing vendors ought to be working on this issue as hard as possible–but if the test is in no way related to what teachers are supposed to be teaching, it’s definitely useless to them as a formative measure.
  3. Chad’s analysis seems to prioritize predictive validity–how well do results from the test predict other desired outcomes–over all the other types of validity evidence. It’s not clear to me why we should prefer predictive validity (especially when we already have evidence that GPAs do better at that than most standardized tests, though SAT/ACT adds a little) over, say, content-related validity. Don’t we first and foremost want the test to be a good measure of what students were supposed to have learned in the grade? More generally, I think it makes more sense to have different tests for different purposes, rather than piling all the purposes into a single test.
  4. Certainly if the tests are going to have stakes attached to them, the courts require a certain level of content validity (or what they’ve called instructional validity). See Debra P. v. Turlington. If a kid’s going to be held accountable, they need to have had the opportunity to learn what was on the test. If the test is the SAT, that’s probably not going to happen.

Anyway, take a look at the Mathematica report (you should anyway!) and Chad’s post and let me know what you think.