We write to you as researchers who study the design of school accountability systems and the construction of growth models. We read with interest the memorandum dated June 20, 2018, from Tom Torlakson and the CDE to you and the State Board. While we appreciate the analyses and effort underlying this memo, we have serious concerns about its claims and implications. Specifically, we believe the memo does not offer an accurate perspective on the strengths and limitations of various approaches to measuring schools’ contributions to student achievement. We are concerned that, if the State Board relies on this memo and keeps the current approach to measuring “change,” it will produce and disseminate an inadequate measure that gives California school leaders and educational stakeholders incorrect information about school effectiveness. In this brief response, we outline what we view as the shortcomings of the memo and make specific recommendations for an alternative approach to measuring school effects on student achievement.
We read the memo as making three main arguments against a Residual Gain (RG) model:
- The memo’s authors are concerned that the proposed RG model does not indicate how much improvement is needed to bring the average student up to grade-level standards, or whether achievement gaps are closing.
- The memo’s authors are concerned that the RG model allows schools to register positive “Change” from the prior year to the current year even while making negative growth. The memo’s authors say this is counterintuitive and will confuse educators.
- The memo’s authors say that an RG model is volatile and therefore should not be used to make decisions, since decisions made in one year might be contradicted by the next year’s growth data.
Here, we respond to each of these concerns.
First, it is true that a residual gain model does not indicate how much improvement is needed to bring the average student up to standards. Models that attempt to indicate this are called “growth-to-proficiency” models. While perhaps appealing at first glance, these models do not measure school effectiveness, because they conflate the socioeconomic conditions of an area with “school effectiveness.” This is readily apparent when one considers how they work: students in schools in high-poverty areas are much more likely to be below proficient, and thus required to make larger gains, than their peers who attend schools in low-poverty areas. Thus, growth-to-proficiency actually conveys very similar information to the state’s status measure (distance from level 3), which is not desirable for measuring the effectiveness of schools (Barnum, 2017). Building a growth measure around this metric would be largely redundant with the status measure and contrary to the goal of measuring effectiveness in the first place.
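To make this conflation concrete, consider a stylized version of how such proficiency targets are often set, in which the gap to the cutoff is divided by the years remaining. The cutoff, horizon, and scores below are entirely hypothetical and chosen only for illustration.

```python
# Hypothetical growth-to-proficiency arithmetic: the required annual gain
# is driven by where the student starts, not by anything the school does.
proficiency_cutoff = 300.0   # made-up scale-score cutoff
years_remaining = 3          # made-up time horizon

for current_score in (290.0, 240.0):  # e.g., students in low- vs. high-poverty schools
    required_annual_gain = (proficiency_cutoff - current_score) / years_remaining
    print(current_score, round(required_annual_gain, 1))  # ~3.3 vs. 20.0 points per year
```

The school serving the lower-scoring student is assigned a target six times larger before any instruction has occurred, which is why such targets track poverty rather than effectiveness.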
Rather, what the state should aim for is a growth measure that comes as close as possible to capturing the true causal effect of schools on student achievement. For this goal, the most appropriate measures compare socioeconomically and demographically similar schools, and identify which schools produce students whose test scores improve the most (Barlevy and Neal, 2012; Ehlert et al., 2014, 2016).
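For readers less familiar with the mechanics, the following is a minimal sketch of the residual gain idea on simulated data; the variable names and data-generating numbers are ours, and an operational model would be considerably more elaborate.

```python
import numpy as np

# Minimal residual-gain sketch: regress current-year scores on prior-year
# scores across all students, then average each school's residuals. Schools
# whose students outperform the prior-score expectation get positive values.
rng = np.random.default_rng(0)
n = 1000
school = rng.integers(0, 20, size=n)                # simulated school IDs
prior = rng.normal(0, 1, size=n)                    # prior-year scores
current = 0.8 * prior + rng.normal(0, 0.6, size=n)  # current-year scores

X = np.column_stack([np.ones(n), prior])            # intercept + prior score
beta, *_ = np.linalg.lstsq(X, current, rcond=None)
residuals = current - X @ beta

# A school's residual gain is the mean residual of its students.
residual_gain = {s: residuals[school == s].mean() for s in np.unique(school)}
```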
Second, it is true that a residual gain model could provide different information from a “change” model that simply subtracts last year’s average score from this year’s. That is a feature of the system, not a bug. Among other reasons for this discrepancy, a “change” model does not adjust for changes in school composition. This is a problem with the “change” model and highlights an advantage of the residual gain model. There are many clear examples of how student growth models can be explained to educators and the general public, such as Castellano and Ho (2013). The state should follow these examples.
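A stylized example with made-up numbers illustrates the composition problem:

```python
# Hypothetical illustration: a grade's average can fall even when every
# student grows, simply because the incoming cohort started lower. A
# "change" model reads this as decline; a residual gain model, which
# conditions on each student's prior score, does not.
last_year_avg = 250.0        # last year's cohort, current-year test
incoming_prior_avg = 230.0   # this year's cohort, their prior-year scores
uniform_growth = 15.0        # suppose every student gains 15 points

this_year_avg = incoming_prior_avg + uniform_growth  # 245.0
change = this_year_avg - last_year_avg               # -5.0
print(change)  # negative "change" despite uniformly positive student growth
```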
Third, it is true that residual gain and other growth models can fluctuate somewhat from year to year. However, the year-to-year correlations of such models are positive and of a reasonable magnitude[1], indicating that they provide consistent information. As long as high-stakes decisions are not made on a single year’s scores, some degree of fluctuation is acceptable. Moreover, there are simple ways to reduce year-to-year score fluctuations, such as relying on moving averages, adjusting statistical significance bands, or using Bayesian inference; any number of scholars could assist the state in developing these.
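As one illustration of the first of these options, a moving average over a school’s annual estimates is straightforward to compute; the estimates below are invented for the example.

```python
import numpy as np

# Sketch of a three-year moving average over a school's annual growth
# estimates (hypothetical values). Averaging adjacent years damps
# year-to-year fluctuation at the cost of some responsiveness.
annual_estimates = np.array([0.12, -0.03, 0.08, 0.10, 0.05])
window = 3
smoothed = np.convolve(annual_estimates, np.ones(window) / window, mode="valid")
print(smoothed)  # one smoothed value per consecutive three-year span
```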
Based on our understanding of the research literature and of the goals of California’s system, we recommend that the state adopt a growth model that disentangles the student composition of a school from that school’s measured efficacy. There are many ways to do this—Ehlert et al. (2014, 2016) provide an overview of the main ideas. Models that properly account for student circumstances offer the best combination of validity (i.e., the output is more likely to reflect schools’ causal effects on student achievement) and interpretability (i.e., the output can be described in ways that educators can understand). Such models have been used in other states and school districts, including California’s CORE districts[2] and the states of Arkansas, Missouri, Colorado, and New York.[3]
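As a sketch of what accounting for student circumstances can look like in practice, the residual gain regression illustrated earlier can be extended with demographic controls; the data are again simulated, the free/reduced-price lunch indicator is our own illustrative choice, and Ehlert et al. describe more careful two-step variants.

```python
import numpy as np

# Hypothetical compositionally adjusted sketch: the same residual-gain
# logic as before, with a simulated free/reduced-price lunch indicator
# added to the regression so that schools are effectively compared with
# demographically similar schools.
rng = np.random.default_rng(1)
n = 1000
school = rng.integers(0, 20, size=n)
prior = rng.normal(0, 1, size=n)
frl = rng.integers(0, 2, size=n)                    # simulated FRL indicator
current = 0.8 * prior - 0.2 * frl + rng.normal(0, 0.6, size=n)

X = np.column_stack([np.ones(n), prior, frl])
beta, *_ = np.linalg.lstsq(X, current, rcond=None)
residuals = current - X @ beta
adjusted_gain = {s: residuals[school == s].mean() for s in np.unique(school)}
```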
The state’s current “change” model is unacceptable: it profoundly fails the validity test, and therefore it does not accurately represent schools’ contributions to student achievement. Indeed, it is not clear what it represents at all.
Should you have questions about our recommendations, we would be happy to discuss them.
Sincerely,
Morgan Polikoff, Associate Professor of Education, USC Rossier School of Education
Cory Koedel, Associate Professor of Economics and Public Policy, University of Missouri-Columbia
Andrew Ho, Professor of Education, Harvard Graduate School of Education
Douglas Harris, Professor of Economics, Tulane University
Dan Goldhaber, Professor, University of Washington
Thomas Kane, Walter H. Gale Professor of Education, Harvard Graduate School of Education
David Blazar, Assistant Professor, University of Maryland College Park
Eric Parsons, Assistant Research Professor of Economics, University of Missouri-Columbia
Martin R. West, Professor of Education, Harvard Graduate School of Education
Chad Aldeman, Principal, Bellwether Education
Richard C. Seder, Specialist, University of Hawaiʻi at Mānoa, Adjunct Assistant Professor, University of Southern California
Cara Jackson, Adjunct Faculty, American University
Aaron Tang, Acting Professor of Law, UC-Davis School of Law
David Rochman, Assessment and Evaluation Specialist, Orange County
Aime Black, Education consultant
Anne Hyslop, Education consultant
[1] Koedel et al. (2015): https://faculty.smu.edu/millimet/classes/eco7321/papers/koedel%20et%20al%202015.pdf
[2] The CORE district growth model conditions on student demographics, which we recommend for purposes of fairness and validity; however, a similar model that only conditions on prior achievement would be nearly as good and would be a dramatic improvement over what the state is currently using.
[3] More information about the Arkansas, Missouri, New York, and Colorado models can be found here: http://www.arkansased.gov/public/userfiles/ESEA/Documents_to_Share/School%20Growth%20Explanation%20for%20ES%20and%20DC%20111017.pdf;