
A few posts ago I wrote about the challenges in getting a study of school textbook adoptions off the ground. Suffice it to say there are many. This post continues in that thread (half of it is just me complaining, the other half is pointing out the absurdity of the whole thing, so if you feel like you’ve already got the idea, by all means skip this one).

California is one state that makes textbook adoption data public, bucking the national trend. This is a result of the Eliezer Williams, et al., vs. State of California, et al. case, where a group of San Francisco students sued the state, convincing the court that state agencies had failed to provide public school students with equal access to instructional materials, safe and decent school facilities, and qualified teachers. As a consequence of this court case, California schools are required to provide information on these issues in an annual School Accountability Report Card (SARC). The SARCs are publicly available and can be downloaded here.

This is great! Statewide, school-level textbook data right there for the taking.

Well, not so fast. For starters, only about 15% of schools use a standard form; the rest turn in their SARCs as PDFs. And the state doesn’t keep a database on that other 85% (obviously). So the only way to get the information off the SARCs is to do it manually, by copying from PDFs [1]. Thus, over the course of the last year, with the support of an anonymous funder, we’ve been pulling these data together.

As it turns out, even when you can pull the data from the SARCs, there are at least three major problems:

- Large numbers of schools simply don’t have a SARC in a given year. Apparently there must be some kind of exemption, because otherwise this would seem to violate the court ruling.
- Among schools that do have a SARC, textbook data are sometimes missing completely, or are missing key elements such as the grades in which the books are used and the years of adoption.
- Also among schools that do have a SARC, listed textbook titles are often so vague (e.g., “Algebra 1”, when the state adopts multiple books with that title) or unclear (e.g., “McGraw Hill”, when that company publishes numerous textbooks) as to be somewhat useless.

As a consequence, as in the NSF study, we’ll be reaching out by email or phone to all districts with incomplete data to fill in the gaps.

Of course the data will never be perfect, and they’re still better than what is available anywhere else. But if the purpose of the court ruling is to provide some measure of public accountability through the clear reporting of this kind of information, it’s not clear to me that the SARCs are currently fulfilling that role. Perhaps the state doesn’t care to enforce the ruling, or doesn’t have the manpower to do so. That’s unfortunate, not because it makes this research more challenging, but because it deprives disadvantaged students of the remedy that the court has decided they are due.

[1] For reference, there are around 1000 districts and 10000 schools in California.

A couple of weeks ago, before yours truly joined the blogosphere, the results for NAEP history, geography, and civics were released. Journalists and advocates around the nation reacted with their usual swift condemnation, noting the “flatlining”, “stagnant” performance. And it’s true, overall average scores on the newly released tests had not changed since their previous administration.

A few wise individuals, however, noticed that the scores had continued to increase when broken down by subgroup. Chad Aldeman penned the best defense, invoking Simpson’s Paradox to conclude that achievement is rising, and not by a trivial amount. In this case, Simpson’s Paradox means that the gains by individual subgroups (every subgroup is gaining in these subjects, and the largest gains are going to the historically most underserved groups) are masked when calculating overall averages because the typically lower-performing subgroups are increasing in numbers.

Jay Greene shot back in the comments section of Chad’s piece, arguing that Simpson’s Paradox was not an appropriate excuse here, because minority students are less difficult to educate now than minority students were 30 or 40 years ago, so making comparisons within groups is not necessarily appropriate.

I will actually take a middle ground here and say there is an element of truth to both arguments. This is because, in evaluating whether it’s better to focus on individual subgroups or the overall average in a case of Simpson’s Paradox, I find it useful to consider what the question of interest is.

As an example, consider the case of two airlines (American and United) operating at two airports (O’Hare and LAX). United flies 100 flights out of each airport, with a 55% on-time rating from O’Hare and an 85% rating from LAX (thus, 70% overall). American flies 200 flights out of O’Hare with a 60% on-time rating and 50 flights out of LAX with a 90% rating (thus, 66% overall). Now, if you were buying a ticket based on the aggregate statistics, you would choose United, because it has a higher overall on-time rate. But the overall average in this case is completely useless; it only applies to you if you pick your flights (including your departing airports) completely at random. If, instead, you pick your flights like a normal person by first choosing a departing airport and then choosing an airline, you are always better off choosing American. So in this case, the “subgroup” question is by far the more interesting one, and the “average” question is misleading and worthless.
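For anyone who wants to check the arithmetic in the airline example, here is a minimal Python sketch. The flight counts and on-time rates are the ones given above; the variable and function names are my own and purely illustrative.

```python
# Figures from the airline example: airport -> (number of flights, on-time rate).
flights = {
    "United": {"O'Hare": (100, 0.55), "LAX": (100, 0.85)},
    "American": {"O'Hare": (200, 0.60), "LAX": (50, 0.90)},
}


def overall_rate(by_airport):
    """Flight-weighted on-time rate across all of an airline's airports."""
    total_flights = sum(n for n, _ in by_airport.values())
    on_time_flights = sum(n * rate for n, rate in by_airport.values())
    return on_time_flights / total_flights


def better_at_every_airport(a, b):
    """True if airline a has a higher on-time rate than airline b at each airport."""
    return all(flights[a][airport][1] > flights[b][airport][1] for airport in flights[a])


for airline, by_airport in flights.items():
    print(f"{airline}: {overall_rate(by_airport):.0%} on time overall")

print("American better at every airport:", better_at_every_airport("American", "United"))
```

The overall rates come out to 70% for United and 66% for American, even though American is ahead at both airports. The reversal arises because American flies most of its flights out of O’Hare, where both airlines do worse.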

To me, the primary question of interest with respect to NAEP is whether a given kid is likely to be better off now than he or she would have been 20 years ago. This is a subgroup question: we want to compare each kid to the same kid born 20 years earlier. Here, the answer is very clearly yes (with the possible exception of kids in extreme poverty). For all subgroups, NAEP achievement in all subjects continues to increase, as do high school graduation rates.

However, I can see the argument that the main question of interest is how the nation as a whole is doing, in which case it’s not overly relevant if the subgroups are making gains but the national average is not. The argument here basically says “the population is what it is, and we have to deal with that.”

Regardless of one’s view on Simpson’s Paradox in this particular case, I actually remain stunned and impressed by our students’ performance in subjects like geography and civics. Given that these subjects are not tested under NCLB (and thus have certainly seen reduced emphasis in classrooms), I find it remarkable that performance has not only held steady but has actually continued to tick up for all kinds of kids. This story, the nuanced version that includes attention to subgroups, is one that certainly needs to be told more often.

On Education Research

Gathering textbook adoption data part 2 (or: even when it should be easy, it ain’t)

A note on Simpson’s Paradox and NAEP