What the marriage equality ruling REALLY means for education

Rick Hess is out with an analysis of the implications of the Supreme Court’s ruling in Obergefell v. Hodges, last week’s landmark ruling that legalized same-sex marriage nationwide. While I think Rick is generally thoughtful, and he tells me that he is personally not opposed to marriage equality, this is among the more hysterical (in all senses of that word) posts I’ve read with respect to any education issue by someone as prominent as Rick. I hate to fall back on the same old technique of parsing each line of other people’s writing, but this piece simply demands that treatment. Suffice it to say that I think his piece is stunningly paranoid, and while I suspect a few of his post-apocalyptic fantasies about schools post-#LoveWins may come to fruition, most will not (and the ones that will are things that absolutely should happen and will benefit children).

He starts:

Like fascists, Communists, and boy-band producers, the American Left has always believed it could fine-tune human nature if it could only “get ’em while they’re young.” That’s why the Left works so hard to impose its will on schools and universities.

I mean, I don’t know where to begin. Yes, of course liberal people try to persuade young folks that our positions are better than conservative positions (as, in fact, they are). Conservatives do this too. What’s your point? And on the issue of marriage equality, the conservative position has lost very, very badly, and that’s with virtually no school-based indoctrination that I can think of (if anyone’s to blame for this extremely positive outcome, it’s probably the media).

As John Dewey, America’s high priest of educational progressivism, explained in 1897, the student must “emerge from his original narrowness” in order “to conceive of himself” as a cog in the larger social order.

I don’t know what this means. But it sounds spooky.

Last week’s gay-marriage ruling will yield a new wave of liberal efforts to ensure that schools do their part to combat wrong-headed “narrowness.” Justice Anthony Kennedy’s sweeping 5–4 decision in Obergefell v. Hodges opened by declaring, “The Constitution promises liberty to all within its reach, a liberty that includes certain specific rights that allow persons, within a lawful realm, to define and express their identity.” Kennedy took pains to opine that marriage “draws meaning from related rights of childrearing, procreation, and education.” In finding that the Fourteenth Amendment secures the right to “define and express [one’s] identity,” the Obergefell majority has issued a radical marker. (If gay marriage had been established by democratic process, things might have played out in a more measured manner.)

This “democratic process” thing is a canard, plain and simple. First off, public support was already on our side. Second off, that’s not how you do equal rights. You don’t wait around and let people vote to see if a minority gets a fair shake. Couples in 13 states were waiting. Suppose one of them dropped dead while we were waiting around for the majority to give them their rights, and as a result they were denied spousal estate benefits. I guess the answer from the Alito camp is “fuck ’em,” but I think most people would say that it’s abhorrent to sit on our hands and wait until the majority decides it’s finally time to grant people their constitutional rights.

Justice Samuel Alito predicted, “Today’s decision . . . will be used to vilify Americans who are unwilling to assent to the new orthodoxy,” and “they will risk being labeled as bigots and treated as such by governments, employers, and schools.” Alito is almost assuredly right, and that poses serious questions for schools and colleges.

Alito is indeed right. I’ve been calling opponents of same-sex marriage bigots for a long time, because that’s a bigoted view. Though I don’t generally call them that to their faces, because I find that’s not a strong debate tactic. We all have the right to express ourselves, but we don’t have the right to be absolved of the consequences of that expression. If I said I thought interracial marriage shouldn’t be allowed, I would rightly be called a bigot. So, yes, people and businesses and states that do things that are bigoted will probably see negative responses.

At the collegiate level, the implications are pretty clear — especially for religious institutions. Christian colleges are going to find their nonprofit tax status under assault unless they agree to embrace gay marriage. (The relevant precedent is the 1983 Supreme Court ruling that enabled the IRS to strip Bob Jones University’s tax-exempt status because of the school’s ban on interracial dating.)

Well, yes, institutions that take federal funds and use them to violate the Constitution shouldn’t get those funds anymore. I doubt there will be a huge rush to hold gay weddings in Bob Jones’ chapel, but I certainly could be wrong. If there is, and if a religious institution decides it can’t avoid violating my constitutional rights, then that institution should be prosecuted. That’s how this works (though I’m no Constitutional scholar).

Policies regarding “family housing,” employee benefits, use of chapels for marriages — all will come under fire. And then we’ll start getting to questions of readings, campus programs, and curriculum, where familiar First Amendment rights will clash with the new Fourteenth Amendment right to “define and express [one’s] identity.” For religious colleges stripped of their nonprofit status, many — if not most — will be compelled to close their doors. (It’s safe to say that plenty of progressives would regard this development as a bonus).

I am agnostic on whether any particular college stays open or is closed. I don’t think it’s likely that the federal government will suddenly become closely involved in colleges’ readings or curricula–is that something that happens now?

More broadly, the Chronicle of Higher Education reports that gay-rights advocates believe the decision will “help them move on to other issues, such as access to higher education and mental-health concerns for young LGBTQ students of color and transgender students of color.” Shane Windmeyer, executive director of Campus Pride, said, “I’m hopeful we can now say we won one game; now the next game is looking at trans rights, how we treat queer people of color, especially first-generation LGBTQ students of color.”

Mental health care for students?!? The horror!

LBGT crusaders are also pushing for big changes in K–12 public schooling. Education Week’s legal-affairs reporter noted that the decisions “holds various implications for the nation’s schools, including in the areas of employee benefits, parental rights of access, and the effect on school atmosphere for gay youths.” I can’t say with certainty what’s coming. But here are four things to watch for. Educators have long celebrated “diversity.” Now they can expect heightened pressure to do more, and to ensure that nothing stymies a student’s “identity.” When a tiny handful of social crusaders complain that this play feels too stereotypically masculine or that those stories don’t include enough LGBT students, they’re going to pull Obergefell out of their pocket. Things will prove particularly contentious in history, where a dearth of gay marriages and nontraditional families will invite creative efforts to “balance” things out.

This is some really bogeyman stuff. I guess it’s bad if we have a curriculum that represents the diversity of our students? I fail to see how a ruling that my love is the same as Rick’s will have ripple effects in terms of causing schools to make dramatic curriculum changes in favor of more gay inclusion. That trend is probably already happening in liberal places. But again, even if that did happen, it’s almost certainly a good thing, especially for gay kids who are more likely to be bullied and commit suicide. Still, I’d bet that in the vast majority of non-ultra-liberal places, nothing like this will come to pass any time soon.

School leaders have judged that American flag T-shirts are unacceptably provocative when worn on Cinco de Mayo. Clothing and artifacts perceived as hostile to another’s “defined and expressed” identity, such as badges of religiosity, may well come under the closest of scrutiny. After all, the Court has long held that freedom of speech and religion may be circumscribed in educational settings. Now, protestations on behalf of free expression and free speech can be answered with Fourteenth Amendment claims.

If he’s talking about a student wearing something that displays a cross, I can’t see that coming under any more fire than it would have pre-Obergefell. If he’s talking about a shirt that says “homosexuality is not okay,” then yes, that shouldn’t be worn in a school (just as we would not allow a student to wear a shirt that says “women belong in the kitchen,” “men are rapists” or “white people are racists”). Getting those things out of schools will absolutely make schools a better place for children.

Expect demands for schools to amp up their efforts to feature “nontraditional” families in all kinds of contexts. Schools may be scrutinized for the mixture of families that wind up in posters, brochures, student art displays, instructional materials, and the rest. Failure to include a satisfactory percentage of gay parents (or other nontraditional family groupings) may be judged evidence of a hostile environment.

The first sentence is probably true, though a continuation of existing trends (have you seen advertising recently? Big corporations were ALL OVER this ruling, very clearly showing the business community thinks this decision was the right one [perhaps for their bottom line, but whatever]). The rest of this is absurd. Companies naturally want to include images of diversity on their products because, you know, we’re a really diverse country (and it will probably also result in better sales). There will not be a gay family gestapo that goes around counting posters with straight vs. gay couples in them.

And casual language will have to change. Teachers may instinctively ask a volunteer father about his wife or mention mothers and fathers; when they do, it won’t be long until a sensitive parent decides that this kind of “heteronormativity” is an unconstitutional violation of their identity. Pity the poor assistant principal who knows two parents are attending a meeting and mistakenly asks the woman sitting in the office if her “husband” is running late — rather than asking about her “spouse.” In the wrong circumstances, that could be a career-ender. Minimizing such mistakes means schools will soon be at pains to replace the terminology of “moms and dads” with that of genderless dyads.

Yes, language will slowly change, as people stop assuming things about other people. That’s a good thing, obviously. Speaking as someone who gets asked about my wife all the time (no, really, and I think I’m about a 12 on the Kinsey scale), I can tell you it doesn’t bother me in the slightest when it happens. I simply respond with “My husband does XYZ,” and the person realizes that I’m, in fact, married to a man. And it’s no big deal, because I’m an adult and people make assumptions. No one’s going to get fired because they accidentally use the H word rather than the W word. Now if they respond with “Oh, you’re married to a man? That’s disgusting, you sinful pervert,” and if I have authority over their job, then yes, it might be a problem for them. But otherwise, come on. This is again totally paranoid and simply will not happen in any reasonable number of cases (and certainly not more than the number of gay people who are discriminated against in employment every day in this country, because that’s perfectly legal in the majority of states).

America’s principals, superintendents, and school boards generally don’t have a lot of stomach for waging these fights. Even those who hate being bullied don’t want the exhausting slog or public criticism. Far more likely is that they’ll pack it in, lending Justice Kennedy’s rhetorical flourishes a practical import even he may not have imagined.

Translation: “Bigoted people will realize that being bigoted and suffering the consequences probably isn’t worth it, so they’ll be less bigoted or just internalize their bigotry.” Another positive outcome! And, actually, I would be almost certain that support for this decision is higher among educators than the general public, as I think the vast majority of educators do not hold bigoted views.

The long and the short of it is that there’s really no “there” there with any of this stuff. Most of it simply will not happen, and the stuff that will happen will make our schools better for kids. And more to the point, what the consequences are for schools is mostly irrelevant to the merits of the case. And on that, we are moving toward consensus–that my marriage is the same as Rick’s. Hopefully the readers at NRO will soon join the 60% of us who already know that to be true.

It Is Accomplished

On a beautiful day like today, I leave it to Andrew Sullivan to summarize what so many of us are feeling. Love is love.

The Dish


As Gandhi never quite said,

First they ignore you. Then they laugh at you. Then they attack you. Then you win.

I remember one of the first TV debates I had on the then-strange question of civil marriage for gay couples. It was Crossfire, as I recall, and Gary Bauer’s response to my rather earnest argument after my TNR cover-story on the matter was laughter. “This is the loopiest idea ever to come down the pike,” he joked. “Why are we even discussing it?”

Those were isolating days. A young fellow named Evan Wolfson who had written a dissertation on the subject in 1983 got in touch, and the world immediately felt less lonely. Then a breakthrough in Hawaii, where the state supreme court ruled for marriage equality on gender equality grounds. No gay group had agreed to support the case, which was regarded at best as hopeless and at…


The story of Speedometry

As I mentioned in a previous post, one of my current projects is Speedometry–a collaboration between USC Rossier researchers and the Mattel Children’s Foundation to develop units that use Hot Wheels to teach kids science and math content. It’s been a fun, wildly different project, and I’ve learned a ton.

Over here, USC Rossier’s online teaching degree curated a blog series on the Speedometry program. It’s a four-part post, and my part describes all the products we’ve created as part of the work. If you’re interested in these kinds of partnerships, I think it will be an interesting series of posts.

Testing tradeoffs

Life is a series of tradeoffs. Perhaps nowhere in education is that clearer than in assessment policy.

What brings this to mind are Motoko Rich’s and Catherine Gewertz’s recent articles about scoring Common Core tests. I think both of these articles are good, and they both illustrate some of the challenges of doing what we’re trying to do at scale. But it’s also clear that some anti-test folks are using these very complicated issues as fodder for their agendas, and that’s disappointing (if totally expected). Here are some of the key quotes from Motoko’s article, and the tradeoffs they illustrate.

On Friday, in an unobtrusive office park northeast of downtown here, about 100 temporary employees of the testing giant Pearson worked in diligent silence scoring thousands of short essays written by third- and fifth-grade students from across the country. There was a onetime wedding planner, a retired medical technologist and a former Pearson saleswoman with a master’s degree in marital counseling. To get the job, like other scorers nationwide, they needed a four-year college degree with relevant coursework, but no teaching experience. They earned $12 to $14 an hour, with the possibility of small bonuses if they hit daily quality and volume targets.

Tradeoff: We think we want teachers to be involved in the scoring of these tests (presumably because we believe there is some special expertise that teachers possess) [1]. But teachers cost more than $12 to $14 an hour, and we’re in an era where every dollar spent on testing is endlessly scrutinized, so we have to instead use some educated people who are not teachers.

At times, the scoring process can evoke the way a restaurant chain monitors the work of its employees and the quality of its products. “From the standpoint of comparing us to a Starbucks or McDonald’s, where you go into those places you know exactly what you’re going to get,” said Bob Sanders, vice president of content and scoring management at Pearson North America, when asked whether such an analogy was apt.

Tradeoff: We have a huge system in this country, and we want results that are comparable across schools. But comparability in a large system requires some degree of standardization, and standardization at that level of scale requires processes that look, well, standardized and corporate.

For exams like the Advanced Placement tests given by the College Board, scorers must be current college professors or high school teachers who have at least three years of experience teaching the subject they are scoring.

Tradeoff: We want to test everyone. This means the scoring volume is tremendously larger than for the AP exams (about 12 million test takers vs. about 1 million), which again means we may not be able to find enough teachers to do the work.

“You’re asking people still, even with the best of rubrics and evidence and training, to make judgments about complex forms of cognition,” Mr. Pellegrino said. “The more we go towards the kinds of interesting thinking and problems and situations that tend to be more about open-ended answers, the harder it is to get objective agreement in scoring.”

Tradeoff: We want more challenging, open-ended, complex tasks. But scoring those tasks at scale is harder to do reliably.

There are of course other big tradeoffs that aren’t highlighted in these articles. For instance:

  • The tradeoff between test cost and transparency–building items is very expensive, so releasing items and having to create new ones every year would add to test costs while enhancing transparency.
  • The tradeoff between testing time and the nature of the task–multiple choice items are quicker to complete, but they may not fully tap the kinds of skills we want to measure.
  • The tradeoff between testing time and the comprehensiveness of the assessment–shorter tests can probably give us a reasonable estimate of overall math and reading proficiency, but they will not give us the fine-grained, actionable data we might want to make instructional responses (and they might contribute to “narrowing the curriculum” if they repeatedly sample the same content).
  • The tradeoffs of open-response items with fast scoring–multiple choice items, especially on computers, can be scored virtually instantaneously, whereas open-response items take time to score. So faster feedback may butt up against our desire for better items.
  • The tradeoffs associated with testing on computers–e.g., using money to purchase computers vs. other things, advantages of adaptive testing vs. needing to teach kids how to take tests on computers.

I will also note that this kind of reporting could, in my mind, be strengthened with more empirical evidence. For instance,

“Even as teachers, we’re still learning what the Common Core state standards are asking,” Ms. Siemens said. “So to take somebody who is not in the field and ask them to assess student progress or success seems a little iffy.”

Are teachers better scorers than non-teachers, or not? That’s an empirical question. I would be reasonably confident that Pearson has in place a good process for determining who are the best scorers from the standpoint of reliability. Some of the best scorers are teachers, and some are not.

Some teachers question whether scorers can grade fairly without knowing whether a student has struggled with learning difficulties or speaks English as a second language.

Is there evidence that the test scoring is biased against students with disabilities or ELLs, or not? That’s also an empirical question. Again I would guess that Pearson has in place a process to weed out construct-irrelevant variance to the maximum extent possible.

Overall, I think it’s great that writers like Motoko and Catherine are tackling these challenging issues. But I hope it’s not lost on readers that, like everything in life, testing requires tradeoffs that are not easily navigated.

[1] It’s not obvious to me this is true, though it may well be. Regardless, it would likely be a good professional development opportunity to score items.

On New Orleans and media criticism

Over at This Week in Education, John Thompson pens a post about his perceptions of bias in Washington Monthly’s recent report about New Orleans schools. As there’s a big conference going on in NOLA right now on this very topic (which I sadly could not attend, though perhaps not so sadly given it’s June in New Orleans; you should watch the live streams tomorrow–link above), I figured I’d give this one a thorough read and respond in a measured way. So that’s what I’m doing here. Again, I don’t think of myself as much of a choice fan–it’s far from my main interest in either policy or research–but the logical and rhetorical problems with this kind of writing are, to my eye, so manifest that they really need to be addressed.

John starts by praising Caitlin Emma’s reporting on New Orleans, quoting her saying there’s no proof the New Orleans model works [1]. He contrasts Emma’s reporting with Osborne’s report, which concludes “the Crescent City’s schools have produced what some experts believe to be the most rapid academic improvement in American history,” and which he goes on to trash in extended detail.

His critique begins: “Osborne starts with the dubious claim by the pro-charter CREDO that charters receive less per student funding…”

I’ll stop you right there and remind you that the first CREDO report found that charter schools underperformed traditional publics. Only the recent reports have found charter improvement. So unless by “pro-charter” he means “uses advanced statistical methods and concludes that charters marginally outperform traditional public schools in recent reports but not in earlier reports,” this characterization of CREDO is absurd [2].

Next, Thompson says:

He cites the objective researcher, Douglas Harris, who says that NOLA undertook “the most radical overhaul of any type in any school district in at least a century.” But, Osborne cites no evidence by Harris or anyone else that the New Orleans radicalism can work in a sustainable manner or that it could be scaled up. Instead, he devotes almost all of his article to praising true believers in unproven theories on school improvement.

I guess this is dancing around what the article actually said. What the article said was things like the following:

Before Katrina, most public schools were terrible. In 2005 the city ranked sixty-seventh out of sixty-eight districts in Louisiana, itself a low performer compared to other states. Last year, New Orleans was forty-first out of sixty-nine school districts in Louisiana.

Before Katrina, some 62 percent of students attended schools rated “failing” by the state. Though the standard for failure has been raised, only 7 percent of students attend “failing” schools today.

Before Katrina, only 35 percent of students scored at grade level or above on state standardized tests. Last year 62 percent did.

Before Katrina, almost half of New Orleans students dropped out, and less than one in five went on to college. Last year, 73 percent graduated from high school in four years, two points below the state average, and 59 percent of graduates entered college, equaling the state average.

And according to a 2015 CREDO study, between 2006 and 2012 New Orleans’s charter students gained nearly half a year of additional learning in math and a third of a year in reading, every year, compared to similar students in the city’s non-chartered public schools.

Because the OPSB was only allowed to keep schools that scored above the state average, the failing schools were all in the RSD. In the spring of 2007, the first full school year after Katrina, only 23 percent of RSD students tested at or above grade level. Seven years later, fully 57 percent did. As Figure 1 shows, RSD students in New Orleans have improved almost four times faster than the state average.

Now, you might argue with those statistics–that they’re based on creaming, or that the poorest of the poor have been driven out of NOLA, or some other critique (though my read of the evidence on this is pretty clear). But they’re not no evidence. They’re actually quite a bit of evidence. And other work by folks like Josh Angrist (arguably one of the strongest methodologists around) finds big effects of takeovers in New Orleans, too. That looks relatively “sustainable” to me, a decade after Katrina. Perhaps it wouldn’t work elsewhere, but it’s not nothing.

Or maybe, John says, it is nothing! Maybe the test score gains identified by CREDO and mentioned in Osborne’s study are meaningless.

Improved test scores in such schools might or might not be meaningful. In a situation like that, is there any reason to believe that increased test scores mean that more learning occurs when all stops are pulled from test prep in a C school, as opposed to a D or F school? Rounds of such remediation are bound to improve metrics important to adults, but do they help or hurt the children who endure them?

Rhetorical questions are often a sign that there’s not a good argument being made, and this is no exception. Because, again, if the charter schools had underperformed the district schools on the tests, John would be using it as fodder (I’ve seen enough of these pieces to know). And at least the serious ones among us know that, while test scores do not measure everything, they also do not measure nothing. And even if they did measure nothing, the article says graduation rates are way up, too (and I’ve heard attendance as well)!

Thompson says that he, for one, is withholding judgment on NOLA’s reforms until Doug Harris comes out with ERA’s report on the city. I, too, think Doug is an excellent researcher who does not have an agenda other than getting the facts right. If the facts come back that charters are outperforming traditional public schools in New Orleans, you can bet your bottom dollar there won’t be a followup post about how the reforms were right all along. [EDITED TO ADD: Doug gave his keynote presentation at the ERA conference today. The conclusion was positive impacts on test scores of .2 to .4 standard deviations.]

I’ll conclude by quoting my two favorite paragraphs from the piece. I’m not going to tear them apart, because they do that to themselves. But I will note something that has bothered me for quite a while. John’s blog post is on This Week in Education, a site run by Alexander Russo. Russo also runs The Grade, where he critiques education media (he also does this on Twitter, and I think it’s quite valuable in both contexts). I have never understood, and I never will, how these two things can be reconciled. To my eye, Thompson’s posts betray an agenda that will not change with any amount of research evidence. How this kind of writing ends up in a prominent position on the blog of a media critic boggles my reformy mind:

Osborne doesn’t acknowledge the much more likely scenario. Under such a Social Darwinian system, survival will go to the best of the test score fabricators. Market-driven reformers will do what they have done best since NCLB imposed primitive bubble-in accountability. They will treat children as test scores. Or worse, they will treat them as dollar signs. Either way, competition-driven reform will likely continue to damage the poorest children of color.

More selective charters might or might not try to offer a holistic education to the more motivated students. Those that do will be showcased to spin corporate reform with a more human face. Charters with the most challenging students will continue to do what it takes to survive, and twist the facts to non-education reporters and politicians.

[1] Editor’s note: if Emma were to write something a year from now that concluded the reforms had produced improvements, he’d undoubtedly cast her aside like day-old bagels.

[2] And again, this is a case where if CREDO had concluded that charters were underperforming, he’d be gloating and touting those findings.

Some quick thoughts on opt out

In general, I have not opined much on the subject of “opt out,” for a number of reasons. First, there’s little to no good data or research on the topic, so my opinions can’t be as informed as I would typically like them to be. Second, I don’t know that I have much to add on the issue (and yet I’m about to give my two cents). Third, it’s a trend that actively worries me as someone who believes research clearly shows that tests and accountability have been beneficial overall. I don’t really see much policymakers can do to stop this trend short of requiring public school students to test [1].

Despite my best efforts to avoid the subject, over on Twitter, former MCPS Superintendent Joshua Starr asked me what I think of this EdWeek commentary on opt out. Here are some excerpts of their argument and my reactions.

First, the title is “Test-taking ‘compliance’ does not ensure equity.” Probably the authors did not write this title, but it’s a very weak straw man. I know of few, if any, folks who believe that test-taking compliance ensures equity. I certainly don’t believe that. I do believe having good data can help equity, but it certainly doesn’t ensure it.

Some parents have elected to opt their children out of the annual tests as a message of protest, signaling that a test score is not enough to ensure excellence and equity in the education of their children. Parents, they insist, have a right to demand an enriched curriculum that includes the arts, civics, and lab sciences, and high-quality schools in their neighborhoods.

I don’t have good evidence on this (I don’t think anyone yet does, but hopefully several savvy doctoral students are working on this topic), but my very strong sense is that the folks opting out of tests are not typically doing it as an equity protest. Everything I’ve seen and heard so far says this is largely, but not exclusively, a white, upper-middle class, suburban/rural phenomenon [EDITED TO ADD: Matt Chingos has done a preliminary analysis of this issue and largely agrees with this characterization: http://www.brookings.edu/research/papers/2015/06/18-chalkboard-who-opts-out-chingos]. My conversations with educators in California, for instance, suggest that the high rates of opt-out in high schools in some affluent areas are because the exam was seen as meaningless and interfering with students’ abilities to prepare for other exams that actually matter to students (e.g., APs, SAT).

Since it was signed into law in 2002, No Child Left Behind has done little to advance the educational interests of our most disadvantaged students. What’s more, the high-stakes-testing climate that NCLB created has also been connected to increased discipline rates for students of color and students with disabilities.

I think the first sentence there is not correct–as I showed in the previous post, there’s evidence that achievement has increased due to NCLB for all groups, including the most disadvantaged (but not much evidence it has narrowed gaps). I’m not aware of well-designed research showing the latter claim, but that’s not my area. Regardless, as I also discussed in the last post, sweeping claims of harm to disadvantaged students are hard to square with empirical evidence on outcomes such as test scores and graduation rates.

And even after these tests reveal large outcome gaps, schools serving poor children of color remain underfunded and are more likely to be labeled failing. Most states have done nothing to intervene effectively in these schools, even when state officials have taken over school districts. Moreover, despite NCLB’s stated goal of closing the achievement gap, wide disparities in academic outcomes persist.

I think this is mostly true, though of course it depends on the state (some states are much more adequate and equitable in their funding than others). And the lack of intervention in low-performing schools really is about a lack of effective intervention, though I’d be very curious what interventions these authors would recommend. It’s true that achievement gaps persist, though I believe racial (but not income) gaps are about as small now as they’ve ever been.

We are not opposed to assessments, especially when they are used for diagnostic purposes to support learning. But the data produced by annual standardized tests are typically not made available to teachers until after the school year is over, thereby making it impossible to use the information to respond to student needs.

Some of the new state tests get data back faster. For instance, some California results were made available to teachers before the end of the year. In general I think it’s a bad idea to heap too many different goals on a single test. It’s not clear to me that we always want our accountability test to also be our formative, immediate feedback test–those probably should be different tests. But that doesn’t necessarily obviate the need for an external accountability test.

Thus, students of color are susceptible to all of the negative effects of the annual assessments, without any of the positive supports to address the learning gaps. When testing is used merely to measure and document inequities in outcomes, without providing necessary supports, parents have a right to demand more.

Again, I think the intention of both the original NCLB and the waivers was that, in the early years of school “failure,” students would be provided with additional supports and options (e.g., through supplemental education services and public school choice) to improve. Those turn out not to have worked, and perhaps future supports will not either, but it’s not necessarily for lack of effort. I’m curious what specific supports these authors would advocate, bearing in mind the intense hostility among half our nation to raising any additional funds for schools or anything else.

The civil rights movement has never supported compliance with unjust laws and policies. Rather, it has always worked to challenge them and support the courageous actions of those willing to resist. As young people and their allies protest throughout the country against police brutality, demanding that “black lives matter,” we are reminded that the struggle for justice often forces us to hold governments and public officials accountable to reject the status quo. Today’s status quo in education is annual assessments that provide no true path toward equity or excellence.

This strikes me as a stretch, though I agree with the first half of it. I’m not sure the “black lives matter” movement was really about holding the government accountable to reject the status quo, as much as it was about holding both government and individuals accountable for centuries of unjust laws and actions (but this is not remotely my area).

The anti-testing movement will not be intimidated, nor is it going away.

I think that’s right. Though reducing or eliminating teacher accountability based on state tests would probably at least reduce the extent to which the unions are actively encouraging opt-outs.

Some may choose to force districts to adopt a more comprehensive “dashboard” accountability system with multiple measures. Others may push districts to engage in biennial or grade-span testing, and still others may choose to opt out. What remains clear is that parents want more than tests to assess their children’s academic standing and, as a result, are choosing to opt out of an unjust, ineffective policy.

With respect to the first sentence, some states did this (and all states had the opportunity to do this in their waivers). With respect to the second sentence, it’s not clear to me how biennial or grade-span testing is any more “just” than yearly testing. Perhaps if these authors stated what they think is the optimal testing regimen from a “justice” perspective, that would help.

So, I don’t think it’s an especially convincing argument. But I don’t know that the pro-opt-out movement really needs convincing arguments. If parents have the right to opt their kids out of tests, at least some of them will do so. I suspect this will lead to increased inequity, but that’s an empirical question for another day.

[1] Were I omnipotent, I would enact that rule, and I’d also require private and homeschool kids to test.

A (quick, direct, 2000 word) response to Tucker on testing

There’s been a bit of a kerfuffle recently in the edu-Twittersphere, since Marc Tucker suggested that civil rights leaders ought to reconsider their support for annual testing [1]. Kati Haycock and Jonah Edelman wrote impassioned responses, which Tucker has just dismissed as not responding to his substantive arguments. He ends with this paragraph:

The facts ought to count for something. What both of these critiques come down to is an assertion that I don’t have any business urging established leaders of the civil rights community to reconsider the issue, that I simply don’t understand the obvious—that annual accountability testing is essential to justice for poor and minority students, that anyone who thinks otherwise must be in the pocket of the teachers unions.  Well, it is not obvious. Indeed, all the evidence says it is not true. And anyone who knows me knows that I am in no one’s pocket. I know the leaders of the civil rights community to be people of great integrity.  They aren’t in anyone’s pocket, either. I think they want what is best for the people they represent. And I do not think that is annual testing.

I think Mr. Tucker greatly overstates the evidence in his initial post, so I’m going to do my best to give a very brief and direct response to the substantive arguments he makes there. I do this not to defend Haycock and Edelman (whom I do not really know), but to defend the policy, which I believe is unfairly maligned in Tucker’s posts.

Let me start by saying that I am generally in favor of annual testing, though I am probably not as fervid in that support as some others in the “reform” camp. I do not believe that annual accountability testing is essential to justice for poor and minority students, but I do think high-quality tests at reasonable intervals would almost certainly be beneficial to them.

Okay, here goes.

1) In his initial post, Marc Tucker says,

First of all, the data show that, although the performance of poor and minority students improved after passage of the No Child Left Behind Act, it was actually improving at a faster rate before the passage of the No Child Left Behind Act.

That link is to a NAEP report that indeed provides descriptive evidence supporting Tucker’s point. However, there are at least two peer-reviewed articles using NAEP data that show positive causal impacts of NCLB using high-quality quasi-experimental design, one on fourth grade math achievement only and the other on fourth and eighth grade math achievement and (suggestively) fourth grade reading. The latter is, to my eye, the most rigorous analysis that yet exists on this topic. There is a third article that uses cross-state NAEP data and does not find an impact, but again the most recent analysis by Wong seems to me to be the most methodologically sophisticated of the lot and, therefore, the most trustworthy. I think if Tucker wants to talk NAEP data, he has to provide evidence of this quality that supports his position of “no effect” (or even “harm,” as he appears to be suggesting). Is there a quality analysis using a strong design that shows a negative impact on the slope of achievement gains caused by NCLB? I do not know of one.

I should also note that there are beaucoup within-state studies of the impacts of accountability policies that use regression discontinuity designs and find causal impacts. For instance: in North Carolina, in Florida, and in Wisconsin. In short: I don’t see any way to read the causal literature on school accountability and conclude that it has negative impacts on student achievement. I don’t even see any way to conclude it has neutral impacts, given the large number of studies finding positive impacts relative to those with strong designs that find no impacts.

2) Next, Tucker says:

Over the 15-year history of the No Child Left Behind Act, there is no data to show that it contributed to improved student performance for poor and minority students at the high school level, which is where it counts.

Here I think Marc is moving the goalposts a bit. Is high school performance of poor and minority students the target? Then I guess we may as well throw out all the above-cited studies. I know of no causal studies that directly investigate the impact on this particular outcome, so I think the best he’s got is the NAEP trends. And sure, trends in high school performance are relatively flat.

I’m not one to engage in misNAEPery, however, so I wouldn’t make too much of this. Nor would I make too much of the fact that high school graduation rates have increased for all groups (meaning tons more low-performing students who in days gone by would have dropped out are still around to take the NAEP in 12th grade, among other things). But I would make quite a bit of the fact that the above-cited causal studies obviously also apply to historically underserved groups (that is, while they rarely directly test the impact of accountability on achievement gaps, they very often test the impacts for different groups and find that all groups see the positive effects). And I would also note some evidence from North Carolina of direct narrowing effects on black-white gaps.

3) Next, we have:

Many nations that have no annual accountability testing requirements have higher average performance for poor and minority students and smaller gaps between their performance and the performance of majority students than we do here in the United States.  How can annual testing be a civil right if that is so?

There’s not much to say about this. It’s not based on any study I know of, certainly none that would suggest a causal impact one way or the other. But he’s right that we’re relatively alone in our use of annual testing, and therefore that many higher-achieving nations don’t have annual testing. They also don’t have many other policies that we have, so I’m not sure what’s to be learned from this observation.

4) Now he moves on to claim:

It is not just that annual accountability testing with separate scores for poor and minority students does not help those students.  The reality is that it actually hurts them. All that testing forces schools to buy cheap tests, because they have to administer so many of them.  Cheap tests measure low-level basic skills, not the kind of high-level, complex skills most employers are looking for these days.  Though students in wealthy communities are forced to take these tests, no one in those communities pays much attention to them.  They expect much more from their students. It is the schools serving poor and minority students that feed the students an endless diet of drill and practice keyed to these low-level tests.  The teachers are feeding these kids a dumbed down curriculum to match the dumbed down tests, a dumbed down curriculum the kids in the wealthier communities do not get.

This paragraph doesn’t have links, probably because it’s not well supported by the existing evidence. Certainly you hear this argument all the time, and I believe it may well be true that schools serving poor kids have worse curricula or more perverse responses to tests (even some of my own work suggests different kinds of instructional responses in different kinds of schools). But even if we grant that this impact is real, the literature on achievement effects certainly does not suggest harm. And the fact that graduation rates are skyrocketing certainly does not suggest harm. If he’s going to claim harm, he has to provide clear, compelling evidence of harm. This ain’t it. And finally here, a small point. I hate when people say schools are “forced” to do anything. States, districts, and schools were not forced to buy bad tests before. They have priorities, and they have prioritized cheap and fast. That’s a choice, not a matter of force.

5) Next, Tucker claims:

Second, the teachers in the schools serving mainly poor and minority kids have figured out that, from an accountability standpoint, it does them no good to focus on the kids who are likely to pass the tests, because the school will get no credit for it. At the same time, it does them no good to focus on the kids who are not likely to pass no matter what the teacher does, because the school will get no credit for that either. As a result, the faculty has a big incentive to focus mainly on the kids who are just below the pass point, leaving the others to twist in the wind.

I am certainly familiar with the literature cited here, and I don’t dispute any of it. Quite the contrary, I acknowledge the conclusion that the students who are targeted by the accountability system see the greatest gains. This has been shown in many well-designed studies, such as here, here, here, and here. But this is an argument about accountability policy design, not about annual testing. It simply speaks to the need for better accountability policies. For instance, suppose we thought the “bubble kids” problem was a bad one that needed solving. We could solve it tomorrow–simply create a system where all that matters is growth. Voila, no bubble kids! Of course there would be tradeoffs to that decision, so probably some mixture is better.

6) Then Tucker moves on to discuss the teaching force:

Not only is it true that annual accountability testing does not improve the performance of poor and minority students, as I just explained, but it is also true that annual accountability testing is making a major contribution to the destruction of the quality of our teaching force.

There’s no evidence for this. I know of not a single study that suggests that there is even a descriptive decrease in the quality of our teaching force in recent years. Certainly not one with a causal design of any kind that implicates annual accountability testing. And there is recent evidence that suggests improvements in the quality of the workforce, at least in certain areas such as New York and Washington.

7) Next, he takes on the distribution of teacher quality:

One of the most important features of these accountability systems is that they operate in such a way as to make teachers of poor and minority students most vulnerable.  And the result of that is that more and more capable teachers are much less likely to teach in schools serving poor and minority students.

It is absolutely true that the lowest quality teachers are disproportionately likely to serve the most disadvantaged students. But I know of not a single piece of evidence that this is caused by (or even made worse by) annual testing and accountability policies. My hunch is that this has always been true, but that’s just a hunch. If Tucker has evidence, he should provide it.

8) The final point is one that hits close to home:

Applications to our schools of education are plummeting and deans of education are reporting that one of the reasons is that high school graduates who have alternatives are not selecting teaching because it looks like a battleground, a battleground created by the heavy-handed accountability systems promoted by the U.S. Department of Education and sustained by annual accountability testing.

As someone employed at a school of education, I can say the first clause here is completely true. And we’re quite worried about it. But again, I know of not a single piece of even descriptive evidence that suggests this is due to annual accountability testing. Annual accountability testing has been around for well over a decade. Why would the impact be happening right now?

I think these are the main arguments in Tucker’s piece, and I have provided evidence or argumentation here that suggests that not one of them is supported by the best academic research that exists today. Perhaps the strongest argument of the eight is the second one, but again I know of no quality research that attributes our relative stagnation on 12th grade NAEP to annual accountability testing. That does not mean Tucker is wrong. But it does mean that he is the one who should bear the burden of providing evidence to support his positions, not Haycock and Edelman. I don’t believe he can produce such evidence, because I don’t believe it exists.

[1] I think it’s almost universally a bad idea to tell civil rights leaders what to do.

Speedometry is so hot right now.

While I talk about FOIAs and textbooks all the time, I occasionally do a little bit of other research as well. Easily the most fun project I do is the Speedometry project, funded by the Mattel Children’s Foundation.

It all started a couple years ago when a USC faculty colleague of mine was chatting with her friend, who happened to work for the Foundation. They were talking about how their elementary-aged boys just love Hot Wheels, and wouldn’t it be great if they could bring that love into the classroom. Two years and two grants later, and here we are.

This project has been a blast, right from the beginning. In the first year, we brought together practicing fourth grade and kindergarten teachers and worked with them to develop a couple of weeks of instruction at each grade that used Hot Wheels cars to teach kids science and math content. We then tried out our lessons in 17 classrooms scattered across a few SoCal districts (half treatment, half business-as-usual). Overall we found promising results. Kids seemed to be learning science and math, they were excited about doing science and math, and the teachers were happy and learning too. Based on the promising pilot, we got additional funds this year to roll out the units more fully, do a larger-scale study (involving random assignment of about 60 classrooms in one Southern California district), and visit key U.S. cities to spread the message about Speedometry. The data collection for the study wrapped a few months ago, and we’ll soon be getting into the analysis phase. We should have good results to share by conference season in the spring.

The study has been really fun and a total diversion for me. There are many things the project has taught me or reinforced for me (some pithy, some not):

1) Science is not a major part of the typical elementary curriculum. At most it gets a few hours a week in a typical classroom.

2) A corollary of #1: it’s often necessary (or at least beneficial) to integrate science with math and/or ELA in order to get it taught.

3) There’s *tremendous* variation in 1 and 2 across classrooms and districts.

4) Elementary teachers often are not especially strong in science, though there are of course exceptions.

5) Teachers in many districts haven’t received new materials in a long time. Even the simple offer of giving teachers the Speedometry kits is greeted as tremendous generosity.

6) Teachers are both nervous and excited about the idea of giving students a bit more control over lessons. Many of them claim they haven’t had the opportunity to relinquish control to students in quite a while, but that this is changing due to recent standards.

7) Virtually everyone I’ve encountered in schools is incredibly grateful for what we’ve done (which, to my mind, is not especially ambitious). Many teachers evidently feel quite unsupported.

8) Corporations and academics speak very different languages, though the general existence of dreadful buzzwords is quite ubiquitous in both areas.

9) Corporations often operate on very fast schedules that academics do not.

10) Teaching fourth graders is both fun and terrifying, especially relative to teaching doctoral students.

11) Dedicated PR firms can do amazing things. They seem very clearly to be worth their cost (though I actually don’t know what the cost is).

I’m sure I’ll keep learning more. But for now, I’m hoping this work can serve as a model for other partnerships designed to improve education for kids. In the meantime, we’ll be crunching the data from our big experiment, and we’ll get back to you with results soon!

FOIA information

Textbooks seem to have suddenly become a hot topic again, with the recent release of a study of NYC schools’ textbook habits by Charles Sahm. Robert Pondiscio has a nice summary of Sahm’s important work. He also ends with a mention of my textbook FOIA work and some quotes from me about how silly it is that we don’t track these data routinely.

In that spirit, I thought it might be of interest if I gave a little update on where things stand with my research. Perhaps this will explain my somewhat diminished presence here and on Twitter in the past two weeks.

I sent out 3,014 FOIA requests a bit before Memorial Day, and the responses started coming back on Tuesday, May 26. Since then, I’ve received:

  • Textbook adoption data reported on http://www.nsftextbookstudy.org for approximately 650 districts.
  • Email or snail mail responses for approximately 850 districts (some overlap with the above, but not much).

Of these 850 email/snail mail responses, I’d estimate about 10% say they don’t keep textbook data, about 50-60% provide the data, and about 30-40% say they need more time to collect it.

And of those who provide the data, it’s relatively clear that some substantial proportion of them–perhaps half–do not routinely keep a list of the data, but rather they pulled something together for me (by law, by the way, they do not have to do this, so I am quite appreciative). Some of these pulled-together data include handwritten lists in cursive.

Five districts have so far charged me for the information (ranging from $1.19 to $27), which they are legally entitled to do if it took them time to pull the documents together or if they made copies.

One school originally demanded that I come pick the data up in Rochester, New York. However, after a somewhat testy email from me, the principal finally offered to email me a PDF for $2.25.

Two district leaders have sent threatening emails or left nasty messages indicating they would rat me out to my Dean (for what, I don’t know). I told my Dean and she said no big deal.

Perhaps 10 respondents have expressed great interest in the research or sent thoughtful notes.

It’s very clear that New York districts track these data less routinely than Illinois or Texas districts. However, my sense is that the response rate is quite a bit higher in New York than in the other two states, so perhaps it is just that the districts that don’t track the data in Illinois and Texas are ignoring me, whereas in New York they’re telling me they don’t track it.

So that’s where we are at this point. We have data of some kind from ~40% of districts in these four states. I haven’t really gotten beyond that to look at what they’re actually reporting yet. We’ll start with follow-up emails and letters to the nonrespondents in a few weeks.
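For readers who want to see how a figure in the neighborhood of ~40% can fall out of the counts above, here is a rough back-of-the-envelope sketch in Python. The way the numbers are combined is an assumption, not a description of how the estimate was actually produced: it treats the 3,014 requests as one per district, counts every website-reported district as providing data, applies the 50-60% "provide the data" share to the email/snail mail responses, and ignores the small overlap between the two groups.

```python
# Figures quoted in the post; how they are combined below is an assumption.
REQUESTS_SENT = 3014                   # FOIA requests (assumed one per district)
WEBSITE_DISTRICTS = 650                # approx. districts reporting via the website
MAIL_RESPONSES = 850                   # approx. email or snail mail responses
MAIL_SHARE_WITH_DATA = (0.50, 0.60)    # "about 50-60% provide the data"


def data_rate(mail_share: float) -> float:
    """Share of all requests that yielded data, ignoring any overlap."""
    districts_with_data = WEBSITE_DISTRICTS + mail_share * MAIL_RESPONSES
    return districts_with_data / REQUESTS_SENT


low, high = (data_rate(share) for share in MAIL_SHARE_WITH_DATA)
print(f"Roughly {low:.1%} to {high:.1%} of districts")
```

Under these assumptions the range lands a bit under 40%. Counting some of the "need more time" responses as data of some kind would push it up, and subtracting the overlap would pull it down.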

All in all, I’ve been amazed at how unbelievably effective this has been as a research strategy, even if I feel somewhat bad for having had to deploy it. I’m very excited for the data gathered to this point, and I think it will be useful both to me and to other researchers moving forward.

Tests are the worst! Or the best! No, the worst!

A new Quinnipiac poll is out today. As always, I think it’s best to take these polls not as single data points in favor of one particular position, but rather as part of a broad sea of often contradictory, incoherent evidence about what/whether the public thinks about education.

There are some interesting nuggets in here, and again fodder for both “sides” of current education reform debates. The teachers’ unions and their supporters will love that the poll finds voters support the teachers’ unions’ policies over Governor Cuomo’s by a substantial margin (note that Cuomo’s overall favorability rating is net positive, so the lack of support for his education policies is particularly strong; that said, I wonder how much people understand what his education policies even are). The reformsters will love that a majority thinks the number of charter schools in the state should be expanded. Nothing new here; support for charters in polls is almost always net positive.

What’s most interesting to me, though, is a series of questions about standardized testing. To me, these questions make painfully apparent the utter lack of coherence (or, to put it much more charitably, the nuance) in the public’s views of testing. First, we have the question “Do you think teacher pay should or should not be based on how well their students perform on standardized tests?” The results here are a resounding NO, 69/28. Similar results for whether standardized tests should be used for teacher tenure. [1]

Then we have the question “How much should these tests count in a teacher evaluation: 100%, 75%, 50%, 25%, or not at all?” Now, you would imagine these results would be mostly “not at all,” since in the two immediately preceding questions folks said the results shouldn’t be used for pay or tenure. Nope! In fact, 49% of people say these tests should count 50% or more in teacher evaluation, and another 27% say 25%. Just one-fifth of respondents–21%–say not at all. Hardly an anti-test bunch, these voters.

And finally, we have the question “Do you think standardized tests are or are not an accurate way to measure how well students are learning?” At this point I guess you’d have to think that voters would say yes, since in the immediately preceding question 77% said these tests should count for teacher evaluation. But you’d be wrong again! 64% said that standardized tests were NOT an accurate way to measure how well students are learning.

So, tests are not an accurate way to measure student learning, but they should definitely count at least a quarter in teacher evaluations, but they shouldn’t count at all in tenure or pay decisions. Got it. Suffice it to say this is yet another example showing why it’s immensely problematic when people pick a single data point from one poll and use it in support of their existing position.

[1] Jacob Mishook, on Twitter, notes that these wordings could be construed to imply 100% reliance on standardized tests for these decisions, which is a fair point that might explain at least part of the very negative response.