Ms. Grist vs. Carlos Beltran
February 12, 2008 10:44 AM
A couple of days ago, Kevin Carey compared Bill James’ work in baseball statistics to value-added assessment while taking on Steve Koss. Kevin writes: “Moreover, Koss doesn't know what he's talking about. The NYC value-added measures are not "derived from a single variable," they're exactly the kind of complicated multi-variate measure he describes.” He then goes on to describe the complexity of the New York value-added model.
“The NYC model uses something like 12 discrete variables, and the HLM version of value-added pioneered by Bill Sanders is so complicated that you need a PhD in statistics and a special computer at SAS headquarters to run it. It's more complicated that anything Bill James does, as it should be.”
In both baseball and education, the goal is to use statistical evidence to gain insight into the performance of someone like Mets center fielder Carlos Beltran or my 10th grade world history teacher, Ms. Grist (both perennial all-stars). For Beltran, we can look at how his actions directly affect the game, and then control for how the environment affects that performance. How does he do with runners on? How does he do on the road? Against righty or lefty pitching? In different counts (e.g., one strike, two strikes)? And so on. Ditto Ms. Grist, using all the variables Kevin describes in his post. So far, Kevin’s logic holds.
The difference, though, is very important. For Ms. Grist we’re really looking at one dependent variable: standardized test scores. We may be parsing it with 144 discrete independent variables, but it’s still one dependent variable. And it has limits. Tests don’t necessarily match up with what teachers are supposed to teach. And there is more to teaching than test scores. Tests are kind of like runs batted in for hitters or strikeouts for pitchers. They tell a lot of the story, but not all of it.
For Beltran, we have a range of outcome measures and are developing an ever more sophisticated understanding of what each one means. So to evaluate fielding, we look not just at fielding percentage but also at his ability to get to the ball. For hitting, we look at power, on-base percentage, speed, ground ball and fly ball percentages, etc. And within these things we look at different outcomes as well. From this, we’re learning which behaviors matter.
That’s something we’re still taking baby steps toward with testing, although I’m seeing signs of progress on this front. Steve Koss really has the better part of this argument, to the extent that I'm wondering if Kevin simply misunderstood his post. I know we all pay lip service to the idea that test-based value-added systems are “just one more tool.” But it is way too easy to give them credit for being things that they are not.



Comments
It's amazing how the "reformers" get all worked up about the details and ignore the rampant overcrowding, the crumbling facilities, the largest class sizes in the state, and the fact that some public school kids don't even have school buildings. Every day I enter my building, which runs at more than 250% of capacity, I marvel at all the folks in think tanks who praise the "reforms."
As long as Mayor Mike fails to deliver neighborhood schools consistently good teachers, smaller classes, and decent facilities, he'll continue to achieve little or nothing. In fact, if you judge him by his own standards, NAEP scores, which he couldn't manipulate, suggest he's made no progress whatsoever.
Posted by: NYC Educator | February 13, 2008 02:01 PM