The Truth About Student Achievement: Part Two
February 2, 2007 07:15 AM
Posted by Dan at AFT
In my first post in this series, I examined the claim that student achievement is declining, a common refrain of public school critics. To counter this claim, I highlighted the impressive rise in test scores of 9-year-old African-Americans on the NAEP long-term trend assessment, widely considered the most reliable gauge of student progress. (Click here to learn about the major differences between the NAEP long-term trend assessment and the NAEP main assessment.)
But what about other ages and subgroups? Are they showing the same sharp rise in NAEP long-term trend scores? You can see the data for yourself here or read the highlights below. Remember: 10 points on NAEP equals about one year of learning.
Age 9
- Math (1973 to 2004): African-Americans are up 34 points, Latinos are up 28 points, and Whites are up 22 points.
- Reading (1971 to 2004): African-Americans are up 30 points, Latinos* are up 22 points, and Whites are up 12 points.
Age 13
- Math (1973 to 2004): African-Americans are up 34 points, Latinos are up 26 points, and Whites are up 14 points.
- Reading (1971 to 2004): African-Americans are up 22 points, Latinos* are up 10 points, and Whites are up 5 points.
Age 17**
- Math (1973 to 2004): African-Americans are up 15 points, Latinos are up 12 points, and Whites are up 3 points.
- Reading (1971 to 2004): African-Americans are up 25 points, Latinos* are up 12 points, and Whites are up 2 points.
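To put those gains in terms of the rule of thumb above, here is a minimal Python sketch that converts each point gain into approximate years of learning. The point values are copied from the lists above; the names `POINTS_PER_YEAR`, `gains`, and `approx_years` are illustrative and are not part of any NAEP tool. The conversion is only as good as the 10-points-per-year approximation itself.

```python
# Rule of thumb quoted in the post: about 10 NAEP points per year of learning.
# This is an approximation, not an official NAEP conversion.
POINTS_PER_YEAR = 10

# Point gains as listed in the post (math: 1973 to 2004; reading: 1971 to 2004,
# except Latinos in reading, whose baseline year is 1975).
gains = {
    ("Age 9", "Math"): {"African-Americans": 34, "Latinos": 28, "Whites": 22},
    ("Age 9", "Reading"): {"African-Americans": 30, "Latinos": 22, "Whites": 12},
    ("Age 13", "Math"): {"African-Americans": 34, "Latinos": 26, "Whites": 14},
    ("Age 13", "Reading"): {"African-Americans": 22, "Latinos": 10, "Whites": 5},
    ("Age 17", "Math"): {"African-Americans": 15, "Latinos": 12, "Whites": 3},
    ("Age 17", "Reading"): {"African-Americans": 25, "Latinos": 12, "Whites": 2},
}


def approx_years(points, points_per_year=POINTS_PER_YEAR):
    """Convert a scale-point gain into approximate years of learning."""
    return points / points_per_year


if __name__ == "__main__":
    for (age, subject), groups in gains.items():
        for group, points in groups.items():
            years = approx_years(points)
            print(f"{age} {subject}, {group}: +{points} points, about {years:.1f} years")
```

By this rough measure, the 34-point math gain for 9-year-old African-Americans works out to a little more than three years of learning.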
With the exception of the reading scores of 17-year-old Whites, all the gains are statistically significant. There are, of course, still sizable gaps in achievement between Whites and African-Americans, and Whites and Latinos, and we must continue to narrow them. But the facts are undeniable: Real and substantial progress has been made by all subgroups, particularly in math (a subject that is more easily influenced by in-school factors than reading) and in the earlier grades (when kids are more likely to take the tests seriously).
In my next post, we’ll take a look at why overall average NAEP scores (which public school critics like to seize on) mask the quite impressive gains made by student subgroups.
*In reading, NAEP didn’t treat Latinos as a separate category until 1975.
**Tests for this age group are notoriously unreliable so results should be interpreted with caution.



Comments
I suppose I could be considered a critic of public schools, though I am more of a reformist than a voucher advocate.
I have never thought or read that scores are declining. My major beef is that improvement has not been nearly as drastic as it could have been.
I also suspect that the major gains in education by historically disadvantaged sub-groups have been caused more by social reforms outside of school than by the efforts of the schools.
Most of the gains you point out happened from '71 to '88. Since then, on average, it appears that improvement has stagnated.
I would also like to add that one of the main arguments of charter school and voucher advocates has been that choice (or the threat of choice) will force public schools to do more to improve.
It seems to me that pointing out these improvements, along with the acknowledgement that schools can do better, proves their point.
Honestly if there wasn't such a big push for vouchers and charter schools, would public schools be as focused as they are on improvement?
Posted by: Rory @ parentalcation | February 3, 2007 05:18 PM
In 1994, Barron and Koretz at NCES (ED 404 368) concluded that NAEP trend estimates for blacks and Hispanics were "overly error-prone" (p. 22) for a wide variety of technical reasons. They recommended that the trend NAEP be designed more like the main NAEP, but also noted that the changes could have "erratic effects on the trend lines" (abstract). Some of those changes were not put into place until the most recent trend study. Do we really want to rely on estimates that NCES deemed "overly error-prone" (unreliable? useful for what purpose?) or accept the recent study without a careful comparison of the effects of changes in age-only and sample weighting?
Furthermore, do 10 scaled score points really represent a year's growth? Barron and Koretz say you can use this rule of thumb, but it could be "misleading" (p. 8). Looking at Reading from the most recent report (NCES 2005-464), it is easy to see how misleading this can be. These are the ranges of scaled scores from the 10th to 90th percentile for each group:
9 yr-olds: 169 to 264, Median=221, Range=96
13 yr-olds: 210 to 305, Median=260, Range=96
17 yr-olds: 227 to 338, Median=287, Range=112
Are we then to believe that each test actually reliably measures performance over more than 9 or 10 years' worth of growth? If the median represents something at "grade level," how does the 17-yr-old form reliably plot more than 51 scaled score points (or 5 years' worth of growth) above a Jr. or Sr. level in high school (an age equivalent of 22)? Can the 9 year-old form with more than 5 years of growth below the median really tell us something about students who have had less than the equivalent of kindergarten (age 4)? Of course not. And this doesn't even include confidence bands for these scores which can, themselves, be multiple "years" wide. But if we are willing to talk about these scales as if 10 points is a "year's worth of growth," who can complain when the Secretary of Education calls proficient "on grade level"?
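The arithmetic behind that objection can be sketched in a few lines of Python, using the percentile figures quoted above and taking the 10-points-per-year rule at face value. The names `reading_percentiles`, `implied_years`, and `implied_age` are illustrative only. Note that the simple differences between the 10th and 90th percentiles are 95, 95, and 111 points; the ranges quoted above (96, 96, 112) appear to count both endpoints.

```python
# Rule of thumb under discussion: 10 scale points per year of growth.
POINTS_PER_YEAR = 10

# 10th percentile, median, and 90th percentile reading scores as quoted
# in the comment above (attributed there to NCES 2005-464).
reading_percentiles = {
    9: {"p10": 169, "median": 221, "p90": 264},
    13: {"p10": 210, "median": 260, "p90": 305},
    17: {"p10": 227, "median": 287, "p90": 338},
}


def implied_years(low, high, points_per_year=POINTS_PER_YEAR):
    """Years of growth implied by a score spread, if the rule is taken literally."""
    return (high - low) / points_per_year


def implied_age(age, score, median, points_per_year=POINTS_PER_YEAR):
    """Age equivalent of a score, treating the median as typical for that age."""
    return age + (score - median) / points_per_year


if __name__ == "__main__":
    for age, s in reading_percentiles.items():
        spread = implied_years(s["p10"], s["p90"])
        low_age = implied_age(age, s["p10"], s["median"])
        high_age = implied_age(age, s["p90"], s["median"])
        print(f"Age {age}: spread implies {spread:.1f} years; "
              f"age equivalents run from {low_age:.1f} to {high_age:.1f}")
```

Taken literally, the rule places the 90th percentile of 17-year-olds at an age equivalent of about 22 and the 10th percentile of 9-year-olds at about 4, which is the implausibility the comment describes.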
NAEP is indeed the most well-known and reliable measure of long-term trends. Over the years, there have been changes in type of items, form length, IRT model, sampling methodology, inclusiveness, etc. -- factors any psychometrician would tell you would be difficult or impossible to overcome in terms of maintaining some consistency of meaning of a scale. But a job is a job. A serious discussion of the endless technical challenges, anomalies and adjustments, documented by everyone from NCES to the GAO, could lead political leaders to believe that achievement tests, even the best, do not provide a definitive truth. While you might want to collect the data as one measure to consider, you certainly wouldn't want to base an accountability system on just, say, counts of how many students were "proficient."
PS. Why is it when you ask the NAEP statisticians about the reliability and interpretation of NAEP scores, you get a different answer than if you ask the NAGB public information officer? (http://www.ewa.org/desktopdefault.aspx?page_id=120&news_id=2758).
Posted by: Anonymous | February 4, 2007 08:37 PM