How should we interpret the Common Core State Standards cut scores?
In 2015, prior to the release of test scores from the new Smarter Balanced (SBAC) assessments, then-California Superintendent of Public Instruction Tom Torlakson tried to prepare the public by warning that scores were likely to be lower than scores on the former assessments, because the new standards were higher. He was right on both counts—the standards are higher, and the scores were lower. In a baffling case of standing logic on its head, somebody, somewhere, decided that if too few students were meeting the former standards, then the best solution, by golly, was to raise the standards even higher.
Despite the Superintendent’s warning, the media dutifully reported the scores without seriously questioning whether, just maybe, the standards had been raised too much. The coverage by EdSource was typical, with a headline reading, “Most California students below standards on Common Core-aligned tests.” Six paragraphs into the story, EdSource acknowledged Superintendent Torlakson’s statement that the new standards are more rigorous than the previous ones, but then cast doubt on that claim by stating, “Test results from 2003, the baseline year for students taking the STAR tests under the 1997 California academic standards, don’t appear to support Torlakson’s argument that the current tests are harder, however. More students met or exceeded the English language arts test this year than were proficient or advanced in 2003: 44 percent vs. 35 percent (emphasis added).” The potentially good news that students actually got smarter was, instead, presented as evidence that the tests got easier!
Similarly, the Sacramento Bee editorialized, “Only 33 percent of California students met or exceeded the new math standards, and only 44 percent met or exceeded the standards in English. Yikes.” To be sure, the Bee acknowledged that “These numbers are a baseline, and a lot of those ‘below standard’ scores are probably closer than they seem to the goal.” But still, even while admitting that the Common Core is a “major upgrade” from previous standards, the purpose, validity, and meaning of that upgrade went unquestioned.
That lack of curiosity about what the standards actually are, who sets them, and how, continues to this day. I’ve never understood why the standards themselves receive so little scrutiny, if any, from the public, the media, and policy makers. It’s as if the standards have been handed down as some sort of received wisdom and accepted as an article of faith, beyond the reach of reasonable inquiry.
These thoughts came to my mind as I leafed through Cal Facts 2018, recently released by California’s Legislative Analyst’s Office (LAO), and found a graph with the heading, “fewer than half of California’s K-12 students meet state standards.” It shows that, in Grades 4, 6, 8, and 11, fewer than half (in some cases far fewer than half) of students met state standards in reading and math in the Spring 2018 SBAC assessment. The lone exception was Grade 11 reading, in which slightly more than half met the standards.
Now, among the first questions that should come to mind (but rarely, if ever, do) are: what are the standards, who sets them, and how? After all, on the face of it, there are two equally plausible ways to interpret this graph. Either California’s students (and by extension our schools) are failing miserably, or the standards themselves are unreasonably high. The second interpretation deserves at least as much consideration as the first. Instead, the first interpretation gets all the headlines.
California uses four performance levels to describe student performance on the SBAC assessments:
- Level 1: Standard Not Met
- Level 2: Standard Nearly Met
- Level 3: Standard Met
- Level 4: Standard Exceeded
Individual student scaled scores determine which performance level each student falls within. For example, in third grade math, a student whose scaled score is between 2436 and 2500 would be in Level 3. A score that divides one level from the next is a cut score. California uses the numbered levels to move away from the terms below basic, basic, proficient, and advanced that were associated with the former assessments, but the idea is the same. In fact, in some other states that use the SBAC assessments, Level 3 is still referred to as “proficient.”
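To make the mechanics concrete, here is a minimal sketch of how cut scores translate a scaled score into a performance level. The Level 3 range (2436 to 2500) comes from the grade 3 math example above; the Level 2 cut used below (2381) is an illustrative assumption, not an official figure.

```python
# A minimal sketch of mapping a scaled score to a performance level.
# The Level 3 range (2436-2500) comes from the grade 3 math example in
# the text; the Level 2 cut (2381) is an illustrative assumption.
from bisect import bisect_right

# Cut scores: the lowest scaled score that earns Levels 2, 3, and 4.
GRADE3_MATH_CUTS = [2381, 2436, 2501]  # 2381 is hypothetical
LEVELS = [
    "Level 1: Standard Not Met",
    "Level 2: Standard Nearly Met",
    "Level 3: Standard Met",
    "Level 4: Standard Exceeded",
]

def performance_level(scaled_score: int, cuts=GRADE3_MATH_CUTS) -> str:
    """Return the performance level label for a scaled score."""
    return LEVELS[bisect_right(cuts, scaled_score)]

print(performance_level(2435))  # Level 2: Standard Nearly Met
print(performance_level(2436))  # Level 3: Standard Met
print(performance_level(2501))  # Level 4: Standard Exceeded
```

However the labels are worded, the machinery is the same: a single number on one side or the other of a fixed threshold determines the category a student is placed in.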
State standards are commonly thought of as grade level expectations: what students are expected to know and be able to do at each grade level. Originally, grade level expectations were derived from average student performance on norm-referenced tests; that average subsequently became the expected level of performance for all students, so that all students were expected to achieve at or above average. While it’s good practice to have high expectations for each individual student, it makes no sense to expect all students within a school, district, or state to achieve above average.
Most states, including California, have adopted the Common Core State Standards (CCSS), which were developed by a consortium of states through the National Governors Association (NGA) and the Council of Chief State School Officers (CCSSO). Among other criteria, the standards at each grade level were developed to be “rigorous,” which means, according to the SBAC consortium, they include “high-level cognitive demands by asking students to demonstrate deep conceptual understanding through the application of content knowledge and skills to new situations.”
The Smarter Balanced (SBAC) assessments and performance expectations are tied to those standards. So, to say that 48% of 4th graders meet state standards in reading and language arts is to say that 48% of students have mastered a rigorous course of study and demonstrate a deep conceptual understanding of content knowledge and skills. We used to refer to them as “A” students. Now they’re just Level 3: Standard Met, or, in the prior parlance, “proficient.”
The setting of standards, and of the cut scores that distinguish between achievement levels, is an art more than a science, and it is done away from the public eye. The California Alliance of Researchers for Equity in Education (CARE-ED), a collaboration of more than 100 California education researchers, argues that the SBAC (along with PARCC, the other CCSS-aligned assessment) lacks “basic principles of sound science, such as construct validity, research-based cut scores, computer adaptability, inter-rater reliability, and most basic of all, independent verification of validity” (http://care-ed.org). CARE-ED also reports that, “when asked for documentation of the validity of the CA tests, the CA Department of Education failed to make such documentation public.” (By the way, the SBAC consortium invited non-educators, including members of the general public and the business community with no pedagogical background at all, to participate in standard setting!)
What the standards lack in scientific rigor, they make up for in subjectivity. Standards assessed on the SBAC are expressed in the form of “claims,” which are summary statements about the knowledge and skills students are expected to demonstrate on the assessment, related to a particular aspect of the standards. Here, for example, are the claims for English/language arts for grades 3 through 8:
- Overall claim: “Students can demonstrate progress toward college and career readiness in English language arts and literacy.”
- Reading: “Students can read closely and analytically to comprehend a range of increasingly complex literary and informational texts.”
- Writing: “Students can produce effective and well-grounded writing for a range of purposes and audiences.”
- Speaking and Listening: “Students can employ effective speaking and listening skills for a range of purposes and audiences.”
- Research/inquiry: “Students can engage in research and inquiry to investigate topics, and to analyze, integrate, and present information.”
All of these things are important, to be sure, but the question of how these objectives get translated into measurable student outcomes at each grade level is not easy (or perhaps even possible) to answer. Let’s look at just one claim at the 3rd grade level: Students can read closely and analytically to comprehend a range of increasingly complex literary and informational texts. What does it mean to read “closely” and “analytically”? How close is close enough? How is that measured? What degree of close and analytical reading distinguishes between Levels 1, 2, 3, and 4? What level of complexity should a 3rd grader comprehend in order to be on a college- and career-ready track? Where is the evidence to support any specified level? And is it even possible to define “a range of increasingly complex literary and informational texts” in measurable terms? Who can even say what a “range” is, or how broad or extensive the abilities it encompasses should be?
Before we really know how to interpret a finding like “fewer than half of California’s K-12 students meet state standards,” we need to know what level of knowledge and performance these standards denote, and we don’t. And we need to be public and explicit about whether a standard represents a minimum level of achievement reasonably expected of all students or an aspirational level. By its own admission, SBAC has chosen to set the bar at a “rigorous” level. That’s fine, but SBAC, the California Department of Education (CDE), and the State Board of Education (SBE) should be open with the public about the level of rigor that the cut scores represent.
This is not an argument for watering down standards or dumbing down expectations, but it is an argument for being open and explicit about the level of knowledge and understanding we want standards to denote. Currently, the public sees the standards as minimums, while they are really designed to be aspirational. That disconnect leads to misinterpretation of test results.
That the standards are nebulous is beyond doubt. And because they are nebulous, achievement of them cannot be measured with any precision. Yet results are presented as if the difference between a scaled score of 2435 and one of 2436 in grade 3 math determines whether a student is performing at Level 2 (Basic) or Level 3 (Proficient).
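To see why that one-point precision is illusory, consider a small sketch. Every test score carries a standard error of measurement (SEM); the SEM value below is hypothetical, not an official SBAC figure. Under that assumption, the uncertainty bands around scores of 2435 and 2436 both straddle the cut, even though the two students are reported at different levels.

```python
# A sketch of how measurement error dwarfs a one-point gap at a cut
# score. The SEM here is hypothetical, not an official SBAC figure.
CUT_LEVEL_3 = 2436  # grade 3 math cut score, from the example above
SEM = 20            # hypothetical standard error of measurement

for observed in (2435, 2436):
    label = "Level 3" if observed >= CUT_LEVEL_3 else "Level 2"
    low, high = observed - SEM, observed + SEM
    straddles = low < CUT_LEVEL_3 <= high
    print(f"score {observed}: reported as {label}; "
          f"~68% uncertainty band {low}-{high} "
          f"{'straddles' if straddles else 'clears'} the cut")
```

Both scores are statistically indistinguishable from the cut score itself; only the reported label differs.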
But the problems of measurement, as serious as they are, should take a back seat to the fundamental question of whether the cut scores (however they are determined) have been reasonably set. As James Harvey, Executive Director of the National Superintendents Roundtable, has said, “No matter how well-meaning advocates are, when they push for school improvement on the grounds of highly questionable assessment benchmarks, they aren’t strengthening schools and building a better America. By painting a false picture of student achievement, they compromise public confidence in the nation’s schools, in the process undermining education and weakening the United States” (Educational Leadership, February 2018). This sentiment is echoed by Gary Orfield of the Civil Rights Project at UCLA: “Setting absurd standards and then announcing massive failures has undermined public support for public schools….We are dismantling public school systems whose problems are basically the problems of racial and economic polarization, segregation and economic disinvestment.”
Maybe the standards are absurd or maybe they aren’t. My argument is that we don’t know. And until we know, we cannot make informed judgments about the performance of our schools and students. We have no basis for meaningfully interpreting a finding that fewer than half of California’s students meet “standards” without knowing how demanding those standards are.
I understand that the purpose of Cal Facts 2018 is to provide quick, easily digestible facts about a broad range of issues confronting California. But rather than being digestible, this one gave me heartburn. Sometimes presenting a fact without the necessary context can be more harmful than helpful.