I slide headphones over my ears, snap another cassette into the player, and press “Play.” There’s the lo-fi hiss of empty tape, a male announcer: “The practice time for the second melody is over. Release the pause button and begin your performance… now.” Then a girl sings to me softly, “La, la la la…” and I’m putting checks and Xes in a “tick sheet” to keep track of what I hear. She finishes with time to spare and is catching her breath when the announcer breaks in again: “The performance time for the second melody is over.” I count up the checks and fill in the bubble for “7” on the scoring sheet, while my other hand ejects the tape and reaches for another.
A dozen teachers and professors in motley headphones (veterans know to bring their own: the standard-issues are junk) sit in a windowless room and grade one of the “Free Response” questions from the Advanced Placement (AP) exam in Music Theory. It’s a sight-singing question: students have 75 seconds to study a notated melody and then thirty seconds to perform it. It’s not an easy melody. This year, the average score is just under four points out of nine. Eights and nines are oases of euphony in a desert of discord.
In seven days, I listen to two thousand attempts at this melody. Many students stumble into the same mistakes; inspired by a shared musical culture, they’re creatively, consistently wrong. In my hotel bed solitude, these recurring corruptions haunt me—minor becomes major, that leap is too big. In the morning, I’m back in that room with a fresh stack of tapes.
●
The AP occupies five blocks of Cincinnati for the week, taking over the Duke Energy Convention Center (a glass monolith named for the regional electric concern) and three chain hotels. These blocks are in the city’s “revitalized” post-industrial downtown: conspicuously new restaurants and hip bars juxtaposed with trash-strewn lots and deserted side streets. My flight lands late on a Saturday night. Along the moving walkway to the baggage claim, large photos showcase the sights of the region, “Courtesy of the Cincinnati Enquirer,” but the passageway is too long, or the list of sights too short, and some of the photos repeat themselves. The driver of our shuttle van has a smoke-stained gray moustache and an ID lanyard that reads “POW-MIA YOU ARE NOT FORGOTTEN” over and over. The countryside around the airport is black.
Aside from the hundred-odd Music Theory readers, Cincinnati hosts over a thousand readers of Spanish, plus a handful of French, German and Italian, whose conversations lend the AP zone an international flair. Some of these readers may be the only people in their school or town who speak that language. For them, grading week must feel like a homecoming. Much the same is true for us music theorists, since even fellow musicians have limited patience for our arcana.
The sight-singing reading room is run by an eminently competent Question Leader and two long-suffering Table Leaders, one of whom, a high-school teacher from Kansas, is a real AP veteran. One day she wears a red T-shirt with white block lettering: “I AM WOMAN / I AM STRONG / I AM INVINCIBLE / I AM POOPED”—she’s that kind of lady. Before the rest of us arrive, these three listen to a sample of recordings in order to find and fill the gaps between our question’s prewritten grading rubric and students’ boundlessly inventive mistakes. Then we spend the first day listening together, practicing with the rubric and sharing our hypothetical grades. There are fierce, technical debates about one or two points. Factions form, dissolve, reform around the next student. The goal is simple: get all the readers within a point of the leaders.
We don’t grade on voice quality or timbre. There are dulcet threes and “dirty nines,” the latter featuring lots of the “right note sung badly,” as opposed to the wrong note. We’re exhorted not to “get in their heads,” but in our effort to divide nature’s infinite octave into twelve equal compartments, how can we do otherwise? In a minor miracle, within a few hours, we’re all reaching the same score, more or less, without discussion.
●
This year, about nineteen thousand students took the Music Theory AP. Across all subjects, nearly one in six high-schoolers took at least one AP exam. These numbers reflect the AP program’s half-century of growth—almost fivefold in the past twenty years alone. Conceived in the early Fifties as a tool to compete with the Soviet Union, the AP exam enabled top high school students to gain “advanced placement” into second-year college courses. But AP’s leaders quickly recognized that in order to expand, “advanced placement” wouldn’t be enough: the opportunity to earn college credits from high exam scores would be an essential marketing tool, not to students or teachers, who showed little interest in cheap credits at that time, but to administrators. No longer confined to prep academies, AP courses are now offered at most American high schools.
Since AP’s founding in the Fifties, college has gotten a lot more expensive. The promise of cheap college credits has enabled AP to secure federal and state funding to expand the program, subsidize poor students’ exam fees, and send teachers to “AP Summer Institutes,” where they learn how to teach new AP courses.1 And AP’s respective owner and administrator, the College Board (CB) and Educational Testing Service (ETS), are both 501(c)(3) non-profits, so tax relief provides further indirect support. Thanks in part to this government aid, in 2013-2014, CB cleared a profit of $135 million on the AP, exceeding the profits from its flagship SAT and PSAT. Together, CB and ETS earn about $2 billion in revenue per year. If they were a single company, they’d appear near the bottom of the Fortune 1000, alongside Wendy’s or Revlon. To expand the AP’s market to younger students, CB has developed SpringBoard, a prefabricated math and English curriculum for grades 6–12. The SpringBoard package, which includes teachers’ editions, “consumable student editions,” and no shortage of tests and quizzes, has been adopted by nearly half of the hundred largest public school districts, including New York, Chicago and Los Angeles. Best of all, it aligns with the new national Common Core standards. (Striking coincidence: David Coleman, CB’s president since 2012, was a key architect of those standards.)
ETS and College Board are successful because they have delivered standardized, testable college preparation for all. Courses and exams that allow high-schoolers to earn college credit and skip introductory courses? Done. Rigorous pre-college curricula? Done. Double-time expansion, so that No Child is Left Behind, or, as Obama had it, so that Every Student Succeeds? Done. Where attempts at public-sector reform have settled into wearisome stalemate, the College Board and ETS have striven nimbly towards their assigned goal.
●
About a quarter of the recordings—nearly 2,500—are on good ol’ cassette tapes. They arrive from all over the country, collected into white folders of seven tapes apiece, anonymized and numbered alongside precoded grading sheets. We spend slow hours popping tapes in and out of folders and tape players, rewinding and fast-forwarding, and every seven tapes (one folder’s worth), returning the bubble sheet to the folder, pressing the folder on to the “out” stack and grabbing another from the “in.” For each individual tape, these movements are trivial, but the sheer volume of tapes magnifies any physical shortcuts we discover. (For example, no one bothers to close the tape player’s lid.) If you’ve ever stuffed envelopes or done some other repetitive, multi-step physical task, and you’re of a certain anal-retentive, competitive mindset, you’ll know what I mean. These are the situations for which early industrialists devised “time and motion” studies.
Once I’ve hit my stride, I can grade a student in a little under a minute. Performances of five or higher seldom take more than a single hearing. Ones are easy too. Twos, threes, and fours slow me down, make me rewind and rehear the shakiest moments. (I soon learn to leave the “play” button locked down and depress “rewind” but gently, so that the recording resumes on release, a technique I privately call a “hot rewind.”) We almost never give zeros: students can earn a one simply for holding a note, any note, at the end of their recording.
Periodically, Table Leaders re-grade a sample of our recordings and let us know how we did, a process known in the business as “back-reading.” Back-reads are measured in “exacts,” “adjacents,” and “discrepants,” two or more points away from the TL, which necessitate a discussion and “override.” I was back-read a total of 54 times, and I had two “discrepants,” a rate which appears to be typical (anecdotally, at least—ETS doesn’t reveal these data). This doesn’t sound so bad, until you realize that most of my grades were not back-read or checked by anyone, so conservatively, around fifty of my final, uncorrected grades were “discrepant.” Beyond the back-reads, there are additional hidden checks on readers’ pace and reliability. Several upper-level people confirmed the existence of these checks, but wouldn’t reveal particulars. Whatever their precise nature, one thing is certain: we graders are being graded.
As lowly subcontractors, we readers can only determine students’ “raw scores.” ETS translates raw scores to AP scores, one to five, in a supremely confidential process understood in detail only by ETS psychometricians. The two ETS Assessment Specialists assigned to our test are diligently tight-lipped about anything AP-related that’s not publicly available, and cautious even about parroting the official line, lest they slip off message. I imagine they’d hold up well under interrogation. (Midway through the week, when I started asking questions related to this article, I was even called into “Strategic Workforce Solutions,” ETS doublespeak for HR, and reproved by a “Strategic Advisor.”) These opacities make the reading experience a Kafkaesque cocktail of comedic alienation. But unlike Kafka’s castle or court, the process achieves a concrete goal, with a lean competence unlike anything else in the industry.
●
I started on the piano at six. I could tell many just-so stories to explain why I kept playing after most children had quit. My parents encouraged me (or forced me), I had an inspiring teacher (or a strict one), I have a musical family (I don’t, actually), I’m good at math (a personal favorite). Each of these stories would reduce musicality to some single factor. But like any single thing, musicality depends on every other thing, at once. And the latitude I had to pursue music as seriously as I did (all the way to a doctoral program, for Pete’s sake) reflected privilege as much as talent or work ethic. Who else would have known such a vain, beautiful path existed, or felt free to follow it?
Nineteen thousand high schoolers, perhaps, most of them musicians with no particular interest in theory but every interest in another good AP score. A good score can help them get into a good college, which can help them get a good job. So they sight-sing eight measures of F minor (even now, eight months later, the phrase rings in my mind), and we give them a grade from one to nine. They’re musicians because music makes them “well-rounded.” Ten years from now, most of them will have stopped playing whatever instrument they play, exposed in retrospect as a different kind of instrument. But for now, music helps set them apart from millions of other applicants.
And a yawning gulf separates the best from the worst. I’d bet that the sevens, eights, and nines would do well in my college courses, while the ones, twos, and threes would struggle. A semester, or even a two-year curriculum, is never long enough to close this gap: the best students coming in are the best students going out. Would a lifetime be long enough? Maybe. But by the time these students take the AP, for most of them, it’s too late.
Some of them seem to realize this partway through their recordings. A boy with a wet voice hesitates, “Der…”, gamely tries to find his place, then collapses. A girl’s performance falls apart as she utters a defeated “Oh”—then silence. I mutely root for these students and I collapse right along with them. Meanwhile, other students don’t know or don’t care that they’re doing a terrible job, bulls in a musical china shop. These recordings make me angry at their teachers and schools for sending them into battle unarmed.
We might believe that these disparities depend on various inequalities and institutional prejudices. If only we could eliminate these imperfections, everyone would do well. Perhaps in the case of music, where the romantic image of the divinely inspired artist has not faded, we’re comfortable admitting that this is merely a socially useful fiction; but who dares deny that the seeds of success at math, history or other “academic” subjects lie within all of us, ready to sprout if only we could get the soil chemistry right?
●
On some days, right before lunch or closing time, we listen to a few standout recordings as a room and laugh together. A handful of boys (always boys), as a lark or out of fear of failure, do something braggably crazy: they sing a different song (“rickrolls” are not uncommon), recite a poem, and so on. These kids would thrill to learn of our appreciation. And then there are the oddballs: there is a girl whose rhythm is perfect, but who cannot seem to sing any downward intervals—by the end, her voice is in the stratosphere. I feel guilty laughing at these, but I do laugh.
Many such inside-jokes make it into Saturday’s amateurishly charming final-night revue, held in one of the larger rooms. Last year’s show was based on Mary Poppins; this year’s reprises a past theme, “AP Idol,” whose open-ended format works far better. Remember, we’re a bunch of music teachers: musically, the performances are never terrible and sometimes very good, but as comedy, it’s a potluck of ham and canned corn.
To catalyze our mirth, ETS installs a cash bar and bartender and endows us each with a drink ticket (a repurposed meal ticket), and the Chief Reader impishly sneaks a second ticket to those who wish. A small but conspicuous “cool” contingent does a lot of sotto voce complaining (mean-spirited, in my opinion) before, during and after the party. The rest of us are more willing to laugh in support if not hilarity, and we have a good time.
Over the next several years, ETS will roll out “distributed” grading of the music and language recordings. We’ll be just another set of remote workers, staring and clicking in our pajamas. The system delivers recordings to a computer in a relentless stream—no aides necessary. Gone are the stacks of tapes, physical trace of the ears’ and mind’s motionless labors; a counter on the lower left of the screen is their weightless digital surrogate. And gone are the physical inefficiencies whose obsessive discovery and elimination kept me motivated through the week. Instead, the task of these readers will resemble a session on Instagram or Tinder, adapted for the visually impaired: listen, judge, repeat.