Final exam week just wrapped up at our school. Let’s consider an honors-level Geometry multiple choice question:

*Find the area of an equilateral triangle with a side length of 14 inches. Leave in simplest radical form if necessary.* (The whole test is with a calculator. Choices below with her answer circled.)

Here’s the student’s work:

**What the student does correctly:** draws the triangle, splits it in half, uses the Pythagorean Theorem to find the height, and calculates the area of the triangle to the nearest hundredth

**What the student does incorrectly:** claims that 196 – 49 = 47 instead of 147 (maybe she wrote over the 1 with her square root symbol?)
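To see how far one subtraction slip moves the final answer, here is a minimal sketch of both calculations. The variable names are mine, and the closed-form check uses the standard equilateral-triangle area formula (side² · √3 / 4), which I am assuming is the kind of shortcut available here.

```python
import math

side = 14.0           # side length in inches, from the question
half_base = side / 2  # splitting the triangle in half gives a leg of 7

# Correct route: 14^2 - 7^2 = 196 - 49 = 147
height_correct = math.sqrt(side**2 - half_base**2)
area_correct = 0.5 * side * height_correct

# The student's route: same method, but with 196 - 49 written as 47
height_slip = math.sqrt(47)
area_slip = 0.5 * side * height_slip

# Cross-check with the closed form (assumed shortcut): 49 * sqrt(3)
area_formula = math.sqrt(3) / 4 * side**2

print(round(area_correct, 2))  # 84.87
print(round(area_formula, 2))  # 84.87
print(round(area_slip, 2))     # 47.99
```

Every step of the method is the same in both routes; only one digit differs, yet the answers are nowhere near each other.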

To be fair, this question can be done using special right triangles (or this formula). If she *did* do the subtraction right, she would have just checked her answer against the radical answers. Hey – she makes an error – but does she deserve **0%** of the credit on that question? It’s not like you get any extra credit for getting a multiple choice question right, so why do you lose all credit for missing a small part? Giving a grade based on multiple choice questions disproportionately penalizes students who might understand all of the concepts, but make a handful of small mistakes.

It’s a basic but frustrating example. At my high school, our math midterm and final exams are often split pretty evenly between multiple choice and open-ended questions. On multiple choice questions, there’s no partial credit (urgh) and on the open-ended we use a key with points assigned pretty specifically (not the best system either).

There are many reasons multiple choice questions are unfair to students and don’t do a great job of measuring how much they know, despite the College Board’s defense of them. The history of multiple choice is worth looking into and while the format *does* have some merits, it is overused and often doesn’t reflect a student’s true ability.

Regardless of any debate on multiple choice as a question format, I guess I am raising this question: **Is scoring multiple choice as all or nothing fair?**

william.cipolli

There’s a lot of statistical theory behind multiple choice questions, and believe it or not thousands and thousands of dollars go into writing these exams. The area of statistics is called Item Response Theory and the idea behind it is that we’re trying to measure something by a proxy – in this case mathematical knowledge of triangles by looking at answers to multiple choice questions.

Multiple choice questions in massive standardized testing endure much testing to make sure we ask the questions that give us the most information about the student’s knowledge or ability, and there are ways to mitigate the effect of guessing and entry error, believe it or not – but I’m not certain who uses those.
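As a rough illustration of how guessing can be built into the model, here is a sketch of the three-parameter logistic (3PL) item response function, one standard IRT model. It is not necessarily what any particular testing company uses, and the item parameters below are made up for illustration.

```python
import math

def p_correct(theta, a, b, c):
    """3PL model: probability that a student with ability `theta`
    answers an item correctly.

    a: discrimination (how sharply the item separates abilities)
    b: difficulty (ability level at the curve's midpoint)
    c: guessing floor (chance of a correct answer with no knowledge)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: four answer choices, so assume a guessing floor of 0.25
a, b, c = 1.2, 0.0, 0.25

for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_correct(theta, a, b, c), 3))
```

The point of the `c` term is that even a very low-ability student never drops below the guessing floor, so a correct answer from that student counts as weaker evidence than a raw score suggests.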

Even though there’s a lot of solid statistics behind the curtain, we’re still upset when we see there isn’t something magical that makes multiple choice the best choice for evaluating these latent variables. It turns out that even though we spend a lot writing these multiple choice exams they really aren’t good for anyone but policy makers and teachers; it’s a quick and dirty way of “evaluating” students at a lower cost than actually grading the problem.

Something we should all think about is that the AP exams mix free response questions with multiple choice, but there is a massive cost associated with grading the free response – they invite many educators to fly to a central location, be trained on the grading process, and get all the papers done that week.

Even in the free-response case, we aren’t free. Often the complaint with human graders is that there’s too much grader-to-grader variability. Obviously in math these things are more clear cut, but expert human graders have been found to achieve exact agreement on only 53% to 81% of all essays. Certainly, in the human grading case it’s just as difficult to argue that our measurement is reasonable and fair by comparison.

I think with the type of students we’re seeing today, we really might end up with a more tangible system where deep, long projects cap their education and serve as something they can show, like a resume – their motivation and ability to do the work is what will keep them going, some of them anyway.

Here’s a good summary of item response: http://erm.uncg.edu/oaers/methodology-resources/item-response-theory/