Thursday, March 8, 2007

Anatomy and Evolution of a Difficult Math Problem

Note: The problem below was chosen for the March 23rd Carnival of Mathematics. If you'd like to see a different kind of mathematical challenge, I recommend you also visit the posting for 3-30-07 (Challenging Geometry: Circles Inscribed in Quadrilaterals, Right Triangles).

Ever wonder how much thought goes into developing a challenging math question for a standardized test like the SATs? The goal on most standardized tests of reasoning is to have a small percentage of questions (usually at the end of a section) that are more discriminating. A question accomplishes this goal if fewer than, say, one-third of the test-takers answer the item correctly (testing and measurement experts may have a different percent in mind here). The question is first field-tested, and if the p-value (% who get it right) is too low, the question may be rejected.

Let's look at how such a challenge problem may evolve.

First Version:
322 has two identical digits. What is the next larger 3-digit number of this type?

Comments: Way too easy, since the answer is simply the next integer, 323. Even students who are not thinking will probably not choose 333, which has 3 identical digits (although 333 could be 'of this type', since the wording was vague). Some may choose 344, however. Does this question have promise? Can we improve the wording and ratchet it up?

Second version:
6,333 has exactly 3 identical digits. What is the next larger such integer?

Comments: Better? We added 'exactly' to make it clear that it couldn't have more than 3 identical digits, an important distinction. I dropped the 'number of digits' info since it wasn't needed. Will many students choose 6,444 rather than 6,366? Probably, so should we stop here and field test it?

Third version:
81,111 has exactly four identical digits. What is the next larger such integer?

Comments: This should be slightly more discriminating than the previous version because the string of ones is more seductive and will probably lure many students into answering 82,222 rather than the correct answer of 81,888 (which has four 8's). The stronger student will be suspicious here, particularly if this question is placed near the end of a section. Further, this question cannot appear on the SAT, since one cannot grid in a 5-digit answer (4 digits is the maximum). So we need one more version...

Fourth and final version:
96,666 has exactly four identical digits. If N represents the next larger such integer, what is the value of N - 96,666?

Comments: OK, now the answer 'fits' in the grid. Experienced item writers and test constructors know that a majority of test takers will grid in 1111 as the answer (taking N to be 97,777), particularly if this is near the end of a section and time is a factor. The student who has stronger reasoning AND can think under pressure will probably realize that N could be 96,999, so the correct answer is 333. Sorry to give it away, but today's post is about constructing a more difficult problem rather than challenging you! BTW, a question very similar to this has appeared on the SATs and in some SAT prep books.
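If you'd rather not take my word for the answers above, here is a minimal brute-force sketch in Python that checks all four versions. The function names are my own invention for illustration, and I am assuming that 'exactly k identical digits' means the most frequently occurring digit appears exactly k times. For the numbers used in this post, that reading agrees with the intended one.

```python
from collections import Counter


def has_exactly_k_identical(n: int, k: int) -> bool:
    """Assumed reading: the most frequent digit of n occurs exactly k times."""
    return max(Counter(str(n)).values()) == k


def next_such_integer(n: int, k: int) -> int:
    """Search upward from n + 1 for the next integer with exactly k identical digits."""
    candidate = n + 1
    while not has_exactly_k_identical(candidate, k):
        candidate += 1
    return candidate


# (starting number, k) for the four versions of the problem
versions = [(322, 2), (6333, 3), (81111, 4), (96666, 4)]
for start, k in versions:
    print(start, "->", next_such_integer(start, k))
# 322 -> 323
# 6333 -> 6366
# 81111 -> 81888
# 96666 -> 96999

# Final version: the gridded answer is N - 96,666
print(next_such_integer(96666, 4) - 96666)  # 333
# The tempting wrong answer comes from jumping to 97,777
print(97777 - 96666)  # 1111
```

Counting one at a time is slow for a person but trivial for a computer, which is exactly why the question rewards reasoning rather than computation.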

So what's the point of all this, and how do I or any item writer know what makes a question harder? EXPERIENCE! (I almost yelled 'TRADITION' since I watched my 13-year-old perform in 'Fiddler' last week!) I've given this question or a similar version to SAT prep groups for some time now. In fact, I did it this past Saturday. The results? First group, out of 23 students, 3 answered it correctly. Second group, out of 11, ONE answered it correctly. Since the percentages were considerably less than 20%, would this question ever make it out of field testing? Well, I believe it did! From my own experience, fewer than 15% of students answer it correctly, and my sample consists of above-average students, in fact, fairly strong students.
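For the record, here is the arithmetic behind those percentages as a short Python sketch. The counts come straight from the two groups described above. The variable and function names are just for illustration, and pooling the two groups into one combined figure is my own extra step, not a formal field-test statistic.

```python
# (number correct, group size) for the two SAT prep groups
groups = {"first group": (3, 23), "second group": (1, 11)}


def percent_correct(correct: int, total: int) -> float:
    """Percent answering correctly (the 'p-value' of the item), to one decimal place."""
    return round(100 * correct / total, 1)


for name, (correct, total) in groups.items():
    print(f"{name}: {percent_correct(correct, total)}%")
# first group: 13.0%
# second group: 9.1%

# Pooled result across both groups (an illustrative extra, see the note above)
total_correct = sum(c for c, _ in groups.values())
total_students = sum(t for _, t in groups.values())
print(f"combined: {percent_correct(total_correct, total_students)}%")
# combined: 11.8%
```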

How many of you are thinking that this question is unnecessarily 'tricky' and really has no place on a math test? What important math skill or concept is it assessing? Well, some questions are assessing reasoning ability and this is one of them. Does this question fairly discriminate? If the highest-scoring students got it wrong, then it DOES NOT discriminate! However, that's not what happened. 'K' was the only student who answered it correctly in my second group and I predicted she would. She has exceptional reasoning ability. She gave me a look of "This wasn't that hard, Mr. Marain, why are you complimenting me so much!" For her it wasn't that hard and that's the point. Is there a place for these kinds of questions on assessments when most students have little exposure to them? Well, the test is not intended to be just a reflection of homework problems from a textbook. This is precisely why many educators and leaders have challenged the SATs over the past few years and why the test was changed a couple of years ago. But there will always be a couple of these...

I'm sure many of you have strong opinions about the educational validity of this question, particularly for a standardized test. If nothing else, try it out in your classes (or give it to your spouse, child, colleague or co-worker) and report the results. If over 50% of the group answer it correctly (give them 30 seconds), I would guess you have one special group there! So, folks, does this much thinking really go into writing one little old test question?