Thursday, September 4, 2008

Achieve/ADP Algebra 2 End of Course Exam Report/Findings and MathNotations Commentary - Part I

Addendum: This commentary will shortly be followed by Part II which will focus on some of the following issues:
(a) Why does Achieve stress that the content is Advanced Algebra when it appears to be primarily standard Algebra II?
(b) Significant discrepancy in student performance between multiple choice vs. student-constructed and open-ended questions; implications for other standardized tests (do students do better or worse on student-constructed questions on SATs?)
(c) Do the results on this test suggest that Algebra 1 should have been the first such "standardized" test? In other words, is the real issue here weaknesses in students' Algebra 1 background?

Note: Any facts or figures cited below come from the recently released report from Achieve. You will find a link to the full report below. For further background on the exam and links to released questions, link to my post from April 15, 2008.

If your school district participated this past May or June in the first administration of the Algebra 2 End of Course Exam developed by Pearson for the American Diploma Project, you already know the results have been published. Nearly 90,000 students from 12 of the 14 states in the ADP partnership participated.

This post will provide an overview of the full report and some commentary. For general information regarding the exam, look here. Click on the next-to-last link in the right sidebar - it will take you to a new page with an overview of the Annual Report for this exam. The first link there will give you the full pdf report. If you're already familiar with the exam, go directly to that new page. Also, for an excellent overview and objective commentary, Achieve obtained permission to link to a recent Education Week article (3rd link down on the report page). You must adhere to the restrictions on reproducing this article, but it's well worth reading.

When the Calculus Reform group wanted to impact curriculum and instruction in high school (and undergraduate) calculus, how did they do it? They knew the key was to change the AP Calculus Exam: the format, the content, the emphasis (less mechanics, more conceptual, more data-based/modeling open-ended questions, more use of graphing calculator technology).

If NCTM's reforms have not fully been felt K-12 (particularly 7-12), perhaps it's because there is no standardized assessment out there that truly reflects these reforms. It's true that some standardized tests now reflect more problem-solving, data analysis and conceptual understanding, but there's no single powerful test for grades 6-7-8 that will drive change in the classroom. Each individual state has its own independently developed and scored assessment for each grade level now, but the content, difficulty and quality of these tests vary widely. This is why I felt the benefits from the Achieve program far outweighed the potential risks.

Predictably, each time there is a significant change in the AP Exams or the SATs, scores initially drop. This is both expected and desirable, since the appropriate response is to determine what needs to change in content and instruction. All of the reports and recommendations from the most esteemed mathematics groups and panels have had little effect compared to the more immediate response that follows a drop in scores on some standardized test.

I read the report thoroughly. Passing scores or cutoffs had not been determined at this point. Average raw scores and percentages were reported for each grade level. It is very hard to draw informed conclusions without an analysis of the questions themselves, since the level of difficulty, content and format of these questions are critical factors in performance. Further, scores on a first administration of any standardized test are expected to be lower.

I have not received permission from Achieve to reproduce excerpts so I will summarize major findings.

First I will provide some additional background on the format of the exam itself that you will need to make sense of the results below:

Three types of questions: Multiple-Choice, Short Answer and Extended Response.
A total of 76 raw score points, broken down as follows:

Multiple Choice: 46 questions - 1 pt. ea.
Short Answer: 7 questions - 2 pts. ea.
Extended Response: 4 questions - 4 pts. ea.
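As a quick arithmetic check (my own sketch, not part of Achieve's report), the breakdown above does sum to the stated 76-point maximum:

```python
# Sanity check of the raw-score breakdown (my own arithmetic, not Achieve's).
sections = {
    "Multiple Choice": (46, 1),   # 46 questions, 1 point each
    "Short Answer": (7, 2),       # 7 questions, 2 points each
    "Extended Response": (4, 4),  # 4 questions, 4 points each
}
total = sum(count * points for count, points in sections.values())
print(total)  # 46 + 14 + 16 = 76
```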

Further, the questions are broken into 3 cognitive levels, with the majority of questions at Level 2, which "requires students to make some decisions as to how to approach the problem or activity."

There was a calculator and a non-calculator part.

For more info regarding the actual topics tested, refer to my link in the first paragraph of this post.

Based on a max of 76 raw score points, the average number of points scored ranged from a high of 39 points (about 50%) for 8th graders to a low of 16 points (about 20%) for 12th graders with a fairly steady decline from 8th through 12th.
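The approximate percentages quoted above follow directly from the 76-point maximum; here is a small sketch of the conversion (my arithmetic, not figures computed in the report):

```python
# Convert the reported average raw scores to percentages of the 76-point max.
MAX_POINTS = 76
averages = {"grade 8": 39, "grade 12": 16}  # reported average raw scores
for grade, raw in averages.items():
    print(f"{grade}: {raw}/{MAX_POINTS} = {raw / MAX_POINTS:.0%}")
# grade 8: 39/76 = 51%   (the report's "about 50%")
# grade 12: 16/76 = 21%  (the report's "about 20%")
```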

MathNotations Commentary:
The decrease from 8th to 12th is easy to explain: the more capable students take the course earlier in accelerated classes. The 8th grade population was, of course, a very small sample, but you get the idea. More significant is the average of 24% correct for grade 11, the most common grade in which students take this course (in fact, the number of juniors nearly equaled that of all the other grades combined). I'm not surprised by this low percentage for several reasons:
(a) First administration of the test
(b) We already knew there was an issue here or there would have been no impetus for developing uniform standards and a standardized assessment. Are these results so dramatically different from the TIMSS findings? I don't think so.

However, there is no cause for alarm. The appropriate response is to provide the data to the states and local districts so that deficiencies can be addressed. I'm not at all concerned about the "Now they'll start teaching to the test" critiques; those arguments were leveled at AP teachers as well. Good assessments drive change in content and instruction, and excellent tests can enhance learning -- that's all I ever care about. If this Algebra 2 exam leads to more consistent, higher-quality curriculum and instruction, then everyone should be elated.

Unfortunately, each side in the Math Wars will spin the results to make a case for its position. Similarly, Achieve and the individual states (governors, state education departments) will put their spin on it as well. It's up to the reader to become as informed as possible and draw her/his own conclusions. Overall, I'm not surprised by the initial outcome.

To be continued...