Saturday, December 11, 2004
Evaluating evaluations
[Part three in a series on educational reform. See below for parts one and two.]
Update: If you want a good summary of the standardized tests vs. portfolio debate, read this article by Jay Mathews of the Washington Post. Read it, however, in the context of everything I've been saying so far regarding the underlying goals of education and questioning the oft-unquestioned logic of evaluation.
Let's talk about subjectivity, and let's talk about evaluation -- which is a roundabout way of saying, let's talk about testing.
Pragmatically, schools need some system of evaluation. Ideologically, that's not as clear-cut; any system whose external logic includes a measure on which students will be ranked offers an incentive to do only what it takes to rank well rather than to learn because learning makes you a productive, intelligent person. However, we live in a world of supply and demand, and colleges and employers only have so many slots to fill and need some way to discriminate among applicants. Moreover, it would be nice to have an occasional way to check progress in elementary, middle and high schools, if for no other reason than to prove that our proposed paradigm works.
So there's the set-up of the problem: We need some system of evaluation, but it can't be one which detracts from the goal of learning how to think rather than learning things to think.
Option 1: Standardized multiple-choice tests. The bulk of current evaluation is done through this method. Prominent examples include the Virginia SOLs and the SAT (though it's adding an essay section -- more on that later). The supposed advantage of this system is twofold: universality and objectivity. Children in Topeka and San Diego are answering the same questions, and there's only one right answer.
The problem with this, of course, is that standardized multiple-choice tests offer neither universality nor objectivity, and worse yet they encourage teaching rote facts rather than comprehensive understanding. Demographics play a role, much as many people would rather they not; a dilapidated school with 75% of its children on free or reduced lunch and full of refugees or illegal immigrants is not on the same playing field as an affluent counterpart. Results from the two schools are not reflective of the actual learning going on inside.
Then we get to objectivity, and I want to digress for a moment to talk about this in depth. Setting aside for a moment the fact that multiple choice questions rarely have a truly objective answer (If you asked "What is the fastest way to get to the store?" with pictures of a horse, a boat, a car and a person walking, the answer to a kid who came out of a Kenyan refugee camp might be walking -- they don't have roads! Of course, that would be the "wrong" answer), let's talk about whether objectivity is necessarily superior.
Objectivity, of course, stands in contrast to subjectivity. But we often indict subjectivity without thinking about why. Consider the analogy of a judge. Here we have a person who has gone through extensive training, moved up through a highly competitive profession and has a wealth of experience. Each case before our judge has different merits, different circumstances, different people. The entire point of a judge is to be able to do just that -- judge. Yet, over recent years we have seen a huge push for mandatory minimums, a matrix box which pre-determines sentences for certain drug offenses and strips the judge of his discretionary power. So, the 18-year-old kid with no criminal record, headed for Stanford, whose dad just died of a heart attack and who is caught with the tiniest amount of crack HAS to be thrown away for 5 years. The judge might speak out passionately against the sentence at the same time he's delivering it, saying that the kid clearly posed no danger to society and that the penalty was brutally excessive -- but his hands are tied. Discretion, and subjectivity, are not necessarily vices. Especially when you're dealing in a realm where every situation, every schoolchild is intrinsically different, how can we reduce their achievement to a matrix box, to a bubble sheet?
Obviously, teachers right now are not vetted to the extent of a judge, but under our reforms the average quality of the profession would skyrocket (that's for a later post). Assume for a moment that teachers are as qualified as judges, and it seems equally ridiculous to slap mandatory minimums on them; yet that is precisely what standardized multiple-choice tests are. Both are "objective," both are "consistent," and both are "fair." Neither, of course, is any of those things.
So, what are some other options?
Option 2: Portfolios. A portfolio evaluation is exactly what it sounds like; the student compiles all of his or her work from throughout the year and presents it to a panel usually consisting of his or her teacher, one school administrator, and another teacher from the same grade. There is an emphasis in this system on oral presentation and in-depth reports. Tests are sometimes taken completely out of the equation.
The costs and benefits of portfolios should be obvious; they provide a far more comprehensive review of a student's true understanding, but the results are hard to compare across the board and the panel has an incentive to evaluate highly if that's the metric that will be used to review the school. Also, it is much easier for a parent to cry foul if her or his kid doesn't pass muster. So while this is an ideologically superior method to standardized MC tests, it provides a host of logistical challenges which some proponents have found prohibitive.
Option 3: Hybrid (My proposal). If you can't get rid of standardized tests altogether, at least set them up so that the more you learn how to think, the better you will do. This means an emphasis on short answer and essay questions -- no multiple choice. Moreover, they have to be well-designed questions that truly test understanding.
For instance, one of the best essay questions I've ever had in college was in my history of the Middle East class when I was asked, "Explain why the 1906 Persian revolution failed while the 1979 revolution succeeded." Here was a question which you could take in any of a thousand different angles; it was interesting; it was a topic that there is no consensus on whatsoever; it wasn't something I could bulls*** around -- I had to know my stuff. However, a bad question, and one you're more likely to see on essay questions in high school, would have been "What was the cause of the 1979 Iranian revolution?" That doesn't necessarily demand insight or higher-level understanding, just paying attention in class/to the readings.
If you can line up what tests demand and what you should be learning (i.e. methodologies of thought), then you can circumvent the problem of teaching to the test while maintaining the integrity of a universal testing regime which society demands. Oh, and take demographics into account.
Now, the primary problem with my system is the same as with portfolios -- grading of essays is subjective, and teachers might have incentive to not grade accurately. This is where the SAT is encouraging. There is obviously enough of an understanding that MC is not holistic enough, because they're adding an essay section (which will be subjectively graded!) to the one test that nearly every college-bound high school student in America takes. That hurdle might not be so insurmountable as originally thought.
This post is incredibly long, but I want to make one last point about evaluation: My junior year H.S. physics teacher used a 4-point scale for grading instead of the standard 100-point scale. Why? He explained simply, "I don't know the difference between an 87 problem set and an 88 problem set. I do know the difference between a 3.5 and a 3." One way to dampen the subjectivity "flaw" is to establish broad categories of evaluation that guide -- but don't constrain -- the teacher's opinions.
Evaluation is a fundamental part of modern American education, and it can't be ignored in the midst of reforms because it influences so heavily the direction of classes. It can't be taken away altogether, so it has to be made to work for us.
Find me a bubble sheet that can express all that.
Comments:
Okay, but how would you test math or science? I suppose you could do this with short essays, but it would be a bizarre and cumbersome way to test knowledge of these fields.
Your short-essay proposal would work fine for a writing exam, which is why it is already used for writing exams. It would not work well for history, because it would not be able to test the breadth of a student's knowledge, unless the essay questions were ridiculously open-ended. The AP history exam does include essays, along with multiple-choice, but the SAT II history does not, and I think if you compared the same students' scores on the two, they would be very similar. Probably the only students who would get markedly different scores on the two tests, of course, are those who do not write well-- because they are dyslexic or not native English speakers.
So your "hybrid" proposal would have no impact except to hurt poor and unfortunate people who are dyslexic or have learning disabilities or are not native English speakers. Some liberal you are!
Mr./Mrs. Anonymous,
You're right, and I should have clarified for math and the sciences. Essays aren't always preferable in those subjects, but word problems certainly are. Instead of testing rote memorization of formulas, test actual comprehension by throwing a word problem which requires some extrapolation and high-level thinking to answer well.
As to your point about history, I couldn't disagree more. "Ridiculously open-ended questions" ARE the best ways to see if someone actually understands the subject material and has grasped the broader implications as it relates to methodology. As a history major, I can tell you from first-hand experience that the close-hold essay questions which require simply having attended class or reading the material don't represent anything except your work ethic -- or how much grades matter to you as an incentive. Short or long (most exams include both), you can develop questions which test true comprehension.
Lastly, I'm not really sure your comparison of scores on the SAT II History and AP History is a legitimate one. I have no data yet -- and neither do you -- but intuitively, having gone through primary and secondary education, I know that the emphasis on facts (places, names, events) is stronger than the emphasis on broad understanding. But I also know that the AP History "DBQ" -- the big, long essay -- is a joke. I think mine was to write about the causes of the Cold War or somesuch. If essay questions don't demand true comprehension, then certainly the same people who get away on MC with knowing a lot of facts can get away on bad essay questions by the same token.
-Elliot
[P.S. As to the concern of those who are bad writers, the answer is first of all to take demographics and extenuating circumstances (dyslexia, e.g.) into account. Secondly, a flowing eloquence shouldn't ever be prioritized over deeply understanding the material -- below-average writers can still convey comprehension, though hopefully our educational approach would breed more solid writers.]
As to essay questions on math and science--
There are actually disciplines-- philosophy of science, symbolic logic-- which seek to study the thought processes behind mathematical and scientific claims. These are very interesting fields, and the teaching of these disciplines in primary and secondary school would seem to be consistent with the argument you have put forth. However, at some point we actually do want students to have certain specific skills and to know certain "facts," and not simply the understanding of the thought processes involved in getting such skills. So we teach math and science. Now, essay questions would test knowledge of phil. of science just fine, but the fact that we actually require the attainment of basic math and science skills instead of the thought processes behind these skills is evidence that your solution is not feasible.
Would I prefer to have taken philosophy of science in high school instead of chem? Absolutely, but what about the many students who are actually interested in math or science, and will use high school courses in these as the basis for future study?
You are right, there is a sense in which standardized tests are not objective. But this is because the entire educational system involves subjective choices about what students should learn, not any specific fault of the test.
My point about "ridiculously open-ended questions" was that to test knowledge of history with any breadth these questions would have to be so broad as to be meaningless, because a subject like "American history" or "world history" is so broad that writing questions with any specific content, even with choices of essay, would exclude vast amounts of historical knowledge that could be tested by multiple choice questions. In college history classes, this criticism is not applicable, because the courses themselves deal with limited subject areas and are often thesis-driven.
Also, what's wrong with the DBQ? It seems to fit very well with your vision of an educational system that requires critical thinking. The question of what caused the Cold War is one that you could approach from a thousand different angles (Stalin's perspective, Truman's perspective, an IR perspective, a historical perspective) and one which there is no consensus on (indeed, it is one of the most hotly debated in terms of causation-- was it a result of realist power politics or actual ideological conflict?) Additionally, if the DBQ asked for broad historical comparisons over different time periods, it would be much more problematic. From a historiographical perspective, the question of whether these types of broad historical comparisons are of any value is a contested one, with many claiming that they are meaningless.
Finally, writing is difficult for people with learning disabilities and non-native English speakers on such a broad level that it would be reflected in every aspect of the writing process (organization, etc.), not just stylistically.
I think you're misunderstanding what we're proposing. We don't think that facts should be taken out of the equation altogether -- indeed, without understanding simple addition and subtraction, the rest of math, applicable, methodological or otherwise, is impossible. However, once you have basic arithmetic skills (and understand WHY they're useful, even if in the simple context of making change) and begin to move onto algebra, then yes, word problems are far superior to multiple choice tests of "plug-and-chug".
Science works similarly. We're not asking people to learn the philosophy behind Chemistry, but we might like them to know more about the concept of breaking things down into constituent parts rather than being able to say what the periodic symbol for Krypton is.
Now, again, don't put words in my mouth: High-level, specialized classes should be available for those who want them. The person who is going to be a chemist probably NEEDS to know the properties of Krypton, and its periodic symbol. The soon-to-be engineer or mathematician needs advanced calculus. But those should be electives, not standard curriculum.
The beauty of our system is that it provides the tools for students to get the details themselves! Of course when you teach history you need to address specifics in order to teach the methodology and paradigms. But instead of giving a play-by-play of the American Revolution, sketch out the general events and background and let the students discover the specifics for themselves, largely self-guided, going on what interests them. Maybe someone wants to research the role of women in the Revolution; maybe someone else wants to look into exactly how France played into the equation. Either way, those students are going to have to reckon with the entire Revolution, but at least they can do it on their own terms, learning not only facts for an informed worldview (hard to understand current events without knowing about the American Revolution) but also those skills which can be applied on a broader scale (no job in the workplace will require specific knowledge of the battle of Yorktown, for example.)
We're not talking about requiring that students understand the totality of knowledge in the abstract. We're suggesting that perhaps a system which promotes methodologies of thinking over rote details and is largely driven by the students' own interests, paired with a regime of evaluation which rewards that kind of holistic comprehension, is superior to the current system of "Learn these facts, fill in these facts on a test, and that determines how much you've learned."
-Elliot