No. S-4985.
November 13, 1992. Rehearing Denied December 8, 1992.
Appeal from the Board of Governors of Bar Association.
Frank J. Bettine, pro se.
Stephen J. Van Goor, Anchorage, for Alaska Bar Ass'n.
Before RABINOWITZ, C.J., and BURKE, MATTHEWS, COMPTON and MOORE, JJ.
OPINION
Frank Bettine appeals the decision of the Board of Governors of the Alaska Bar Association (ABA) to deny his request for a reread and regrade of his bar examination. He argues that the ABA used mathematically incorrect grading procedures in grading his essay exams and employed a reread policy which is contrary to our decision in Application of Obermeyer, 717 P.2d 382, 388 (Alaska 1986). He also challenges the Board's grading and reread policies on equal protection and due process grounds. We affirm the Board's decision.
I. FACTS AND PROCEEDINGS
Mr. Bettine took the bar exam in the summer of 1991. In late October, he was notified that he had failed. He scored a total of 138.5 points on the examination, 1.5 points below the required passing score of 140 points, and .5 point below the score required for a "reread" of his exam. He received a converted essay score of 141.0 on his ten essay answers and a multistate bar examination (MBE) score of 136, which when averaged produced a combined score of 138.5.
II. DESCRIPTION OF THE ALASKA BAR EXAM
In order to understand Bettine's allegations, familiarity with some of the procedures used in administering and grading the exam is required. The exam is a 2 1/2 day exam. One full day is devoted to the nationally administered MBE. The rest of the exam consists of essay questions. Three hour-long essays are weighted at 30% of the total essay score, six half-hour questions account for 45%, and a research/analysis practicum accounts for the remaining 25%. The total essay score is given 50% weight, and the MBE receives the other 50%.
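The essay weighting described above can be sketched in a few lines. This is only an illustration: the opinion gives the three part weights, but it does not say how the questions within a part are combined, so the per-part means and the example values below are assumptions, as are the names `ESSAY_WEIGHTS` and `weighted_essay_raw`.

```python
# Part weights stated in the opinion (fractions of the total essay score).
ESSAY_WEIGHTS = {"hour_long": 0.30, "half_hour": 0.45, "practicum": 0.25}


def weighted_essay_raw(part_means: dict[str, float]) -> float:
    """Weighted raw essay score on the 1-5 scale.

    Assumption: each part is summarized by the mean of its question scores.
    """
    return sum(ESSAY_WEIGHTS[part] * part_means[part] for part in ESSAY_WEIGHTS)


# Hypothetical per-part mean scores, for illustration only.
example = {"hour_long": 3.0, "half_hour": 3.5, "practicum": 4.0}
print(round(weighted_essay_raw(example), 3))
```

The conversion of this raw figure to the MBE scale is not described in the opinion and is not attempted here.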
The grading procedures for the essay portion of the exam are governed by Law Examiners Committee regulations. Pursuant to these regulations, the graders first meet to calibrate each particular essay question. At least five graders read and individually score five randomly selected answers on a scale of one to five. They repeat this process several times, with each grader reading at least twenty exams, until the graders agree on a set of answers that are representative of each of the five possible levels. These "benchmark" answers are then used as guides in assigning scores to the remaining papers. Two graders score each essay. The graders are required to agree to within one point on the score assigned to each question, meaning that the graders must regrade questions until they agree within one point. The two scores are then averaged to obtain the applicant's score for the question.
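The two-grader rule described above can be expressed as a short sketch. The function name `question_score` and the use of an exception to signal a required regrade are illustrative assumptions, not part of the Law Examiners Committee regulations.

```python
def question_score(grader_a: int, grader_b: int) -> float:
    """Average two graders' scores for one essay question.

    Each grader assigns an integer from 1 to 5. If the two scores differ by
    more than one point, the question must be regraded, which this sketch
    signals by raising an error.
    """
    for score in (grader_a, grader_b):
        if score not in range(1, 6):
            raise ValueError("scores must be integers from 1 to 5")
    if abs(grader_a - grader_b) > 1:
        raise ValueError("graders must regrade until within one point")
    return (grader_a + grader_b) / 2


print(question_score(3, 4))  # a "split" score: prints 3.5
print(question_score(4, 4))  # graders agree: prints 4.0
```

Averaging is the only way a half-point value such as 3.5 can arise, which is the feature of the procedure that Bettine's argument targets.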
Each applicant's essay scores are then weighted, combined and converted to the same unit of measurement as the MBE score, so that the essay and MBE scores can be combined for the purpose of making pass/fail decisions. An applicant whose combined score falls between 139.00 and 139.99 will have those essays which received a "split" score reread by the two graders. The graders may, but do not have to, change the score upon reread. If upon reread an applicant's combined score is raised to a 140.00 or above, the applicant passes the exam.
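As a rough illustration of the pass, reread, and fail thresholds, the sketch below applies them to the scores reported for Bettine. The function names are hypothetical, and treating the reread band as every score from 139.00 up to but not including 140.00 is an assumption about how "between 139.00 and 139.99" is applied.

```python
def combined_score(essay_converted: float, mbe: float) -> float:
    """Equal-weight (50/50) average of the converted essay score and the MBE score."""
    return 0.5 * essay_converted + 0.5 * mbe


def outcome(score: float) -> str:
    """Classify a combined score under the thresholds described in the opinion."""
    if score >= 140.0:
        return "pass"
    if 139.0 <= score < 140.0:  # assumed reading of "139.00 to 139.99"
        return "reread"
    return "fail"


bettine = combined_score(141.0, 136.0)
print(bettine, outcome(bettine))  # prints: 138.5 fail
```

On these figures Bettine falls 0.5 point short of the reread band and 1.5 points short of passing, matching the numbers given in the statement of facts.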
Failing applicants are notified in accordance with Alaska Bar Rule 4(4) and are offered the opportunity to inspect their essay examination booklets and grades, as well as a representative sampling of answers from other applicants.
III. STANDARD OF REVIEW
Alaska Statute 08.08.130 and Rule 1 of the Alaska Bar Rules require, as one of the preconditions to admission to the Alaska Bar, that an attorney applicant take and pass a bar examination given by the Alaska Bar Association. Under section 3 of Rule 1, the Board of Governors is responsible for administering the bar exam. Alaska Bar Rule 6 provides a review procedure which must be followed by applicants who wish to challenge their failing grade. Pursuant to Rule 6, an applicant who alleges facts which would establish "an abuse of discretion or improper conduct on the part of the Board" is entitled to a hearing. Rule 8 allows for supreme court review of any interlocutory order of the Board of Governors. In Application of Peterson, 459 P.2d 703 (Alaska 1969), this court wrote that an unsuccessful applicant "carries a heavy burden in regard to any attempted showing of abuse of discretion or improper conduct on the part of the law examiners in the grading of examination papers." Id. at 711.
IV. ISSUES ON APPEAL
There are two main issues on appeal. The first question is whether the ABA used a mathematically incorrect grading procedure in grading Bettine's exam. The second question is whether the ABA's reread policy is contrary to law.
A. Grading Procedures.
Bettine argues that the ABA's grading procedure is "fraught with arithmetic errors which will in many instances yield incorrect test scores." The crux of his argument is that it is error for the ABA to assign him scores with two significant figures (1.0, 1.5, 2.0, etc.), when the graders are only allowed to assign scores with one significant figure (1, 2, 3, etc.). He reasons:
If respondent desires to average the applicant's essay raw score to within two significant digits of accuracy then the scores assigned by the examiners must reflect at least two significant digits of accuracy. Rounding off essay raw scores to a single significant digit of accuracy and then using these scores to calculate an applicant's average score to 0.5 points of accuracy is an incorrect and unacceptable mathematical procedure. By rounding off the raw scores during intermediate stages of computations to single digit values, respondent is in fact robbing Mr. Bettine out of a possible 0.5 points per essay question. If respondent desires to employ a grading policy which requires examiners to grade with a single significant digit of accuracy, then respondent at a minimum must implement a grading policy which assigns the higher of the two scores as the score for any question where the examiners disagree by one point. This policy would help to minimize the affect [sic] of round off error on an applicant's overall essay raw score, because an applicant would receive the benefit of the higher score for those questions where examiners are now forced to round off during intermediate stages of computations.
It is also important to recognize that although the examiners assigns [sic] scores at the essay raw score level accurate to but a single significant digit, the weighted essay scores for Part A, B and C of the essay exam in addition to the applicant's combined score, are calculated to five significant digits of accuracy! It is not possible to start a series of calculations containing a number accurate to but a single significant digit and arrive at an answer with five significant digits of accuracy.
Bettine's Exhibit 1 illustrates that a variation of 0.5 from the raw essay scores can have a great effect on the converted essay score. To eliminate the potentially harmful effects of the ABA's faulty grading procedure on him, he argues, his score should either be rounded off to two significant digits, or 0.5 should be added to Parts A, B and C of his essay raw scores. In either case, he argues, he would achieve a passing score of 140 points.
Bettine also argues that, in addition to the "round off" errors described above, "observation" errors result when an applicant's answer does not precisely coincide with a benchmark answer, since examiners must adjust the true score of an essay to one of the integer set values, adding or discarding fractional points in the process. He writes that "[t]his adjustment will be made either unintentionally by the examiner because of observation error or purposely because the examiner will consciously round off applicant's true score to one of the allowable integer values." Bettine apparently believes that the harmful effects of this "round off" error would be lessened if examiners were not restricted to assigning only one of five test score values.
Bettine posits that the effect of the round off and observation errors is cumulative. He therefore claims that, while the grading error associated with a single question may not be significant, one must assume that all ten of his essay scores were reduced by 0.5 point of grading error in order to determine the maximum effect on his combined score of the cumulative grading errors. His Exhibit 2 shows that 5 points (0.5 points per question times 10 questions) at the raw score level are worth approximately 5.5 points at the combined essay score level. Exhibit 2 also shows that Bettine needs to accumulate only one additional point at the essay raw score level to achieve a combined passing score of 140 points.
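Bettine's worst-case reasoning at the raw score level can be illustrated as follows. The grader pairs are hypothetical, the comparison ignores the part weights, and the conversion from raw points to combined points comes from his Exhibit 2, which is not reproduced in the opinion, so it is not modeled here.

```python
def averaged(a: int, b: int) -> float:
    """Current procedure: the mean of the two graders' scores."""
    return (a + b) / 2


def higher_of_two(a: int, b: int) -> float:
    """Bettine's proposed alternative: take the higher score when graders split."""
    return float(max(a, b))


# Hypothetical worst case: the graders split by one point on all ten questions.
pairs = [(3, 4)] * 10
raw_gap = sum(higher_of_two(a, b) - averaged(a, b) for a, b in pairs)
print(raw_gap)  # prints 5.0, i.e. 0.5 point on each of ten questions
```

The gap is zero on any question where the graders agree, so 5 raw points is an upper bound rather than an estimate of what Bettine actually lost.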
In response to Bettine's arguments, the ABA relies primarily on the affidavit of Dr. Stephen Klein, a nationally recognized expert on bar examinations. Dr. Klein does not challenge the mathematical foundations of Bettine's arguments concerning measurement error and significant figures. He instead argues that the principles of numerical analysis, upon which Bettine relies in making his criticisms of the grading procedures, are inapplicable to psychological measurements such as the grading of exams. He testified:
6. Bettine's arguments for changing the score intervals stems from his failure to recognize critical differences between physical and psychological measurement. Physical measurement is used to assess the height, weight, temperature, or other properties of physical things. These measurements are made with rulers, scales, thermometers, Geiger counters, and other mechanical devices. Such measurements can be made with more precision by using an instrument that is marked off in finer gradations, such as by replacing a ruler with a micrometer.
7. In contrast, human judgment is needed to assess the relative quality of a figure skater's performance or a candidate's answer to a bar exam question. Using more score levels (such as by having readers assign grades in half-point intervals) does not increase precision unless readers can reliably distinguish between the quality of answers in adjacent levels. Readers should use only as many score levels as are truly needed to distinguish between answers of different quality.
Klein notes that most states use about five score levels to grade bar exam essay answers, since "[t]hey have found this is about the right number to reflect real differences in answer quality, but not so many as to force readers to make distinctions where there are no reliably identifiable differences."
After reviewing the arguments of Bettine and the ABA, we have concluded that Bettine has failed to make factual allegations sufficient to establish an abuse of discretion or improper conduct on the part of the ABA. Although Bettine's criticisms of the mathematical foundations of the grading procedure are convincing from a purely mathematical standpoint, they are undercut by Dr. Klein's statements that it is inappropriate to strictly apply numerical analysis to psychometric measurements. In view of Dr. Klein's affidavit, it cannot be said that Bettine has satisfied the "heavy burden" of showing an abuse of discretion or improper conduct on the part of the law examiners. Similarly, Bettine's presentation is insufficient to demonstrate that the ABA's grading procedures violated his equal protection or due process rights.
B. Reread Policy.
Bettine's second argument is that the reread policy employed by the ABA is arbitrary and capricious, and that it is contrary to this court's decision in Application of Obermeyer, 717 P.2d 382 (Alaska 1986). In Obermeyer, this court was faced with a challenge to the ABA's policy of rereading the essays of only those exams which received a score between 139.00 and 139.99. Obermeyer maintained that this cut-off is arbitrary because "the statistical variance between 139.00 and 139.99 is only 0.7%, which is considerably smaller than the margin of error permitted by averaging scores assigned by different graders when they differ by only one point." Obermeyer, 717 P.2d at 388. We rejected Obermeyer's claim, stating that the "appropriate comparison would be of the variance created by averaging as a function of the total exam score to the variance allowed by the reread policy." Id. We recognized, however, that there may be "facial validity in [Obermeyer's] argument." Id. We stated that "[i]t certainly seems that the Bar should be willing to allow a reread for at least as large a variance as the margin of error its examiners are allowed in averaging scores." Id. Bettine argues that the ABA has failed to comply with this "order," resulting in a situation where the maximum cumulative error is 5.6 times the variance at the combined score level for which the respondent will presently grant a reread. He claims that the ABA should expand its variance for reread from 1 point to 5.5 points, which represents a variance of 3.9%.
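The percentages quoted in this passage follow from dividing a point range by the passing score of 140. The sketch below reproduces them. The opinion does not explain how the 5.6 multiple was derived; dividing 5.5 points by the 0.99-point width of the 139.00 to 139.99 band is one reading that yields it, and it is offered only as an assumption.

```python
PASSING_SCORE = 140.0


def percent_of_passing(points: float) -> float:
    """Express a point range as a percentage of the passing score."""
    return 100.0 * points / PASSING_SCORE


current_band = percent_of_passing(1.0)   # existing one-point reread range
proposed_band = percent_of_passing(5.5)  # range Bettine asks for
print(round(current_band, 1), round(proposed_band, 1))  # prints: 0.7 3.9

# Assumed reading of the "5.6 times" figure: 5.5 points over a 0.99-point band.
print(round(5.5 / 0.99, 1))  # prints: 5.6
```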
In his reply brief, Bettine claims that the expanded reread range which he recommends is consistent with reread policies utilized by both Oregon and Washington. Washington grants a reread for unsuccessful applicants whose score falls within a range of approximately 2.5% to 3.0% of passing, while Oregon grants a reread to the top 30% of applicants who fail the examination.
We do not believe that the ABA abused its discretion in deciding to reread only those exams which come within one point of passing. It is our opinion that the ABA could eliminate its reread policy without abusing its discretion in administering the bar exam. To the extent that our statements in Obermeyer indicate otherwise, we disavow those statements.
AFFIRMED.
In State v. Erickson, 574 P.2d 1 (Alaska 1978), we adopted a single flexible test to review legal challenges to governmental action based upon the Alaska Constitution's equal protection clause. This test, which has come to be known as the sliding scale test, provides for varying levels of scrutiny depending on the importance of the right involved.
This clause provides "that all persons are equal and entitled to equal rights, opportunities, and protection under the law[.]" Alaska Const. art. I, § 1.
In applying the Alaska Constitution . . . there is no reason why we cannot use a single test. Such a test will be flexible and dependent upon the importance of the rights involved. Based on the nature of the right, a greater or lesser burden will be placed on the state to show that the classification has a fair and substantial relation to a legitimate governmental objective.
Id. at 11-12.
Our adoption of this single test reflected our discontent with the United States Supreme Court's rigid two-tier equal protection analysis.
The two-tier analysis requires strict scrutiny of governmental action under the "compelling state interest" standard in cases involving fundamental rights or suspect categories, and "minimal scrutiny" under the rational basis test in all other cases. See, e.g., Kramer v. Union Free School Dist. No. 15, 395 U.S. 621, 626-27, 89 S.Ct. 1886, 1889, 23 L.Ed.2d 583 (1969); Harris v. McRae, 448 U.S. 297, 324, 100 S.Ct. 2671, 2692, 65 L.Ed.2d 784 (1980). For a brief history of this court's dissatisfaction with the two-tier approach, see Isakson v. Rickey, 550 P.2d 359, 361-63 (Alaska 1976).
In Isakson v. Rickey, 550 P.2d 359 (Alaska 1976), we had begun to move toward a less deferential approach at the lower end of the equal protection spectrum "by raising the level [of scrutiny] of the lower tier from virtual [judicial] abdication to genuine judicial scrutiny." Id. at 363. There we held that
"[u]nder the rational basis test, in order for a classification to survive judicial scrutiny, the classification `must be reasonable, not arbitrary, and must rest upon some difference having a fair and substantial relationship to the object of the legislation, so that all persons similarly circumstanced shall be treated alike.'"
It is this more flexible and more demanding standard which will be applied in future cases if the compelling state interest test is found inappropriate.
Id. at 362 (quoting State v. Wylie, 516 P.2d 142, 145 (Alaska 1973) (footnote omitted)).
I dissent from the court's resolution of the reread/regrade challenge because the court disregards the scrutiny required by Erickson and Isakson. The court reaches its conclusion without ever examining the importance of Mr. Bettine's legal interest in professional licensure, the ABA's interest in setting the one-point reread threshold, or the nexus between the ABA's interest and its means of achieving that interest. In fact, the court does not appear to apply any test at all.
Bettine falls within the class of applicants for whom a reread/regrade could mathematically produce a passing score — those scoring within 5.5 points of passing. In granting rereads only to applicants scoring within one point of passing, the ABA discriminates against applicants within Bettine's class whose score is lower than the one-point reread threshold. Regardless of whether the ABA is constitutionally compelled to reread any examination, the ABA must treat all persons within that class similarly unless it can articulate a rational basis for discriminating among them. Neither the ABA nor the court has identified any bases for the discrimination.
Bettine's circumstance is not like Mr. Obermeyer's, since a reread/regrade of Obermeyer's bar examination could not have mathematically produced a passing score. See Application of Obermeyer, 717 P.2d 382, 388 (Alaska 1986).
Applying the sliding scale test to a case involving the right to employment, this court must "closely scrutinize enactments which interfere with that right." Matson v. Commercial Fisheries Entry Comm'n, 785 P.2d 1200, 1205 (Alaska 1990). In Matson we noted that "[e]qual protection requires that an enactment impairing this important right be closely related to an important state interest." Id. For all practical purposes, failure to obtain professional licensure forecloses employment opportunity as a lawyer. Examination procedures should be closely scrutinized. However, even if licensure procedures deserve less scrutiny than employment practices, they deserve some scrutiny. The court does not scrutinize the ABA's procedures at all.
The distinction in score between those examinations within Bettine's class which are reread and those which are not is not statistically significant. As noted, no other bases for this distinction have been advanced. The ABA's one-point reread threshold is arbitrary and does not withstand scrutiny under the sliding scale test. Therefore, the procedure denies Bettine equal protection of the law.