Opinion
Civil Action 1:23-cv-03910 (CJN)
06-03-2024
MEMORANDUM OPINION
CARL J. NICHOLS, United States District Judge
Plaintiff SCAN Health Plan is a California-based nonprofit health organization that provides health insurance to Medicare beneficiaries. SCAN claims that the government improperly calculated its 2024 Star Rating-a quality assessment that affects both its federal funding and how it is viewed by consumers. The Court agrees with SCAN that the only reasonable interpretation of the relevant regulations requires a different calculation, and therefore grants SCAN's Motion for Summary Judgment and denies the government's.
I. Background
A. The Regulations
Medicare is a federal health insurance program for seniors and people with disabilities. See 42 U.S.C. § 1395 et seq. Beneficiaries can receive coverage from the federal government directly or by enrolling in private health insurance plans that are reimbursed by the government. See 42 U.S.C. §§ 1395c to 1395i-6, 1395j to 1395w-6, 1395w-21 to 1395w-28. The private option is known as Medicare Advantage or Medicare Part C. See Parts of Medicare, Medicare.gov, https://www.medicare.gov/basics/get-started-with-medicare/medicare-basics/parts-of-medicare (last accessed June 2, 2024). Beneficiaries who choose either option may also choose to supplement their benefits by enrolling in a prescription drug benefit plan known as Medicare Part D. See 42 U.S.C. § 1395w-101 et seq. Those Part D plans are also operated by private insurers. See id. § 1395w-101(a)(1).
The Centers for Medicare and Medicaid Services is the federal agency that runs the Medicare program. About CMS, https://www.cms.gov/about-cms (last accessed June 2, 2024). As part of its duties, CMS calculates and publishes something called a “Star Rating” for each private Medicare plan. See 42 C.F.R. §§ 422.162(b), 423.182(b). Star Ratings are designed to provide beneficiaries with information about a plan's quality and to enable them and the agency to evaluate a plan's performance. See id. §§ 422.160(b), 423.180(b); see also ECF No. 20-1 (“Decl.”) ¶ 19 (noting that the agency “lists plans on its online Medicare Plan Finder tool in order of highest to lowest Star Ratings” to “steer [beneficiaries] toward higher-rated plans”). CMS is also obligated by statute to offer additional funding to plans with better Star Ratings. See 42 U.S.C. §§ 1395w-23(o), 1395w-24(b)(1)(C). Those higher-rated plans can then use those extra funds to lower costs for their beneficiaries or to provide them with additional benefits. See 85 Fed.Reg. 33,796, 33,855-56 (2020). The upshot is that Star Ratings are quite important for private Medicare plans.
Every October, CMS publishes new Star Ratings for the upcoming calendar year. ECF No. 26 (“SCAN Mot.”) at 13; ECF No. 23 (“Gov. Mot.”) at 5. (So, for example, the agency published the 2024 Star Ratings in October 2023.) CMS calculates its Star Ratings not unlike the way that a teacher might calculate final grades for his or her students. See SCAN Mot. at 10-12 (making this analogy).
First, CMS determines each plan's raw scores on various quality “measure[s].” See 42 C.F.R. §§ 422.162(a), 422.166(a), 423.182(a), 423.186(a). To give just one example, Measure C15 (“Plan All-Cause Readmissions”) is the “[p]ercent of plan members aged 18 and older discharged from a hospital stay who were readmitted to a hospital within 30 days, either for the same condition as their recent hospital stay or for a different reason.” CMS, Medicare 2024 Part C & D Star Ratings Technical Notes (“2024 Technical Notes”) at 60 (March 13, 2024), https://www.cms.gov/files/document/2024-star-ratings-technical-notes.pdf (last accessed June 2, 2024). In the grading analogy, this is like a teacher's giving students raw scores on a variety of homework assignments, quizzes, essays, and exams.
Second, CMS converts each raw score into a star score. See 42 C.F.R. §§ 422.166(a), 423.186(a). The rating is on a five-star scale in whole-star increments. Id. §§ 422.166(a)(4), 423.186(a)(4). The key thing to understand about the conversion process is that it grades plans on a curve. For the kind of measures at issue here, CMS runs a statistical “clustering” analysis to group the data set “such that the [raw scores] within a group are as similar as possible to each other, and as dissimilar as possible to [raw scores] in any other group.” Id. §§ 422.162(a), 422.166(a), 423.182(a), 423.186(a). CMS then identifies the dividing lines-or “cut points”-between the groups and assigns star scores accordingly. See id. §§ 422.166(a), 423.186(a). In the grading analogy, this is like a teacher's analyzing all students' scores on a quiz; determining that (for this particular quiz) a student needs to score at least 86% to receive an “A,” at least 78% to receive a “B,” at least 71% to receive a “C,” and so forth; and then giving students the letter grades that correspond to their raw scores. In the parlance of Star Ratings, 86%, 78%, and 71% would be the “cut points” reflecting the dividing lines between the different letter grades.
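For illustration only, the conversion from a raw score to a star score can be sketched in Python. This is not the agency's implementation. The function name is hypothetical, the 86, 78, and 71 thresholds are borrowed from the grading analogy above, and the 50 threshold is an invented placeholder for the lowest dividing line. The sketch also assumes a measure on which a higher raw score is better, which is not true of every measure (a readmission rate, for example, runs the other way).

```python
from bisect import bisect_right


def star_from_cut_points(raw_score, cut_points):
    """Map a raw score to a whole-star score from 1 to 5.

    cut_points is an ascending list of four thresholds: the minimum raw
    score needed for 2, 3, 4, and 5 stars. Assumes higher is better.
    """
    if len(cut_points) != 4 or list(cut_points) != sorted(cut_points):
        raise ValueError("expected four ascending cut points")
    # Count how many thresholds the score meets or exceeds, then add the
    # one star that every plan receives.
    return 1 + bisect_right(cut_points, raw_score)


# Hypothetical cut points: 50 is invented; 71, 78, 86 follow the analogy.
example_cut_points = [50, 71, 78, 86]
five_star = star_from_cut_points(86, example_cut_points)    # meets top cut
four_star = star_from_cut_points(85.9, example_cut_points)  # just misses it
one_star = star_from_cut_points(49, example_cut_points)     # below all cuts
```

The point of the sketch is that a plan's star score on a measure depends on where the cut points fall as well as on the plan's own raw score, so moving a cut point can change a rating even if the plan's performance is unchanged.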
CMS has two types of measures: CAHPS measures (which are based on data from surveys) and non-CAHPS measures (which are based on data from other sources). See 42 C.F.R. §§ 422.162(a), 423.182(a). “Only non-CAHPS [measures] are at issue in this case.” SCAN Mot. at 12 n.5.
Third, CMS calculates a plan's overall Star Rating by running a weighted average of all measures. 42 C.F.R. § 422.166(c)(1), (d)(1); id. § 423.186(c)(1), (d)(1). The rating is again on a five-star scale but in half-star increments. 42 C.F.R. § 422.166(c)(3), (d)(2)(iv); id. § 423.186(c)(3), (d)(2)(iv). In the grading analogy, this is like a teacher's determining a student's final letter grade by calculating a weighted average of the student's letter grades on the various homework, quizzes, essays, and exams completed in the course.
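As a rough illustration of this third step, the following Python sketch computes a weighted average of measure-level star scores and rounds it to the nearest half star. It is a simplification under stated assumptions: the function name, the example weights, and the round-half-up convention are all hypothetical, and the sketch ignores any other adjustments the regulations may apply to the overall rating.

```python
import math


def overall_star_rating(measure_stars, weights):
    """Weighted average of measure-level stars, rounded to a half star.

    measure_stars and weights are parallel lists. The rounding rule
    (round half up to the nearest 0.5) is an assumption of this sketch.
    """
    if len(measure_stars) != len(weights) or not weights:
        raise ValueError("need one weight per measure")
    total_weight = sum(weights)
    average = sum(s * w for s, w in zip(measure_stars, weights)) / total_weight
    # Multiply by 2, round half up to an integer, then divide by 2.
    return math.floor(average * 2 + 0.5) / 2


# Hypothetical plan with three measures of differing weight.
rating_a = overall_star_rating([5, 4, 3], [1, 3, 1])  # average is 4.0
rating_b = overall_star_rating([5, 4, 4], [1, 1, 1])  # average is about 4.33
```

Because the overall rating moves in half-star steps, a one-star change on even a single heavily weighted measure can be enough to push the rounded result across a half-star boundary.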
This suit relates to two recent changes to the way that CMS calculates Star Ratings. The first is what the Court will call the Guardrail Rule. “To increase the predictability of the cut points,” 84 Fed.Reg. 15,680, 15,754 (2019), CMS decided in April 2019 to place a 5% cap on how much cut points could change from year to year:
[CMS will apply] a guardrail so that the measure-threshold-specific cut points for non-CAHPS measures do not increase or decrease more than the value of the cap from one year to the next. The cap is equal to 5 percentage points for measures having a 0 to 100 scale (absolute percentage cap) or 5 percent of the restricted range for measures not having a 0 to 100 scale (restricted range cap).
Id. at 15,830, 15,842 (amending 42 C.F.R. §§ 422.166(a)(2)(i) (Part C) and 423.186(a)(2)(i) (Part D)). In the grading analogy, this would be like a teacher's deciding to increase the predictability of the score required for a student to get an “A” on the final exam. Accordingly, if students in Year One needed to score at least 86% on the exam to receive an “A,” students in Year Two would need to score no more than 91% to merit the same letter grade-no matter the actual distribution of their scores. (In other words, all students in Year Two scoring above 91% on the final exam would receive an “A” on the final even if that resulted in many more students receiving an “A” on the final in Year Two than in Year One.) CMS first implemented the Guardrail Rule in October 2022 when it calculated the 2023 Star Ratings. See 85 Fed.Reg. 19,230, 19,274-75 (2020).
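The arithmetic of the absolute percentage cap can be sketched in a few lines of Python. This is an illustration only, not the agency's code: the function name is hypothetical, and the example reuses the 86% figure from the grading analogy. The 95% and 78% figures are invented to show the cap binding in each direction.

```python
def apply_guardrail(new_cut_point, prior_cut_point, cap=5.0):
    """Limit a cut point's movement to +/- cap from the prior year's value.

    For a measure on a 0 to 100 scale the cap is 5 percentage points. For
    other measures the rule uses 5 percent of the restricted range, which
    a caller could pass in as cap.
    """
    lower_bound = prior_cut_point - cap
    upper_bound = prior_cut_point + cap
    return min(max(new_cut_point, lower_bound), upper_bound)


# Prior year's cut point was 86. Clustering on this year's data suggests 95.
capped_up = apply_guardrail(95, 86)    # limited to 91
# If clustering instead suggested 78, the cut point could fall only to 81.
capped_down = apply_guardrail(78, 86)  # limited to 81
# A move of 2 points is inside the guardrail and is left alone.
unchanged = apply_guardrail(88, 86)    # stays 88
```

Note that the result depends entirely on which value is supplied as the prior year's cut point.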
The second change is what the Court will call the Tukey Outlier Rule. In layman's terms, Tukey outliers are extreme outliers on the high and low ends of a data set. See 42 C.F.R. §§ 422.162(a), 423.182(a). In June 2020, CMS decided to clean such outliers from the raw data before calculating cut points in order “to stabilize cut points and prevent large year-to-year fluctuations in cut points caused by the scores of a few contracts.” 85 Fed.Reg. 33,796, 33,833 (2020). CMS therefore added the following text to the same subsections of the Code of Federal Regulations containing the Guardrail Rule: “[P]rior to applying mean resampling with hierarchal clustering, Tukey outer fence outliers are removed.” Id. at 33,907, 33,911 (amending 42 C.F.R. §§ 422.166(a)(2)(i) (Part C) and 423.186(a)(2)(i) (Part D)). In the grading analogy, this is like a teacher's throwing out perfect scores and very poor scores before calculating the curve for a quiz. The agency first implemented the Tukey Outlier Rule in October 2023 when it calculated the 2024 Star Ratings. See id. at 33,836.
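A minimal Python sketch of outer fence deletion follows, using the definition in 42 C.F.R. § 422.162(a): scores below the first quartile minus 3.0 times the interquartile range, or above the third quartile plus 3.0 times the interquartile range, are removed. The function name and the sample scores are hypothetical, and the quartile convention (the "inclusive" method of Python's statistics.quantiles) is an assumption of the sketch; the regulation as quoted here does not specify one.

```python
import statistics


def remove_tukey_outer_fence_outliers(scores):
    """Drop scores outside [Q1 - 3*IQR, Q3 + 3*IQR].

    The quartile method ("inclusive") is an assumption of this sketch.
    """
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 3.0 * iqr
    upper_fence = q3 + 3.0 * iqr
    return [s for s in scores if lower_fence <= s <= upper_fence]


# Hypothetical raw scores for one measure; one contract scored far below
# the rest.
raw_scores = [5, 70, 72, 74, 75, 76, 78, 80, 82]
cleaned_scores = remove_tukey_outer_fence_outliers(raw_scores)
# With these numbers Q1 = 72 and Q3 = 78, so the fences are 54 and 96 and
# only the score of 5 is removed.
```

In this example the single low score is dropped before any clustering would run, so it can no longer pull the lowest cut points downward.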
Accordingly, at the time that SCAN's 2024 Star Rating was published in October 2023, the relevant CFR subsections read as follows:
The method maximizes differences across the star categories and minimizes the differences within star categories using mean resampling with the hierarchal clustering of the current year's data. Effective for the Star Ratings issued in October 2023 and subsequent years, prior to applying mean resampling with hierarchal clustering, Tukey outer fence outliers are removed. Effective for the Star Ratings issued in October 2022 and subsequent years, CMS will add a guardrail so that the measure-threshold-specific cut points for non-CAHPS measures do not increase or decrease more than the value of the cap from 1 year to the next. The cap is equal to 5 percentage points for measures having a 0 to 100 scale (absolute percentage cap) or 5 percent of the restricted range for measures not having a 0 to 100 scale (restricted range cap). New measures that have been in the Part C and D Star Rating program for 3 years or less use the hierarchal clustering methodology with mean resampling with no guardrail for the first 3 years in the program.
42 C.F.R. §§ 422.166(a)(2)(i) (Part C), 423.186(a)(2)(i) (Part D).
In the long run, these two changes complement each other. The Tukey Outlier Rule increases the stability and predictability of cut points by removing extreme, fleeting outliers from the data before those outliers can skew the curve. And the Guardrail Rule increases the stability and predictability of cut points by imposing a limit on their ability to change from year to year.
In the short run, however, the story is different. That is because the agency implemented the Guardrail Rule before it implemented the Tukey Outlier Rule. Although the latter removes extreme outliers in both directions, there tend to be more such outliers on the lower end of the data sets at issue here than the higher end. See, e.g., 85 Fed.Reg. 9,002, 9,044 (2020). As a result, removing Tukey outliers resulted in significant changes in some cut points-in other words, removing Tukey outliers in a particular year would tend to increase certain cut points much more than the 5% limit contained in the Guardrail Rule. Compare 2024 Technical Notes at 36-110, with CMS, Medicare 2023 Part C & D Star Ratings Technical Notes (“2023 Technical Notes”) at 26-104 (January 19, 2023), https://www.cms.gov/files/document/2023-star-ratings-technical-notes.pdf (last accessed June 2, 2024). The Guardrail Rule would thus dampen the effect of the Tukey Outlier Rule if cut points calculated from data sets without outliers were tied to older cut points calculated from data sets with outliers-meaning that it might take years for the Tukey Outlier Rule to take full effect.
To address this issue, CMS determined to essentially waive the application of the Guardrail Rule for one year. In particular, rather than applying the Guardrail Rule to the actual cut points from the previous year, CMS would apply the Guardrail Rule to hypothetical cut points for the previous year, which it would calculate using the previous year's data but with Tukey outliers removed. The problem is that CMS never amended its regulations to reflect that decision, at least not expressly. Instead, CMS announced that approach in two preambles contained exclusively in the Federal Register. Specifically, when CMS first proposed the Tukey Outlier Rule in February 2020, it stated that “[i]n the first year that [Tukey outlier deletion] would be implemented, the prior year's thresholds would be rerun, including mean resampling and Tukey outer fence deletion so that the guardrails would be applied such that there is consistency between the years.” 85 Fed.Reg. at 9,044. And when CMS finalized the Tukey Outlier Rule several months later, it made similar statements. 85 Fed.Reg. at 33,833 (“We explained that under our proposal in the first year of implementing this process, the prior year's thresholds would be rerun, including mean resampling and Tukey outer fence deletion so that the guardrails would be applied such that there is consistency between the years.”); id. at 33,835 (“As noted in the NPRM, for the first year (2024 Star Ratings), we will rerun the prior year's thresholds, using mean resampling and Tukey outer fence deletion so that the guardrails would be applied such that there is consistency between the years.”).
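The practical difference between the two baselines can be shown with a short Python sketch. Every number in it is invented for illustration and none is drawn from the record; the helper name is likewise hypothetical. The sketch assumes a measure on a 0 to 100 scale on which higher is better.

```python
def clamp_to_guardrail(new_cut_point, baseline_cut_point, cap=5.0):
    """Keep new_cut_point within +/- cap of the chosen baseline."""
    return min(max(new_cut_point, baseline_cut_point - cap),
               baseline_cut_point + cap)


# Hypothetical figures for one threshold of one measure.
actual_prior_cut = 70.0      # cut point actually used last year
rerun_prior_cut = 78.0       # last year's data rerun with outliers removed
current_uncapped_cut = 82.0  # this year's clustering result, before any cap

# Guardrail measured from last year's actual cut point: the increase is
# limited to 5 points.
cut_vs_actual = clamp_to_guardrail(current_uncapped_cut, actual_prior_cut)

# Guardrail measured from the rerun (hypothetical) cut point: 82 is within
# 5 points of 78, so the cap does not bind.
cut_vs_rerun = clamp_to_guardrail(current_uncapped_cut, rerun_prior_cut)

# A plan scoring 80 clears the first threshold but not the second.
plan_score = 80.0
clears_actual_baseline = plan_score >= cut_vs_actual
clears_rerun_baseline = plan_score >= cut_vs_rerun
```

On these invented numbers the same plan, with the same raw score, earns the higher star level under one baseline and misses it under the other, which is the kind of divergence at issue in this case.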
One quirk about the regulatory history bears mentioning. In May 2022, CMS accidentally removed the Tukey Outlier Rule from the Code of Federal Regulations when it finalized separate regulatory changes related to the COVID-19 pandemic. See 87 Fed.Reg. 27,704, 27,809-14, 27,895 (2022). CMS eventually realized its mistake and in April 2023 put the relevant language back into the Code. See 88 Fed.Reg. 22,120, 22,295, 22,332, 22,338 (2023). Although the Parties dispute the significance of those changes, the Court need not address them because it concludes that SCAN would prevail even if they had not occurred.
B. The Present Controversy
SCAN is a nonprofit health organization that offers Part C and Part D coverage to Medicare beneficiaries in California, Arizona, Nevada, New Mexico, and Texas. Decl. ¶¶ 5-8, 15. Between 2018 and 2023, it enjoyed a Star Rating of 4.5 stars. Id. ¶ 23. SCAN used the additional funding it received for being a highly rated provider to reduce costs for its members and offer them additional benefits, such as dental and vision. Id. ¶ 27.
In September 2023, however, CMS informed SCAN that its 2024 Star Rating would drop to 3.5 stars. Decl. ¶ 33. After reviewing the agency's calculations, SCAN determined that part of the change in its Star Rating could be attributed to CMS's decision not to apply the guardrail to the previous year's actual cut points. See id. ¶ 34. Had CMS done so, SCAN would have received a higher rating on two measures and an overall Star Rating of 4 stars-a rating that would make the organization eligible for approximately $250 million in additional funding from the federal government. See id. ¶¶ 41-45, 57. (In other words, the removal by CMS of Tukey outliers in recalculating the prior year's cut points resulted in SCAN's star rating being half of a star lower than it would have been otherwise.) SCAN therefore informally requested that CMS recalculate SCAN's Star Rating. See id. ¶¶ 35-40. CMS refused on the ground that applying the guardrail to hypothetical cut points that it recalculated after removing Tukey outliers from the previous year's data was consistent with its regulations. See id.
Although CMS offers a process for plans to appeal their eligibility for a certain kind of additional funding, it does not allow them to challenge “the methodology for calculating the star ratings” or “the cut-off points for determining measure thresholds.” 42 C.F.R. § 422.260(c)(3)(ii). CMS does not argue here that it did not take final agency action with respect to SCAN's 2024 Star Rating.
In December 2023, SCAN filed this suit. ECF No. 1 (“Compl.”) ¶¶ 123-31. It argues that CMS failed to follow its own regulation in calculating the organization's Star Rating and that its action was therefore arbitrary and capricious. See id. ¶¶ 127-29. Since then, SCAN and the government have cross-moved for summary judgment. Following a hearing on May 24, 2024, these motions are now ready to be resolved.
SCAN has also raised a challenge to the way that CMS rated the organization on another measure related to interpreter availability in call centers. SCAN Mot. at 3-5. The Court declines to resolve that issue because its decision with respect to the guardrail issue is sufficient to provide SCAN the relief it seeks. See SCAN Mot. at 5; ECF No. 27 (“SCAN Reply”) at 5.
II. Legal Standards
Under the Administrative Procedure Act, the Court “shall . . . hold unlawful and set aside agency action, findings, and conclusions found to be . . . arbitrary, capricious, an abuse of discretion, or otherwise not in accordance with law.” 5 U.S.C. § 706. “Although it is within the power of an agency to amend or repeal its own regulations, an agency is not free to ignore or violate its regulations while they remain in effect.” Nat'l Env't Dev. Assn.'s Clean Air Project v. EPA, 752 F.3d 999, 1009 (D.C. Cir. 2014) (quotation omitted and alterations adopted). Accordingly, “an agency action may be set aside as arbitrary and capricious if the agency fails to comply with its own regulations” in taking that action. Id. (quotation omitted).
III. Analysis
Although the Medicare regulations at issue here can appear daunting, the question presented in this case is ultimately a simple one. As the Parties acknowledged during the hearing, SCAN does not challenge CMS's statutory authority to remove Tukey outliers, to apply the Guardrail Rule to hypothetical cut points for one year, see Tr. at 4-6, 26-28, or perhaps even to change or eliminate the guardrail altogether. Instead, the sole question is whether the agency's current regulations permit it to apply the guardrail to a prior year's recalculated hypothetical cut points instead of actual cut points.
The plain text of the Guardrail Rule forecloses the government's position. The Guardrail Rule states that CMS will apply “a guardrail so that the measure-threshold-specific cut points for non-CAHPS measures do not increase or decrease more than the value of the cap from 1 year to the next.” 42 C.F.R. §§ 422.166(a)(2)(i), 423.186(a)(2)(i). The rule therefore instructs the agency to look at the “cut points . . . from 1 year” when calculating the “cut points” for “the next” year. The best and most natural reading is that this regulation refers to the actual cut points in the initial year just as it refers to the actual cut points that will be created for the next year. Indeed, in calculating the 2023 Star Ratings, CMS applied the guardrail to the “prior year's [actual] cut point.” 2023 Technical Notes at 139; see also id. at 9 (“[E]ach 1 to 5 star level cut point is compared to the prior year's value and capped” (emphasis added)).
In fact, the relevant definitions repeatedly provide that the guardrail is applied to “the prior year's . . . cut point”-a phrase that can only be reasonably read to refer to the prior year's actual cut point. 42 C.F.R. §§ 422.162(a), 423.182(a). “Guardrail,” for example, is defined as a “bidirectional cap that restricts both upward and downward movement of a measure-threshold-specific cut point for the current year's measure-level Star Ratings as compared to the prior year's measure-threshold-specific cut point.” Id. (emphases altered). “Cut point cap,” in turn, is defined as “a restriction on the change in the amount of movement a measure-threshold-specific cut point can make as compared to the prior year's measure-threshold-specific cut point.” Id. (emphases altered). And “[a]bsolute percentage cap” is “a cap applied to non-CAHPS measures that are on a 0 to 100 scale that restricts movement of the current year's measure-threshold-specific cut point to no more than the stated percentage as compared to the prior year's cut point.” Id. (emphases altered). Again, these definitions can only be reasonably read to refer to the prior year's actual cut point.
The government contends that these provisions do not state that they were referring to a prior year's “actual, unadjusted cut points.” Gov. Mot. at 26. But that is the meaning that any ordinary English speaker would take from an unqualified reference to “the prior year's cut point.” Had CMS wanted to apply a policy designed to increase the predictability of cut points to anything other than actual cut points, one might have expected it to say so explicitly. Indeed, the regulation shows that CMS knew how to waive the application of the guardrail for limited periods. The sentence immediately following the definition of the guardrail states: “New measures that have been in the Part C and D Star Rating program for 3 years or less use the hierarchal clustering methodology with mean resampling with no guardrail for the first 3 years in the program.” 42 C.F.R. §§ 422.166(a)(2)(i), 423.186(a)(2)(i) (emphasis added). And again, there is no dispute that CMS had the statutory authority to do so. See Tr. at 4-6, 26-28.
In other places, the regulations even explicitly exclude Tukey outliers from a prior year's data before applying the guardrail. The guardrail applies differently to two types of non-CAHPS measures: “measures having a 0 to 100 scale” and “measures not having a 0 to 100 scale.” 42 C.F.R. §§ 422.166(a)(2)(i), 423.186(a)(2)(i). The guardrail for the latter type is “5 percent of the restricted range,” id., which is defined as the difference between the minimum and maximum raw score from the prior year-but only after Tukey outliers have been excluded from the data, id. § 422.162(a) (“Restricted range is the difference between the maximum and minimum measure score values using the prior year measure scores excluding outer fence outliers (first quartile-3*Interquartile Range (IQR) and third quartile + 3*IQR).”); id. § 423.182(a) (same); see also id. § 422.162(a) (defining “Tukey outer fence outliers” as “measure scores that are below a certain point (first quartile-3.0 x (third quartile-first quartile)) or above a certain point (third quartile + 3.0 x (third quartile-first quartile)).”); id. § 423.182(a) (same). In contrast, the relevant regulations here do not state anywhere that the guardrail would be applied to hypothetical cut points.
Against this backdrop, the government advanced (for the first time at the motions hearing) a reading of the regulations that would permit it to apply the guardrail to hypothetical cut points. According to the government, the Tukey Outlier Rule-which provides that “Effective for the Star Ratings issued in October 2023 and subsequent years, . . . Tukey outer fence outliers are removed,” 42 C.F.R. §§ 422.166(a)(2)(i), 423.186(a)(2)(i) (emphasis added)-requires (or at least authorizes) the “remov[al]” of Tukey outliers in all data sets, including those corresponding to previous years, Tr. at 7-8, 11-12. There are at least two problems with this theory. First, the sentence immediately preceding the sentence on which the government now relies states that “[t]he [clustering] method maximizes differences across the star categories and minimizes the differences within star categories using mean resampling with the hierarchal clustering of the current year's data.” 42 C.F.R. §§ 422.166(a)(2)(i), 423.186(a)(2)(i) (emphasis added). The “remov[al]” of Tukey outliers is therefore best understood to mean the “remov[al]” of Tukey outliers in “the current year's data.” And that says nothing about whether and how Tukey outliers might be removed from the prior year's data. Second, the guardrail is applied to cut points (not the underlying data), cf. Tr. at 10 (acknowledging that there are “actual cut points” that exist for the previous year), and the language on which the government relies makes no reference to cut points-let alone a reference that is express enough to make ambiguous the otherwise unambiguous regulatory language discussed above.
Alternatively, the government argues that its preamble statements (which are certainly more express than the regulations) are legally binding and permit it to apply the 5% guardrail to hypothetical cut points. Gov. Mot. at 22-27. It appears to argue that any statement in a preamble that goes through notice and comment and that is clearly intended to be binding is a legislative rule with the force of law. See Tr. at 6-7, 14-17; Gov. Mot. at 22; ECF No. 30 (“Gov. Reply”) at 2. But that is not the law in this Circuit. As the Court of Appeals put it in AT&T Corp. v. FCC, “[p]ublication in the Federal Register does not suggest that the matter published was meant to be a regulation.” 970 F.3d 344, 350 (D.C. Cir. 2020) (quotation omitted). Instead, “the real dividing point between the portions of a final rule with and without legal force is designation in the Code of Federal Regulations.” Id. (quotation omitted). As a result, when “there is a discrepancy between the preamble and the Code, it is the codified provisions that control.” Id. at 351. To be sure, the Court of Appeals has “reserved a possibility that statements in a preamble may in some unique cases constitute binding, final agency action susceptible to judicial review,” but it has made clear that “this is not the norm.” Id. at 350 (quotation omitted). And here, the government has not demonstrated that this case is the exception rather than the rule. After all, as discussed above, the text of the regulation leaves only one reasonable interpretation.
Accordingly, the government's passing request to defer to its interpretation of the regulation fails. See Gov. Mot. at 27. Even assuming that these preamble statements are the sort of interpretations that are eligible for deference, the Court must exhaust the “traditional tools of construction” to determine whether there is a correct “answer” to the interpretative question-i.e., the only “reasonable construction of [the] regulation”-before deferring. Kisor v. Wilkie, 588 U.S. 558, 575 (2019). The Court has done so here and determined that SCAN's interpretation of the regulation is the only reasonable one.
Finally, the government raises two non-merits challenges. First, it argues that SCAN waived this issue because it failed to raise it in a comment during any rulemaking. Gov. Mot. at 34-35. But a party may raise such issues on an as-applied basis when a regulation is applied to it, as here. See Koretoff v. Vilsack, 707 F.3d 394, 399 (D.C. Cir. 2013). Second, the government argues that any errors made by CMS were harmless because SCAN had notice of CMS's intent to apply the guardrail based on hypothetical cut points and an opportunity to comment on that proposed policy. Gov. Mot. at 35-37. But the test for whether an error is harmless depends on whether the error will “prejudice” the regulated party. Jicarilla Apache Nation v. U.S. Dep't of Interior, 613 F.3d 1112, 1121 (D.C. Cir. 2010) (quotation omitted). That standard is met here because CMS's failure to follow its own regulation led it to give SCAN an incorrect Star Rating that will cost the organization millions of dollars. See Decl. ¶ 57.
That leaves only the proper remedy. SCAN argues that this Court should “set aside” SCAN's rating and “enjoin the Defendants from using CMS's unlawful 3.5-star Star Rating in determining SCAN's eligibility for quality bonus payments” (one form of the additional funding that the federal government offers highly rated Medicare plans). SCAN Mot. at 43. The government does not appear to contest that SCAN is entitled to that relief if the Court grants its summary-judgment motion. Accordingly, the Court will treat SCAN's remedy arguments as conceded and grant SCAN the relief it requests. Am. Waterways Operators v. Regan, 590 F.Supp.3d 126, 138 (D.D.C. 2022).
IV. Conclusion
For the foregoing reasons, the Court grants SCAN's Motion for Summary Judgment and denies the government's Motion. An order will issue contemporaneously with this opinion.