Carissa Hessick and I have been debating the appropriateness of using empirical methods in legal interpretation. The debate began on PrawfsBlawg, then moved over here (with some continued discussion at Prawfs), and then spread to Twitter. The relevant tweets are collected in my previous post, and in this post I’ll respond to Hessick’s most recent points.
As I understand her, Hessick contends that the issue of ordinary meaning isn’t an “empirical question” because the question of how a reasonable person would understand the text is inherently qualitative rather than quantitative, and therefore can’t be answered in a way that is “provable or verifiable.” I accept Hessick’s characterization of the ordinary-meaning issue as being qualitative rather than quantitative, but it doesn’t follow that quantitative information is always irrelevant.
In her first set of tweets, Hessick had asked, “Don’t most people think that, if something is an empirical question, that means there is a demonstrably correct answer?” I answered, “I don’t know whether there are people who would think that ‘empirical Q = can be answered definitively,’ but isn’t it pretty clear that that equation doesn’t hold, & that there are empirical Qs that can only be addressed probabilistically?” As examples of what I meant I pointed to weather forecasts and climate-change models. Hessick then responded that the accuracy of a weather forecast can be evaluated by waiting to see what the weather turns out to be.
I don’t think that’s right, strictly speaking; if the forecast says there’s a 75% chance of rain, the fact that it turns out to be sunny doesn’t prove that the probability estimate was wrong (nor would the fact that it did rain prove that the estimate was right). But let’s put that aside and look at some examples in which it’s not possible to have the kind of after-the-fact confirmation or disconfirmation that Hessick refers to.
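To see why, consider how forecast accuracy is actually assessed: not against a single outcome, but against a track record of many forecasts, using a measure such as the Brier score (the mean squared difference between each forecast probability and the 0/1 outcome; lower is better). Here’s a small illustration; the numbers are hypothetical, not anything from the exchange:

```python
# Brier score: mean squared error of probability forecasts against 0/1 outcomes.
def brier_score(forecasts, outcomes):
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical data: ten days on which the forecast was "75% chance of rain."
forecasts = [0.75] * 10
rained = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]  # it rained on 7 of the 10 days

# 0.2125, close to the roughly 0.19 expected of a well-calibrated forecaster
print(brier_score(forecasts, rained))

# 0.5625: the same forecasts scored against ten rain-free days, i.e. badly off
print(brier_score(forecasts, [0] * 10))
```

The point is that no single day’s weather could distinguish the good record from the bad one; only the accumulated outcomes can.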
Let’s start by sticking with weather reports. Suppose you have two forecasts for tomorrow’s weather, but one of them draws on data from five times as many observation stations as the other, both in developing its forecasting model (based on past observations and forecasts) and in gathering the data about current conditions that will be analyzed for purposes of tomorrow’s forecast. My guess is that when it comes time to decide whether to bring an umbrella with you to work, you’d be more likely to rely on the forecast that is based on the more extensive data.
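And that instinct is sound. Here’s a toy simulation of the data-volume point (my own, with made-up numbers): an estimate based on five times as many observations lands, on average, considerably closer to the truth.

```python
import random

random.seed(0)
TRUE_PROB = 0.75  # the "true" chance of rain, unknowable in real life

def estimate(n_stations):
    """Average n noisy 0/1 readings; a stand-in for a data-driven forecast."""
    return sum(random.random() < TRUE_PROB for _ in range(n_stations)) / n_stations

TRIALS = 10_000
for n in (20, 100):  # the second forecast uses five times as much data
    error = sum(abs(estimate(n) - TRUE_PROB) for _ in range(TRIALS)) / TRIALS
    print(f"{n} stations: average error {error:.3f}")
```

Neither forecast is “provable,” but one is demonstrably better grounded than the other.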
Or for questions focused on describing and explaining what happened in the past, rather than on forecasting what might happen in the future, consider the kinds of questions that come up in the study of evolution: what species were the evolutionary precursors of Homo sapiens, when and how did the ability to use language arise, how did insect wings evolve. Research seeking to answer those questions has to be empirically based; at a minimum, any proposed answer has to be consistent with the empirical evidence, and competing approaches are evaluated by how well they account for that evidence. And progress is made, even though there hasn’t yet been (and may never be) the kind of clear confirmation that Hessick wants to see.
So in considering whether corpus linguistics can be useful in legal interpretation, the question shouldn’t be whether it can provide definitive answers, but whether it can provide information that is relevant and useful in the interpretive process. In a series of posts, I’ve defended the idea that frequency data can, in appropriate cases, satisfy those requirements: Meaning in the framework of corpus linguistics, More on the relevance of frequency data: Responding to Steinberg, and Corpus linguistics: Empiricism and frequency.
The third of those posts responded to the contention in Hessick’s Corpus Linguistics and the Criminal Law that because corpus linguistics relies on frequency information, using it in legal interpretation would represent a radical departure from established interpretive methods. I countered that argument by presenting evidence that frequency has been relevant, at least implicitly, to legal interpretation as far back as Blackstone. It was that post that prompted the Twitter exchange I’m commenting on now.
Hessick writes:
Now, don’t get me wrong, judges often frame what they are doing in terms that sound quantifiable. For example, they might say construction A is the meaning that is “most common” or “most commonly understood.” But I think that phrase is probably shorthand for “my intuition about what is most commonly understood.” The reason I think it is a shorthand for that is because the same judges (sometimes in the same opinions) talk about what a “reasonable listener” would understand.
As you know, the “reasonable person” standard is a qualitative standard, not a quantifiable one. And because whether something is reasonable is not (in doctrine) considered to be something you can prove as a quantifiable matter, I don’t see how the reasonable listener test is properly understood as an empirical test.
I agree with Hessick that when judges say that a given meaning of a word is “most common” or “most commonly understood,” they are basing that conclusion on their intuition (assuming they’re not relying on a dictionary). Those intuitions are empirically based in the sense that they are based on the judge’s lifetime of experience as a speaker of English, and they are quantitative in that the subject of the intuitions (the relative frequencies of different senses of a word) is by definition quantitative. But experience with corpus linguistics has shown that those kinds of intuitions aren’t reliable, because accurate information about lexical frequency isn’t accessible to conscious awareness; it can be obtained only from corpus data. (And when dictionaries purport to rank word senses by their frequency, the rankings will similarly be unreliable unless they are corpus-based.)
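To give a sense of what retrieving that kind of data involves, here is a minimal sketch using the Brown corpus that ships with the NLTK toolkit. It’s purely illustrative: a real analysis would use a much larger corpus such as COCA, the search term here is just an example, and the concordance lines would still have to be coded for word sense by hand before relative frequencies could be computed.

```python
# Requires: pip install nltk
import nltk
from nltk.corpus import brown

nltk.download("brown", quiet=True)  # fetch the corpus on first run
words = [w.lower() for w in brown.words()]

target = "vehicle"  # an example search term, not drawn from any actual case
freq = nltk.FreqDist(words)
print(f"'{target}' occurs {freq[target]} times in {len(words):,} tokens")

# Concordance-style context windows: the raw material that an analyst
# would sort into senses before computing relative frequencies.
shown = 0
for i, w in enumerate(words):
    if w == target and shown < 5:
        print(" ".join(words[max(0, i - 5): i + 6]))
        shown += 1
```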
This suggests one answer to Hessick’s argument that considering quantitative data is inappropriate because ordinary meaning is sometimes described as the meaning that a reasonable language user would understand the text to communicate, and the reasonable-person standard is qualitative, not quantitative. A judge’s intuition about what is the most frequent relevant meaning is in essence a quantitative estimate—an unreliable estimate, but a quantitative estimate all the same. And since a quantitative judgment is unavoidable, it doesn’t make sense to rule out consideration of a more reliable data source.
On top of that, even if the reasonable-person metaphor provides a good way to frame the ordinary-meaning question, the fact is that in tort law (the reasonable person’s normal habitat) the determination of the standard of care turns, in theory, on the Hand formula, which is a cost-benefit analysis. And the factors relevant to such an analysis are, at least in principle, quantifiable. So I don’t see any principled basis for ruling out corpus data merely because it has a quantitative component.
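Just to make the point concrete, here’s the Hand formula reduced to a few lines of code. The figures are invented, and this is of course a cartoon of how the formula operates in actual litigation:

```python
def negligent(burden, probability, loss):
    """Hand formula (United States v. Carroll Towing): liability when B < P * L."""
    return burden < probability * loss

# Invented figures: a $100 precaution against a 1-in-1,000 chance of a
# $500,000 loss. B (100) is less than P * L (500), so failing to take
# the precaution is negligent under the formula.
print(negligent(burden=100, probability=0.001, loss=500_000))  # True
```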
That’s not to say that corpus linguistics will be useful in every case where there’s a dispute about the meaning of a statutory term. As I’ve said before, I wouldn’t expect corpus analysis to be appropriate in all cases, and even when it is useful, its importance to the ultimate outcome will probably vary from case to case. So I don’t see it as a panacea, but I also don’t think it should be anathema.