Meaning in the framework of corpus linguistics

At the end of my previous post discussing Carissa Hessick’s paper “Corpus Linguistics and the Criminal Law,” I said that I would follow up with another post “making the affirmative case for the relevance of frequency data in determining ordinary meaning.” This is that post.

Given that subject, you might wonder why I’ve titled this post “Meaning in the framework of corpus linguistics.” The answer is that corpus linguistics has not only provided a methodology for investigating meaning, it has also generated important insights about word meaning. (That was the subject of the paper I presented at the BYU symposium in February, which will be published, along with the other papers from the symposium, in a special issue of the BYU Law Review.) I’ll draw on those insights when I talk about frequency analysis, and I thought it would be helpful to make them explicit.

THERE ARE A VARIETY OF DIFFERENT WAYS to think about word meanings. One of them is the way that I see as characteristic of how lawyers and judges tend to think: the meaning of a word is more or less equated with its dictionary definition, and then the definition is in effect read into the statute. If you’ve read a lot of cases, you’ll probably recognize the pattern:

The issue here is what “flood” means. Webster’s Dictionary defines “flood” to mean, “a great flow of water over what is usually dry land.” Therefore, the plaintiffs must show that the water in their basement resulted from a great flow of water over what is usually dry land.

Under this approach, the dictionary entry is treated as if what it defines is the concept flood rather than the word flood. The entry is used to state the conditions that determine whether a particular instance of water on the ground qualifies as a flood. Considering the role that dictionaries have come to play in legal interpretation, it is no small irony that many lexicographers would say that the definitions they write aren’t intended to serve that purpose.

Some comments on Hessick on corpus linguistics (updated)

UP UNTIL NOW, the use of corpus linguistics in legal interpretation has gotten almost entirely good press—probably because almost all the press it’s gotten has come from its advocates. That situation has now changed, though, with the posting on SSRN of a paper by UNC law professor Carissa Hessick, who was one of the participants at the BYU law-and-corpus-linguistics symposium this past February. (Hessick has blogged about her paper at Prawfsblawg, here and here.)

The paper, “Corpus Linguistics and the Criminal Law” (pdf), argues that corpus linguistics “is not an appropriate tool” for interpreting statutes. Although it deals specifically with using corpus linguistics in interpreting criminal statutes, and Hessick’s concerns may not be as strong with respect to other areas of the law, much of her criticism would apply across the board. In this post I am going to discuss some of the issues that the paper raises. If you’ve followed this blog before, you won’t be surprised to find out that I disagree with Hessick’s conclusion.

The semantics of sleeping in railway stations


“I really, really like the work in Congress, I really do, but I love my family more. People may try to make it more than that, but it’s really that simple,” Chaffetz said on MSNBC. “I just turned 50. I’m sleeping on a cot in my office.”

Chaffetz On No 2018 Run: ‘I Just Turned 50, I’m Sleeping On A Cot In My Office,’ Talking Points Memo

Everyone familiar with the academic literature on statutory interpretation is aware of the vehicles-in-the-park hypothetical. It was formulated by the legal philosopher H.L.A. Hart to illustrate the argument that the words in which a law is written must have “a core of settled meaning”—a set of “standard instance[s] in which no doubts are felt about [the law’s] application”—but will also have “a penumbra of debatable cases in which words are neither obviously applicable nor obviously ruled out.” Harvard law professor Lon Fuller denied the existence of any core area in which the law’s applicability was clear; for Fuller, the law’s applicability turned not on linguistic semantics but on the law’s purpose. Thus, he asked whether, under the hypothesized prohibition against vehicles in the park, “mount[ing] on a pedestal in the park a truck used in World War II…in perfect working order” would fall within the law’s core or its periphery.

Less well known is a separate hypothetical offered by Fuller to support his challenge to Hart. Fuller posits a law making it a misdemeanor “to sleep in any railway station.” He then supposes that two people have been arrested for violating this law: one who dozed off while waiting for a train, and another “who had brought a blanket and pillow to the station and had obviously settled himself down for the night[,]” but who had been arrested before he fell asleep. “Which of these cases,” Fuller asked, “presents the ‘standard instance’ of the word ‘sleep’?” And would it be faithful to the law to say that the law had been violated by the second person but not the first?

The hypothetical is thought-provoking because applying what is assumed to be the literal meaning of the law—that it prohibits being asleep in a railway station—would yield a conclusion that seems nonsensical: that the law was violated by the dozing passenger but not by the guy who was bedded down but still awake. The hypothetical has been discussed by some very smart legal scholars and philosophers over the years, including Kent Greenawalt, Fred Schauer, John Manning, Scott Soames, and Andrei Marmor, and with few exceptions (mainly Robyn Carston) they have accepted that assumption. Schauer put it as well as anyone: “Sleep is a physiological state, and as a matter of physiology Fuller’s businessman was sleeping. Period.”

But in fact (you can guess where this is going, can’t you?), the assumption’s validity is doubtful at best. It is entirely consistent with actual usage to use sleep in a railway station to mean ‘use a railway station as a place to sleep’ rather than ‘be asleep in a railway station.’

Replying to McGinnis and Rappaport

As I’ve noted, John McGinnis and Mike Rappaport have responded to my post “The language of the law” is not actually a language. They disagree with what I said, and in this post I will return the favor.

McGinnis and Rappaport make two basic points. First, they say that I did not address their argument that The Language of the Law is a technical language and that as a result there is a gap in my analysis. Second, they dispute my argument that the rules of legal interpretation are not analogous to the cognitive processes that underlie comprehension.

The Language of the Law as a technical language

McGinnis and Rappaport accept the point in my earlier post that legal language—or The Language of the Law, to use their preferred term—is not a full-blown language comparable to Hindi or Pirahã. Their paper recognizes that The Language of the Law is “not wholly independent of ordinary language”, and they describe it as “an overlay on ordinary language.” And they don’t take issue with the statement by Peter Tiersma that I quoted: “If we isolate what is distinctive in legal English, leaving out features of ordinary speech, what remains is far too incomplete to function as a language.”

This is important because McGinnis and Rappaport also don’t disagree with my statement that the strong version of their analysis (meaning the version that assumes a “wide conception” of language) relies on their analogy between The Language of the Law and ordinary language. So in order to defend that portion of their argument, it’s essential for McGinnis and Rappaport to show that their analogy is valid. I don’t think they’ve done so.

More on The Language of the Law

Over at Volokh Conspiracy, Will Baude has commented on my post about the language of the law. Will and his co-author Steve Sachs recently had a paper titled “The Law of Interpretation” published as a lead article in the Harvard Law Review. They have a view of the rules of legal interpretation that differs from McGinnis and Rappaport’s and is fairly similar to mine:

In that piece, we argue that some interpretive rules are linguistic ones, elements of our written language, but others, maybe many, are legal ones. Rather than assimilating them to rules of language, we analogize them to other legal defaults, many of which are unwritten, such as the rules for mens rea or accomplice liability in criminal statutes. Seeing such rules as law, not language, avoids critiques like Goldfarb’s that legal rules don’t operate in the way that he says that languages generally operate.

However, Baude sees his (and Sachs’s) conception of legal rules as differing somewhat from my description of legal interpretation as a process of explicit reasoning:

I take [Goldfarb’s] point about how introspection might differ for language and for law, but we are not committed to the view that all legal interpretive rules entail a “deliberative process by which the interpreter consciously thinks about how the utterance or text should be understood.” Trained lawyers may well use the mens rea canon without really thinking about it. And we affirmatively disagree with the suggestion legal interpretive rules must be “promulgated explicitly by actors vested with institutional authority.” We think such rules can, and often do, exist as part of the general common law backdrops of our legal system — authoritative rules of custom that have never been explicitly promulgated by any lawmaker in particular.

I’ll deal with these points in reverse order.

Corpus linguistics coming to the Sixth Circuit bench? (Plus LAWnCorpusLing roundup)

Adam Liptak reports in the New York Times that President Trump will announce a number of nominations to the lower federal courts, and that one of them is Justice Joan L. Larsen of the Michigan Supreme Court, who will be nominated to the United States Court of Appeals for the Sixth Circuit.

That caught my eye, because in June 2016, the Michigan Supreme Court became the first state supreme court in the country to expressly approve the use of corpus linguistics in statutory interpretation.

“The language of the law” is not actually a language

THE NATURE OF LEGAL LANGUAGE has been a recurring subject of discussion within applied linguistics and (U.S.) legal academia. The latest contribution to that discussion is a recently posted draft paper by John McGinnis and Michael Rappaport, titled The Constitution and the Language of the Law. (h/t Legal Theory Blog)

McGinnis and Rappaport are the primary advocates of an approach to constitutional interpretation known as original-methods originalism, under which courts today are to apply the interpretive methods that prevailed at the time of the framing (pdf). Their new paper argues that original-methods originalism is supported by the fact that (as they see it) the Constitution is written in “the language of the law.”

Although Larry Solum, of Legal Theory Blog, calls the paper “important and brilliant,” I’m afraid that I find its primary argument to be pretty seriously flawed. [UPDATE: McGinnis and Rappaport have responded to this post, and I have replied to their response.]

I’m going to talk here about two related aspects of the paper that I think are problematic. One is its treatment of “the language of the law” (a phrase that I will henceforth capitalize whenever I use it in the way that McGinnis and Rappaport do). McGinnis and Rappaport come close to treating The Language of the Law as a full-blown language on the order of French and Japanese, which I don’t think is justified by the facts. The other major problem that I see lies in the analogy that the paper draws between the rules of legal interpretation and what it calls the “interpretive rules” of ordinary language (which are better described as the cognitive processes involved in the comprehension of utterances and texts). This analogy, which plays a key role in McGinnis and Rappaport’s argument, is invalid because the two things being analogized are fundamentally dissimilar.

