[Cross-posted at Language Log.]
I’d imagine that most people who’ve been actively involved with corpus linguistics are familiar with the BYU corpora—a collection of web-accessible corpora created by Brigham Young University linguistics professor Mark Davies. These corpora (and BYU’s corpus-linguistics program more generally) have played an essential part in the development of what I’ll call the corpus-linguistic turn in legal interpretation. The BYU corpora served as my entry-point into corpus linguistics, and they have provided the corpus data that has been used in most of the law-and-corpus-linguistics work that has been done to date. And beyond that, the BYU Law School has played an enormous role, in a variety of ways, in Law and Corpus Linguistics becoming a thing.
One of the things that the law school has been doing has been happening largely behind the scenes. For the past two or three years, people there have been developing the Corpus of Founding Era American English (COFEA), a historical corpus that is intended as a resource for studying language usage in the time leading up to the drafting and ratification of the U.S. Constitution. At this year’s conference on law and corpus linguistics (the third such conference, all of them hosted by the BYU Law School), we were given a preview of COFEA. And via a tweet by the law school’s dean, Gordon Smith, I’ve now learned that a beta version of COFEA is up and available for public playing-around-with, as are beta versions of two other corpora: the Corpus of Early Modern English and the Corpus of Supreme Court of the United States.
Over about the past year, there’s been a significant increase in the attention being paid to the idea of using corpus linguistics in legal interpretation. One of the most recent developments has occurred in a case that will be argued next week in the Supreme Court, in which two of the amicus briefs rely on corpus linguistics (Brief of Scholars of Corpus Linguistics; Brief of Prof. Jennifer L. Mascott).
The case in question is Lucia v. Securities and Exchange Commission, and it raises the question whether federal Administrative Law Judges are “officers of the United States” within the meaning of the Appointments Clause of the Constitution. This is the first of what will be two or three posts that are prompted by the filing of these briefs. However, none of the posts will deal with the substance of the legal or linguistic issues in the case.
Lucia is the first Supreme Court case I’m aware of in which anyone has relied on corpus analysis since FCC v. AT&T, Inc., in which I filed an amicus brief that was largely corpus-based. It’s also, as far as I know, the only case in any court where corpus analysis has been used in a brief in connection with an issue of constitutional interpretation.
Carissa Hessick and I have been debating the appropriateness of using empirical methods in legal interpretation. The debate began on PrawfsBlawg, then moved over here (with some continued discussion at Prawfs), and then spread to Twitter. The relevant tweets are collected in my previous post, and in this post I’ll respond to Hessick’s most recent points.
As I understand her, Hessick contends that the issue of ordinary meaning isn’t an “empirical question” because the question of how a reasonable person would understand the text is inherently qualitative rather than quantitative, and therefore can’t be answered in a way that is “provable or verifiable.” I accept Hessick’s characterization of the ordinary-meaning issue as being qualitative rather than quantitative, but it doesn’t follow that quantitative information is always irrelevant.
My last post, Corpus linguistics: Empiricism and frequency, prompted a Twitter exchange between Carissa Hessick and me, a lightly edited version of which I present here.
One question based on my quick read: Do you think most people would understand “relying on linguistic intuition” to be an empirical undertaking? I appreciate the insight into how people’s linguistic intuitions are formed. But don’t most people think that, if something is an empirical question, that means there is a demonstrably correct answer?
And if we often have different intuitions about what a word means (as the split decisions on ordinary meaning illustrate), and if judges resolve the Q of ordinary meaning by consulting their own intuitions, then how can ordinary meaning be an empirical Q? If I have one intuition and you have another, then how do we demonstrate which is correct and which is incorrect?
This is the second in a series of posts about the essentially final version of Carissa Hessick’s article Corpus Linguistics and the Criminal Law. The first post dealt mainly with Hessick’s views about how corpus linguistics relates to the ultimate purpose of legal interpretation, which is to determine the legal meaning of the text in dispute. This time around, I’ll be discussing her claim that incorporating corpus linguistics into legal interpretation would radically transform the process of determining the text’s ordinary meaning:
Corpus linguistics reframes the “plain” or “ordinary” meaning inquiry in two ways. First, it claims that ordinary meaning is an empirical question. Second, it tells us that this empirical question ought to be answered by how frequently a term is used in a particular way. Both of these analytical moves represent significant departures from current theories of statutory interpretation, including textualism, and they render statutory interpretation essentially unrecognizable.
This statement is a mixed bag. In one respect, it’s correct. Those who support the use of corpus linguistics in legal interpretation do regard ordinary meaning as an empirical question—or at least as involving empirical questions. In a different respect, it is partly correct but oversimplified. Analysis of frequency data is in fact central to corpus linguistics, but it is not necessarily decisive, and in some cases (perhaps in many cases) it will not be helpful at all. And in a third respect, Hessick’s statement is wrong. Neither the empiricism of corpus linguistics nor the attention it pays to frequency represents a “significant departure” from existing interpretive theories.
Empiricism
I have two pieces of news I want to share.
First, I am very excited to say that I have received an appointment by the Georgetown University Law Center (aka Georgetown Law) as a Dean’s Visiting Scholar.
That appointment will provide me with a platform from which I’ll continue and expand on the kind of work that I’ve been doing here at LAWnLinguistics, in the amicus briefs in which I’ve drawn on linguistics, and in my paper A Lawyer’s Introduction to Meaning in the Framework of Corpus Linguistics: developing and promoting the idea that part of what it means to think like a lawyer is learning how to think like a linguist.
Sometimes, it’s immediately obvious from the opinions that a case raises questions about interpretation that are interesting, important, or both. Smith v. United States, in which the question was whether trading a handgun for drugs amounts to “using” it, is a classic example. At first glance, the Supreme Court’s decision in Artis v. District of Columbia doesn’t seem to be in that category. It doesn’t offer interesting linguistic issues that call attention to themselves, except for someone who is familiar with the work of the linguist John Sinclair and the lexicographer Patrick Hanks. But with some digging, Artis yields some issues that I think are interesting and significant, having to do with new approaches to analyzing questions of word meaning and with how not to use dictionaries.