Words, Meanings, Corpora: A Lawyer’s Introduction to Meaning in the Framework of Corpus Linguistics

On Friday I will be presenting a paper at a conference at Brigham Young University Law School on law and corpus linguistics. Here is the description from the conference website:

Building on the 2016 inaugural Law and Corpus Linguistics Conference, the 2017 BYU Law Review Symposium, “Law & Corpus Linguistics” brings together legal scholars from across various substantive areas of scholarship, prominent corpus linguistics scholars, and judges who have employed corpus linguistics analysis in their decisions.

Although there’s a link on the webpage for the papers that will be presented, they are password-protected. However, my paper is posted on SSRN and can be downloaded there. It is titled Words, Meanings, Corpora: A Lawyer’s Introduction to Meaning in the Framework of Corpus Linguistics, and the abstract is below the fold.


Corpus linguistics has been promoted as a new tool for legal interpretation that provides an alternative to dictionaries. But that is not its only significance. In addition to providing new methodologies, corpus linguistics (and in particular corpus-based lexicography) provides important insights about the nature of word meaning, and about the interpretation of words in context. These insights (by linguists and lexicographers such as John Sinclair, Patrick Hanks, Sue Atkins, and Adam Kilgarriff) challenge the assumptions that underlie lawyers’ and judges’ analyses of word meaning.

As one might expect given the centrality of dictionaries in disputes over word meaning, legal interpretation presupposes a view of word meaning that is essentially the same as the view that is fostered by dictionaries. Under this view, individual words are the basic units of meaning from which the meanings of sentences are built. Word meanings are seen as discrete entities with (in most cases) clear boundaries.

But corpus linguistics and corpus-based lexicography have shown that the reality is different. Clear boundaries between the meanings of different words, or between the different senses of the same word, often do not exist. Drawing lines between different word senses often has an unavoidable element of arbitrariness, as is shown by the fact that the lines are often drawn differently by different dictionaries. These differences raise questions about the validity of legal interpreters’ relying on dictionaries at all, and at a minimum suggest the need for changes in how dictionaries are used.

Corpus linguistics and corpus-based lexicography have also cast doubt on the view (which most people would regard as simple common sense) that words are the basic unit of meaning, and that the meaning of a sentence can be computed by applying the rules of grammar to the meaning of the individual words. It turns out that in many cases, it makes more sense to regard multiword expressions as the basic units of meaning. The meaning of the whole often differs from the sum of the meanings of the words, in part because a word’s meaning in context can be affected by the words it co-occurs with and the grammatical constructions it is part of.

As a result of these insights, corpus linguistics opens up new ways of thinking about word meaning—which translates into new modes of argumentation and analysis. To illustrate the possibilities, I will take a fresh look at Muscarello v. United States, 524 U.S. 125 (1998), which presented the question whether driving a car or truck with a firearm in the trunk or glove compartment amounted to “carrying” the firearm. Although Muscarello has already been the subject of a corpus-based analysis by Steven Mouritsen, his analysis focused on which of two dictionary senses of the word carry was more common, and therefore assumed the conception of word meaning that is generally reflected in legal interpretation. My approach will differ from Mouritsen’s in two respects. First, rather than look only at which one of two senses is more common, I will ask a more open-ended question: when viewed without preconceptions, what does the corpus data tell us about how the word carry behaves? Second, I will look at the data through the lens of Corpus Pattern Analysis, a corpus-driven lexicographic approach that focuses on multiword patterns rather than on individual word meanings.


