Monthly Archives: August 2011

Whoa

I’m late in learning about this; it apparently went public back in May, but doesn’t seem to have attracted much blogospheric notice.

Mark Davies, the proprietor of the Corpus of Contemporary American English (COCA) and the Corpus of Historical American English (COHA), has made another corpus available via the same interface as COCA and COHA. This one’s a little bit bigger, though.

155 billion words, 62 billion of them from the 1980s-2000s.

Billion with a b. Bill-yun.

Update: Apparently some of the features of the interface aren’t available yet.