Whoa

I’m late in learning about this; it apparently went public back in May, but doesn’t seem to have attracted much blogospheric notice.

Mark Davies, the proprietor of the Corpus of Contemporary American English (COCA) and the Corpus of Historical American English (COHA), has made another corpus available via the same interface as COCA and COHA. This one’s a little bit bigger, though.

155 billion words, 62 billion of them from the 1980s-2000s.

Billion with a b. Bill-yun.

Update: Apparently some of the features of the interface aren’t available yet.
