[Milton-L] Culturomics? Genome?

Horace Jeffery Hodges jefferyhodges at yahoo.com
Fri Dec 17 18:38:23 EST 2010

Sara van den Berg wrote:
"In a recent issue of the New York Times, William Safire eulogized four major 
lexicographers (including Fred Mish) who died during this past year."
But hasn't Safire himself been dead for a couple of years?
Jeffery Hodges

From: Sara van den Berg <vandens at slu.edu>
To: John Milton Discussion List <milton-l at lists.richmond.edu>
Sent: Sat, December 18, 2010 1:21:20 AM
Subject: Re: [Milton-L] Culturomics? Genome?

Our colleagues who are professional lexicographers (at G.C. Merriam and other 
companies, as well as the professional scholarly society of lexicographers) 
spend a great deal of time gathering new words and new usages that enter written 
English.  The late Fred Mish, who was the editor at G.C. Merriam, subscribed to 
the "descriptive" concept of such work.  That means he and his group wanted to 
include whatever is actually used, regardless of "rules" of grammar, etc.  I 
read elsewhere about the Harvard researchers, and there was no mention of the 
ongoing work by lexicographers other than the dismissive comment that "most of 
these words do not appear in dictionaries."  The Harvard researchers do not make 
clear the basis of their "estimate."  In a recent issue of the New York Times, 
William Safire eulogized four major lexicographers (including Fred Mish) who 
died during this past year.   

I agree with Tom that the uncritical reliance on "Google data" is very 

Sara van den Berg

On Fri, Dec 17, 2010 at 9:53 AM, Thomas H. Luxon <Thomas.H.Luxon at dartmouth.edu> wrote:

>Fellow scholars,
>I read this in today's Guardian about two "culturomics" researchers at Harvard 
>who are using Google data and $ to study the English language "genome":
>"In their initial analysis of the database, the team found that around 8,500 new 
>words enter the English language every year and the lexicon grew by 70% between 
>1950 and 2000. But most of these words do not appear in dictionaries. "We 
>estimated that 52% of the English lexicon – the majority of words used in 
>English books – consists of lexical 'dark matter' undocumented in standard 
>references," they wrote in the journal Science (the full paper is available with 
>free online registration)."
>Let's talk a bit about terms like "culturomics" and "genome" and the apparent 
>need to sound like a scientist (a wacky scientist at that) in order to be taken 
>seriously by the media and govt grant dispensers these days.
>But first, let me try to cast some doubt on the notion that 52% of the English 
>lexicon (as represented by 4% of the books ever published in English), the 
>majority of words used in English books, does not appear in any dictionaries or 
>other reference books.  This claim falls so far outside my experience as a 
>reader and dictionary user that I want to say, "Are you kidding?"  Maybe their 
>computer algorithm is good at searching a word database and very, very poor at 
>using a dictionary. I suspect that their search algorithm (Harvard's, not 
>Google's) fails to allow for any sort of conjugation and inflection, so, for 
>example, the word "indirectly" comes up as "dark matter."  Is this the future 
>of high-funded digital humanities?  What can we do about this?
>Tom Luxon
>Cheheyl Professor and Director
>Dartmouth Center for the Advancement of Learning
>Professor of English