[Milton-L] Culturomics? Genome?

Sara van den Berg vandens at slu.edu
Fri Dec 17 11:21:20 EST 2010


Our colleagues who are professional lexicographers (at G.C. Merriam and
other companies, as well as the professional scholarly society of
lexicographers) spend a great deal of time gathering new words and new
usages that enter written English.  The late Fred Mish, who was the editor
at G.C. Merriam, subscribed to the "descriptive" concept of such work.  That
means he and his group wanted to include whatever is actually used,
regardless of "rules" of grammar, etc.  I read elsewhere about the Harvard
researchers, and there was no mention of the ongoing work by lexicographers
other than the dismissive comment that "most of these words do not appear in
dictionaries."  The Harvard researchers do not make clear the basis of their
"estimate."  In a recent issue of the New York Times, William Safire
eulogized four major lexicographers (including Fred Mish) who died during
this past year.

I agree with Tom that the uncritical reliance on "Google data" is very
problematic.

Sara van den Berg

On Fri, Dec 17, 2010 at 9:53 AM, Thomas H. Luxon <
Thomas.H.Luxon at dartmouth.edu> wrote:

> Fellow scholars,
>
> I read this in today's Guardian about two "culturomics" researchers at
> Harvard who are using Google data and $ to study the English language
> "genome":
>
> "In their initial analysis of the database, the team found that around
> 8,500 new words enter the English language every year and the lexicon grew
> by 70% between 1950 and 2000. But most of these words do not appear in
> dictionaries. "We estimated that 52% of the English lexicon – the majority
> of words used in English books – consist of lexical 'dark matter'
> undocumented in standard references," they wrote in the journal Science (the
> full paper is available with free online registration)."
>
> Let's talk a bit about terms like "culturomics" and "genome" and the
> apparent need to sound like a scientist (a wacky scientist at that) in order
> to be taken seriously by the media and govt grant dispensers these days.
>
> But first, let me try to cast some doubt on the notion that 52% of the
> English lexicon (as represented by 4% of the books ever published in
> English), "the majority of words used in English books," does not appear in
> any dictionaries or other reference books.  This claim falls so far outside
> my experience as a reader and dictionary user that I want to say, "Are you
> kidding?"  Maybe their computer algorithm is good at searching a word
> database and very poor at using a dictionary.  I suspect that their search
> algorithm (Harvard's, not Google's) fails to allow for any sort of
> conjugation and inflection, so that, for example, the word "indirectly"
> comes up as "dark matter."  Is this the future of high-funded digital
> humanities?  What can we do about this?
>
> Tom Luxon
> Cheheyl professor and Director
> Dartmouth Center for the Advancement of Learning
> Professor of English