[Milton-L] Culturomics? Genome?
rastrier at uchicago.edu
Fri Dec 17 12:25:07 EST 2010
YES, INDEED. SOMETHING SHOULD BE DONE TO COMBAT SUCH NONSENSE.
WHY NOT WRITE A LETTER TO THE GUARDIAN THAT LOTS OF US COULD SIGN?
---- Original message ----
>Date: Fri, 17 Dec 2010 10:53:52 -0500
>From: milton-l-bounces at lists.richmond.edu (on behalf of "Thomas H. Luxon"
<Thomas.H.Luxon at dartmouth.edu>)
>Subject: [Milton-L] Culturomics? Genome?
>To: John Milton Discussion List <milton-l at lists.richmond.edu>
>Cc: Richard Strier <rastrier at midway.uchicago.edu>, "C. Robertson McClung"
<C.Robertson.McClung at dartmouth.edu>, Aden Evens
<Aden.Evens at dartmouth.edu>, Flanagan <Mary.Flanagan at dartmouth.edu>,
Katharine Conley <Katharine.Conley at dartmouth.edu>,
Mary at koko.richmond.edu
>I read this in today's Guardian about two "culturomics" researchers at Harvard
who are using Google data and $ to study the English language "genome":
>"In their initial analysis of the database, the team found that around 8,500 new
words enter the English language every year and the lexicon grew by 70%
between 1950 and 2000. But most of these words do not appear in dictionaries.
"We estimated that 52% of the English lexicon – the majority of words used in
English books – consists of lexical 'dark matter' undocumented in standard
references," they wrote in the journal Science (the full paper is available with
free online registration)."
>Let's talk a bit about terms like "culturomics" and "genome" and the apparent
need to sound like a scientist (a wacky scientist at that) in order to be taken
seriously by the media and govt grant dispensers these days.
>But first, let me try to cast some doubt on the notion that 52% of the English
lexicon (as represented by 4% of the books ever published in English), the
majority of words used in English books, does not appear in any dictionaries or
other reference books. This claim falls so far outside my experience as a reader
and dictionary user that I want to say: Are you kidding? Maybe their computer
algorithm is good at searching a word database and very, very poor at using a
dictionary. I suspect that their search algorithm (Harvard's, not Google's) fails to
allow for any sort of conjugation and inflection, so that, for example, the word
"indirectly" comes up as "dark matter." Is this the future of high-funded digital
humanities? What can we do about this?
>Cheheyl Professor and Director
>Dartmouth Center for the Advancement of Learning
>Professor of English