Numbers of Words in WSJ

The Wall Street Journal (WSJ) has a column devoted to statistics, The Numbers Guy by Carl Bialik, which recently talked about words. Making Every Word Count is an update on the state of Corpus Linguistics, the study of using large collections of words (Corpus) to analyze language (Linguistics). The field has really been heating up in the last decade, producing some very nice dictionaries based on corpora.

Computers have spawned a burst of activity in the field. But even computers don’t suffice for the daunting task of word collecting and counting. Brown University’s one-million-word corpus was considered adequate in the 1960s. Today, the 100-million-word British National Corpus is considered small — and dated — because it preceded the Internet era, and other sources of new language.

The problem these days is that spoken language costs about five times as much to collect as written text. And with the larger corpora needed to draw finer distinctions in language, the cost becomes prohibitive.


Edge: John Brockman: Clay Shirky: Voting Republican

I love summer vacation. I get to sit and read stuff I would never “have time for” during regular work. Case in point: Edge, an online magazine by John Brockman, my favorite non-fiction author (since The Third Culture). The last two articles have been great. The most recent is What makes people vote Republican?

In it, the author describes how elections are about perceptions of morality, how part of our belief systems may even be genetic, and how the Democrats, especially the academics and intellectuals, don’t get it.

Even better is the commentary, by a linguist/missionary to the Amazon, a Harvard psychologist and inventor of Multiple Intelligences, a Skeptic, a neurologist (my favorite), another anthropologist, and an Artificial Intelligence (AI)/education specialist (one of my favorite thinkers). This is the kind of thing you sit around the table talking about after dinner.

And there’s more. One of my favorite books this year is Clay Shirky’s Here Comes Everybody. It is about how reduced transaction costs (you do this for me and I’ll do that for you) on the Internet have changed culture as we understand it, starting with business but extending into every part of our lives.

In Edge, he speaks (see the video, scroll down about 3 articles) about Gin, Television, and Cognitive Surplus. The main idea is that while gin allowed us to make the transition from an agrarian to an urban society by acting as a buffer while appropriate institutions were established (libraries, streets), it is TV that takes that role in our switch from an industrial to an informational society. And as with gin, if we can get “off” Gilligan’s Island, there is a huge potential cultural resource available in the form of people’s time. See the video (again, scroll down) and then read the comments. You will find out that there is never any reason for saying, “I don’t have enough time,” and why our next generation will never be satisfied with sitcoms.

I was having dinner with a group of friends about a month ago, and one of them was talking about sitting with his four-year-old daughter watching a DVD. And in the middle of the movie, apropos nothing, she jumps up off the couch and runs around behind the screen. That seems like a cute moment. Maybe she’s going back there to see if Dora is really back there or whatever. But that wasn’t what she was doing. She started rooting around in the cables. And her dad said, “What you doing?” And she stuck her head out from behind the screen and said, “Looking for the mouse.”

Here’s something four-year-olds know: A screen that ships without a mouse ships broken.

I’ll be teaching kids like that in a few years.



cck08 about to start

One of the blogs I read (I think it was Robin Good’s) sent me to cck08, an open online class in Connectivism. I’ve been curious about this concept of distributed knowledge and its use in learning and education. It is coming at just the right time. George Siemens posts in blogs I read regularly, and I am looking forward to interacting with what looks to be a huge group (200 have sent in self-introductions, 2,000 have expressed an interest). Can’t wait. Look for my reactions here and in the class Moodle. On Twitter, I’m TokyoKevin.