edmond lau

Learning Lexical Clusters in Children's Books

April :: May 2004 -- [executive summary] [presentation] [code]

Intuitively, a child who hears a phrase such as "Cinderella's glass slippers" may also gain confidence in using a related phrase such as "Barbie's plastic slippers." If we are to understand how human beings acquire and use knowledge about language, we need to examine the role of statistical regularity in language learning. As a first step toward tackling this problem, for my 6.xxx (Human Intelligence Enterprise) final project I proposed the concept of lexical clusters -- collections of words that have statistically similar linkage structures with other words -- and implemented a clustering algorithm in Java to explore the role that lexical clusters may play in language learning. For example, after parsing the phrase "the city mouse and the country mouse," the algorithm grouped "city" and "country" into the same lexical cluster because both modify the word "mouse" in the same manner.
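To make the idea of a lexical cluster concrete, here is a minimal Java sketch. It is not the project's actual code: the class name `LexicalClusterSketch`, the `relation:head` encoding of a link, the use of Jaccard similarity, and the greedy grouping rule are all assumptions chosen for illustration. It also assumes the links have already been extracted by a parser and simply groups words whose link sets overlap enough.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; names and similarity measure are assumptions.
class LexicalClusterSketch {

    // Records that `word` is linked to `head` by `relation`,
    // e.g. ("city", "modifies", "mouse"). Links are stored as "relation:head".
    static void addLink(Map<String, Set<String>> links,
                        String word, String relation, String head) {
        links.computeIfAbsent(word, k -> new HashSet<>()).add(relation + ":" + head);
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return (double) common.size() / union.size();
    }

    // Greedy single pass: a word joins the first existing cluster that contains
    // a member with similarity >= threshold; otherwise it starts a new cluster.
    static List<Set<String>> cluster(Map<String, Set<String>> links, double threshold) {
        List<Set<String>> clusters = new ArrayList<>();
        for (String word : links.keySet()) {
            Set<String> home = null;
            for (Set<String> candidate : clusters) {
                for (String member : candidate) {
                    if (jaccard(links.get(word), links.get(member)) >= threshold) {
                        home = candidate;
                        break;
                    }
                }
                if (home != null) {
                    break;
                }
            }
            if (home == null) {
                home = new LinkedHashSet<>();
                clusters.add(home);
            }
            home.add(word);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Hand-written links standing in for parser output (an assumption).
        Map<String, Set<String>> links = new LinkedHashMap<>();
        addLink(links, "city", "modifies", "mouse");
        addLink(links, "country", "modifies", "mouse");
        addLink(links, "glass", "modifies", "slippers");
        addLink(links, "plastic", "modifies", "slippers");
        System.out.println(cluster(links, 0.5));
    }
}
```

With these four hand-written links, the sketch prints `[[city, country], [glass, plastic]]`: "city" and "country" share the same link to "mouse", and "glass" and "plastic" share the same link to "slippers". A greedy pass like this depends on input order and would be too crude for a real corpus, where link counts rather than plain sets would likely matter.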

Copyright © 2006 Edmond Lau