Natural Language and Artificial Intelligence
HSSP Summer 2009, Instructor: Gregory Marton
July 12 -- First Steps
We will introduce the concepts of syntax, semantics, and meaning, in
both human languages and computer languages. We will take our first
steps in writing a Scheme program, and start to think deeply about
English.
- Computing: numbers, strings, booleans, procedures, conditionals, naming
- Linguistics: nouns, verbs, conditionals
- Lab exercise: get set up with Subversion and PLT Scheme; generate
simple sentences that say hello to someone and incorporate an
adjective about them.
July 19 -- Languages, Sounds, Alphabets
We'll look at the world's most common and least common languages, talk
about language change and language families, writing systems and
sounds, and all that makes one language different from another.
- Computing: lists: map and filter, recursion, machine learning, probability
- Linguistics: Languages of the world, language families, sounds and alphabets
- Lab exercise: Automatically identify whether a text is in English or Dutch.
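One way the language-identification lab could be approached is sketched below. This is only an illustration in Python (the course itself uses Scheme), and the two word lists are hypothetical, hand-picked function words rather than anything learned from data. It counts how many words of a text appear in each list, using filter in the spirit of this week's list operations.

```python
# Hypothetical, hand-picked function words; a real lab would likely learn
# these (or character statistics) from example texts.
ENGLISH_WORDS = {"the", "and", "of", "to", "that", "it", "in", "is"}
DUTCH_WORDS = {"de", "het", "een", "en", "van", "dat", "niet", "is"}


def count_known(words, vocabulary):
    """Count how many of the words appear in the given vocabulary."""
    return len(list(filter(lambda w: w in vocabulary, words)))


def guess_language(text):
    """Return 'english' or 'dutch', whichever word list matches more often.

    Ties go to English, which is an arbitrary choice for this sketch.
    """
    words = text.lower().split()
    english_score = count_known(words, ENGLISH_WORDS)
    dutch_score = count_known(words, DUTCH_WORDS)
    return "english" if english_score >= dutch_score else "dutch"
```

For example, guess_language("the cat is in the house") returns "english", while guess_language("de kat zit niet in het huis") returns "dutch".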
July 26 -- Morphology
What are words and what is the significance of spaces? Languages like
Turkish, Hungarian, and to some extent German, can express entire
sentences in one word. Languages like Chinese have words, but do not
use spaces in writing. Even in English, we can see regularities of
meaning that are smaller than words.
- Computing: associations, map over more than one list, ormap
- Linguistics: morphemes are the smallest units of meaning; regular
and irregular word endings; where words come from
- Lab exercise: Predict regular word endings and thus find the irregular ones.
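The idea behind this lab, predicting the regular ending and treating every mismatch as irregular, might look roughly like the following. It is a simplified Python sketch (not the course's Scheme code); the function names and the three spelling rules are assumptions made for illustration, and they cover only English past tense.

```python
def regular_past(verb):
    """Guess the regular past tense of an English verb (simplified rules)."""
    if verb.endswith("e"):
        return verb + "d"                      # bake -> baked
    if len(verb) > 1 and verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ied"               # try -> tried
    return verb + "ed"                         # walk -> walked


def find_irregulars(pairs):
    """Return the base forms whose attested past tense differs from the guess.

    pairs is a list of (base form, attested past tense) tuples.
    """
    return [base for base, past in pairs if regular_past(base) != past]
```

Given the pairs walk/walked, bake/baked, try/tried, play/played, go/went, and sing/sang, find_irregulars returns ["go", "sing"]. Because the rules ignore consonant doubling, a verb like stop/stopped would also be flagged, which shows how the quality of the "regular" prediction determines what counts as irregular.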
Aug 2 -- Words and Keyword Search
When we put morphemes together into words, they still don't convey just one meaning.
- Computing: mappings: list->mapping, mapping-reduce, indexing
- Linguistics: homographs and homophones, synonyms, hyponyms, meronyms, stemming
- Lab exercise: Search Twitter feeds case-sensitively and
case-insensitively, and while collapsing morphologically different
forms of the same word.
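The three search modes in this lab can be sketched as one function with two switches. This is a hedged Python illustration, not the course's Scheme code: crude_stem is a hypothetical stand-in for a real stemmer, and the posts are plain strings rather than an actual Twitter feed.

```python
def crude_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(query, posts, ignore_case=False, stem=False):
    """Return the posts containing the query word under the chosen matching."""
    def normalize(word):
        if ignore_case:
            word = word.lower()
        if stem:
            word = crude_stem(word)
        return word

    target = normalize(query)
    return [p for p in posts if target in {normalize(w) for w in p.split()}]
```

With the posts "Walking the dog", "I walked home", and "walk this way", a plain search for "walk" finds only the last one, while turning on both ignore_case and stem finds all three.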
Aug 9 -- Multi-word meanings
- Computing: nested data structures, counting bigrams
- Linguistics: verb-particle constructions, selectional restrictions
- Lab exercise: Extend Twitter search to search for sequences, rather
than sets, of words. Weight common words less than rare words.
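A rough Python sketch of the pieces this lab combines (the course itself works in Scheme). The syllabus does not say which weighting scheme is used; inverse document frequency is assumed here as one common choice, and all function names are hypothetical.

```python
import math
from collections import Counter


def bigrams(words):
    """Return the list of adjacent word pairs in a list of words."""
    return list(zip(words, words[1:]))


def contains_sequence(words, phrase):
    """True if phrase occurs in words as a contiguous, ordered run."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def rarity_weights(documents):
    """Weight each word by log(N / number of documents that contain it).

    A word found in every document gets weight 0; rarer words get more.
    """
    doc_freq = Counter(w for doc in documents for w in set(doc))
    total = len(documents)
    return {w: math.log(total / df) for w, df in doc_freq.items()}
```

Searching for the sequence ["pick", "up"] matches "pick up the phone" but not "the phone is up", even though a set-based search for "up" alone would match both.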
Aug 16 -- Syntax and Development
You've heard of subjects and predicates -- where do these come from?
Is there really structure to our sentences, or are they just n-grams,
like the multi-word entities? If we're careful, we can unambiguously
hear that there is structure in a sentence, but it's not easy, so how
do kids figure it out? What else do kids invent that they never
hear?
- Computing: visualization -- turning mappings into scatter plots and bar graphs
- Linguistics: syntax, wanna contraction, clefting for constituency,
coordination for constituent type, overregularization
- Lab exercise: We will use our search engine to search for incorrect
irregular verb endings in actual transcripts of children's speech at
various ages. We'll make graphs that show children's very odd
learning patterns.
Aug 23 -- Semantics and Grounding
Guest Lecturer: Stefanie Tellex
In order for computers to really understand anything of our language,
we and they must share some of the same kinds of experiences.
Computers are learning to see, to hear, to move around, and interact
with the world around them, so there is hope. But they need us to
teach them what things mean.
- Computing: automatic evaluation; review: features and machine learning
- Linguistics: semantics of spatial prepositions
- Lab exercise: annotate examples of time and spatial expressions
Aug 30 -- Meanings as Programs
Many meanings have to be grounded in the real world, and connected to
cameras and motors and experiences, but surely there are some meanings
that computers can more easily relate to, like numbers and time. Can
we get computers to talk and reason about numbers and about time in a
human language? How do we put together what we've learned about
morphemes, words, and structure, and teach a computer to communicate
coherently about a few things?
- Computing: procedures as return values, currying
- Linguistics: Eliza, procedural semantics, CCG, discourse and
coherence, Gricean implicature, brains
- Lab exercise: write meanings for some simple sentences, course evaluation
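To make "procedures as return values" and currying concrete, here is a small Python sketch (the course itself uses Scheme). The names curry2, more_than, and plus_three are hypothetical, and treating phrases about numbers as procedures is only one illustrative reading of "meanings as programs".

```python
def curry2(f):
    """Turn a two-argument function into a chain of one-argument functions."""
    return lambda x: lambda y: f(x, y)


def more_than(n):
    """A possible meaning for 'more than n': a procedure that tests a number."""
    return lambda x: x > n


# Curried addition: supplying the first argument returns a new procedure.
add = curry2(lambda x, y: x + y)
plus_three = add(3)  # a possible meaning for "three more than ..."
```

Here plus_three(4) evaluates to 7, and more_than(5)(7) is True, so the meaning of a phrase is itself a procedure waiting for the rest of the sentence.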