
[nl-uiuc] Upcoming Talk in the AIIS Seminar


  • From: Chang Ming-Wei <mchang21 AT uiuc.edu>
  • To: nl-uiuc AT cs.uiuc.edu, aivr AT cs.uiuc.edu, dais AT cs.uiuc.edu, cogcomp AT cs.uiuc.edu, vision AT cs.uiuc.edu, krr-group AT cs.uiuc.edu, aiis AT cs.uiuc.edu
  • Subject: [nl-uiuc] Upcoming Talk in the AIIS Seminar
  • Date: Mon, 30 Mar 2009 21:23:19 +0300
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/nl-uiuc>
  • List-id: Natural language research announcements <nl-uiuc.cs.uiuc.edu>

Dear faculty and students,

    Alexandre Klementiev, a Ph.D. candidate in the CS department, will give a
talk (details below) at the AIIS seminar at 4:00 pm on Thursday, Apr 2nd, in
room 3405. Hope to see you there!


Title:

Learning with Incidental Supervision

Abstract:

Moving toward understanding and automatic generation of natural human languages requires a toolbox of core capabilities.  It is well accepted today that it is essentially impossible to manually encode many of these capabilities without the aid of machine learning techniques, which acquire them automatically from available natural language data.  Corpus-based supervised learning has emerged as the dominant approach, and it relies crucially on the availability of labeled data.  However, while unlabeled data is usually plentiful, annotating it is a laborious process for a number of realistic Natural Language Processing tasks, especially those dealing with structured output spaces.

In this talk, I will argue that it is often possible to derive a surrogate supervision signal from a small amount of background knowledge and often plentiful, weakly structured unlabeled data.  We call this setting "learning with incidental supervision", and study it in the context of the following tasks.  First, we consider the problem of Named Entity (NE) annotation transfer to a resource-poor language in a bilingual corpus.  We demonstrate that temporal similarity of NE counterparts across languages can be used as an incidental supervision signal to drive learning of a discriminative transliteration model.  Second, we consider the task of unsupervised aggregation of structured output models.  We demonstrate for ranked data that agreement between constituent models can serve as an incidental supervision signal sufficient to learn an effective aggregation model.
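
To give a flavor of the first idea, here is a small, purely illustrative sketch (not the speaker's actual method; all names, thresholds, and the toy data are made up): named entities from the two sides of a bilingual news stream are paired whenever their day-by-day mention counts have high cosine similarity, and such high-scoring pairs could then serve as noisy positive examples for training a transliteration model, with no manual labels required.

    # Hypothetical sketch of temporal similarity as an incidental supervision
    # signal; names, threshold, and data are assumptions for illustration only.
    import math

    def cosine(u, v):
        """Cosine similarity between two equal-length count vectors."""
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv) if nu and nv else 0.0

    def mine_candidate_pairs(src_counts, tgt_counts, threshold=0.8):
        """Pair source- and target-language named entities whose per-day
        mention counts in a bilingual news stream are temporally similar.
        High-scoring pairs act as noisy positive training examples for a
        transliteration model."""
        pairs = []
        for src, u in src_counts.items():
            for tgt, v in tgt_counts.items():
                score = cosine(u, v)
                if score >= threshold:
                    pairs.append((src, tgt, score))
        return sorted(pairs, key=lambda p: -p[2])

    # Toy usage: the made-up counterpart "xusein" spikes on the same days
    # as "hussein", so the pair is mined without any labeled examples.
    src = {"hussein": [0, 5, 9, 1, 0, 0], "paris": [2, 2, 2, 2, 2, 2]}
    tgt = {"xusein":  [0, 4, 8, 2, 0, 0], "parizh": [2, 1, 3, 2, 2, 2]}
    print(mine_candidate_pairs(src, tgt))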

Bio:

Alexandre Klementiev is a Ph.D. candidate in the UIUC Department of Computer Science working with Prof. Dan Roth.  His research interests lie at the intersection of Machine Learning and Natural Language Processing.  More specifically, he is interested in weakly supervised learning problems in NLP, multilingual information extraction, and information fusion.


