[nl-uiuc] Reminder: AIIS talk by Joseph Keshet


  • From: Rajhans Samdani <rsamdan2 AT illinois.edu>
  • To: nl-uiuc AT cs.uiuc.edu, aivr AT cs.uiuc.edu, dais AT cs.uiuc.edu, cogcomp AT cs.uiuc.edu, vision AT cs.uiuc.edu, eyal AT cs.uiuc.edu, aiis AT cs.uiuc.edu, aistudents AT cs.uiuc.edu, "Girju, Corina R" <girju AT illinois.edu>
  • Subject: [nl-uiuc] Reminder: AIIS talk by Joseph Keshet
  • Date: Fri, 12 Nov 2010 12:07:57 -0600 (CST)
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/nl-uiuc>
  • List-id: Natural language research announcements <nl-uiuc.cs.uiuc.edu>

Hi all,

A gentle reminder for today's AIIS seminar.

When: Nov 12, 2 pm.

Where: 3405 SC.

Speaker: Joseph Keshet (http://ttic.uchicago.edu/~jkeshet/)

Title: Direct Loss Minimization for Structured Prediction with Applications to Speech Recognition

Abstract:
In discriminative machine learning one is interested in training a system to optimize a certain desired measure of performance, or loss. In binary classification one typically tries to minimize the error rate. But in structured prediction each task often has its own measure of performance, such as the BLEU score in machine translation or the word error rate in speech recognition. The most common approaches to structured prediction, structural SVMs and CRFs, do not minimize the task loss: the former minimizes a surrogate loss with no guarantees for the task loss, and the latter minimizes log loss independently of the task loss. In the first part of the talk we present a theorem stating that a certain perceptron-like learning rule, involving feature vectors derived from loss-adjusted inference, directly corresponds to the gradient of the task loss. We present empirical results on phonetic alignment that surpass all previously reported results on this problem on the TIMIT corpus.
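As a rough illustration of the kind of update described above, here is a minimal sketch assuming a linear model score(x, y) = w . phi(x, y) over a small candidate set. The helpers phi() and task_loss() and the candidate enumeration are hypothetical placeholders, not the speaker's implementation, and the exact rule and perturbation direction used in the talk may differ.

# Sketch of a perceptron-like direct-loss update with loss-adjusted
# inference, for a linear structured model (assumed setup, see above).
import numpy as np

def direct_loss_update(w, x, y_true, phi, task_loss, candidates,
                       eta=0.1, eps=0.01):
    """One stochastic step against a finite-difference approximation
    of the task-loss gradient."""
    # Ordinary inference: the model's current prediction.
    y_pred = max(candidates, key=lambda y: float(w @ phi(x, y)))
    # Loss-adjusted inference: the score is perturbed toward labels
    # with higher task loss.
    y_adj = max(candidates,
                key=lambda y: float(w @ phi(x, y))
                              + eps * task_loss(y, y_true))
    # As eps -> 0, (phi(x, y_adj) - phi(x, y_pred)) / eps approximates
    # the gradient of the expected task loss; descend along it.
    return w - (eta / eps) * (phi(x, y_adj) - phi(x, y_pred))

Note that when the loss-adjusted prediction agrees with the ordinary one the update is zero, so in practice eps trades off the bias and variance of this gradient estimate.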

In the second part of the talk, we describe a new algorithm which aims at minimizing the regularized task loss. We state a PAC-Bayesian generalization bound, which gives an upper bound on the expected task loss in terms of the empirical task loss. Our algorithm is derived by taking the gradient of the PAC-Bayesian bound and minimizing the bound by stochastic gradient descent. The resulting algorithm is iterative and easy to implement. Experiments on phoneme recognition on the TIMIT corpus, where the task loss is chosen to be the phoneme error rate, show that our method achieves the lowest phoneme error rate compared to other discriminative and generative models with the same expressive power.
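For orientation, a generic McAllester-style PAC-Bayesian bound has the following shape; the exact bound stated in the talk, and hence its regularization term, may differ:

\[
\mathbb{E}_{w \sim Q}\!\left[ L(w) \right] \;\le\;
\mathbb{E}_{w \sim Q}\!\left[ \hat{L}_m(w) \right] +
\sqrt{ \frac{ \mathrm{KL}(Q \,\|\, P) + \ln(m/\delta) }{ 2(m-1) } }
\]

This holds with probability at least \(1-\delta\) over an i.i.d. sample of size \(m\), simultaneously for all posteriors \(Q\) over the parameters \(w\), where \(P\) is a fixed prior, \(L\) is the expected task loss, and \(\hat{L}_m\) is the empirical task loss. Minimizing the right-hand side over the parameters of \(Q\) by stochastic gradient descent yields the kind of iterative algorithm described above.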

Joint work with David McAllester and Tamir Hazan.

Thanks!
Best,
Rajhans


Rajhans Samdani,
Graduate Student,
Dept. of Computer Science,
University of Illinois at Urbana-Champaign.


