nl-uiuc AT lists.cs.illinois.edu

Subject: Natural language research announcements

[nl-uiuc] FW: Data Science Summer Institute Talk, Shaul Markovitch, July 11th at 9:00

From: "Hockenmaier, Julia Constanze" <juliahmr AT cs.uiuc.edu>
To: "nl-uiuc AT cs.uiuc.edu" <nl-uiuc AT cs.uiuc.edu>
Subject: [nl-uiuc] FW: Data Science Summer Institute Talk, Shaul Markovitch, July 11th at 9:00
Date: Wed, 9 Jul 2008 12:24:36 -0500
Accept-language: en-US
List-archive: <http://lists.cs.uiuc.edu/pipermail/nl-uiuc>
List-id: Natural language research announcements <nl-uiuc.cs.uiuc.edu>

And another DSSI talk on Friday!

------ Forwarded Message
From: "Schaefer, Melinda M" <mschaefr AT cs.uiuc.edu>
Date: Wed, 9 Jul 2008 09:33:44 -0500
To: <ifaculty AT cs.uiuc.edu>, <cs-grads AT cs.uiuc.edu>
Cc: <mschaefr AT uiuc.edu>, "King, Robin Brian" <rbking AT uiuc.edu>
Conversation: Data Science Summer Institute Talk, Shaul Markovitch, July 11th at 9:00
Subject: Data Science Summer Institute Talk, Shaul Markovitch, July 11th at 9:00

University of Illinois at Urbana-Champaign

Department of Computer Science
The Thomas M. Siebel Center for Computer Science
201 North Goodwin Avenue
Urbana, Illinois 61801-2302 USA

Data Science Summer Institute Talk

The Knowledgeable Computer: Using Wikipedia-based Semantics for Text
Processing

Shaul Markovitch, Faculty Member
Computer Science Department
Technion - Israel Institute of Technology
Friday, July 11, 2008 at 9:00 A.M.
2405 Siebel Center for Computer Science

Abstract:
When humans perform text-processing tasks, such as text categorization,
information retrieval and finding related documents, they interpret the
specific wording of the document in the much larger context of their
background knowledge and experience. On the other hand, state-of-the-art
text processing programs are quite brittle - they mostly rely on the
frequency of word occurrences without using common-sense knowledge.

We propose to enrich document representation through automatic use of a vast
compendium of human knowledge - an encyclopedia. We define a new type of
Wikipedia-based semantics that uses the collection of Wikipedia articles as
an ontology. Every Wikipedia article represents a concept. Every word or
text fragment is represented as a point in the multi-dimensional space
spanned by these concepts.
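
As a rough illustration of the representation described above, the
following minimal sketch maps a text fragment to a weighted vector over
concepts and compares two fragments in that space. It is not the
speaker's implementation: the three toy "articles", the smoothed TF-IDF
weighting, and the names ARTICLES, build_index, concept_vector and
cosine are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy "articles": concept title -> article text.
# A real system would use the full collection of Wikipedia articles.
ARTICLES = {
    "Computer": "computer program software hardware program",
    "Banking": "bank money loan interest money",
    "River": "river water bank flow water",
}


def build_index(articles):
    """Map each word to its weight in every concept that mentions it."""
    n = len(articles)
    counts = {concept: Counter(text.split()) for concept, text in articles.items()}
    # Document frequency: in how many articles does each word occur?
    df = Counter(word for cnt in counts.values() for word in cnt)
    index = {}
    for concept, cnt in counts.items():
        for word, tf in cnt.items():
            # Assumed weighting: term frequency times a smoothed IDF.
            idf = math.log(n / df[word]) + 1.0
            index.setdefault(word, {})[concept] = tf * idf
    return index


def concept_vector(text, index):
    """Represent a text as a point in concept space (sum of its word vectors)."""
    vec = Counter()
    for word in text.lower().split():
        for concept, weight in index.get(word, {}).items():
            vec[concept] += weight
    return dict(vec)


def cosine(u, v):
    """Cosine similarity of two sparse concept vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(concept, 0.0) for concept, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


INDEX = build_index(ARTICLES)

if __name__ == "__main__":
    a = concept_vector("money loan", INDEX)
    b = concept_vector("interest", INDEX)
    print(a, b, cosine(a, b))
```

In this toy setup "money loan" and "interest" share no words, yet both
map onto the same Banking concept, so they come out as related. An
ambiguous word such as "bank" spreads its weight over both Banking and
River.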

When performing text-processing tasks, such as text categorization, we
enrich the processed documents with Wikipedia concepts, thus allowing a much
more knowledgeable inference. Empirical evaluation of our method in the
context of text categorization, information retrieval and computing semantic
relatedness shows that such knowledge-intensive representation can indeed
enhance performance in these domains significantly.

This work was done in collaboration with Ofer Egozi and Evgeniy Gabrilovich.

Homepage: http://www.cs.technion.ac.il/~shaulm/

Melinda Schaefer
Department of Computer Science
University of Illinois, Urbana/Champaign
201 N. Goodwin Ave
2232 Siebel Center, MC-258
Urbana, IL 61801
(217)333-6454
mschaefr AT uiuc.edu

------ End of Forwarded Message
