illinois-ml-nlp-users - [Illinois-ml-nlp-users] Missing out on many potential entities using Curator / Wikifier

illinois-ml-nlp-users AT lists.cs.illinois.edu

Subject: Support for users of CCG software closed 7-27-20

  • From: Bahareh Sarrafzadeh <bsarrafz AT uwaterloo.ca>
  • To: illinois-ml-nlp-users AT cs.uiuc.edu
  • Subject: [Illinois-ml-nlp-users] Missing out on many potential entities using Curator / Wikifier
  • Date: Sun, 27 Jan 2013 19:54:39 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/illinois-ml-nlp-users/>
  • List-id: Support for users of CCG software <illinois-ml-nlp-users.cs.uiuc.edu>

Hi,

We are using Curator to extract entities by combining the outputs of Illinois NER and Wikifier. Looking through the annotated text, I see that many NPs were not tagged as entities even though they have a corresponding page in Wikipedia.

For example, for the following text:

House Of Our Dreams .
We had purchased our first home while we were engaged.
It was a fantastic two story - four bedroom subdivision home in a great neighbourhood with lots of kids.
It was situated close to all kinds of convenient amenities, a lake, beautiful parks and great schools - all within walking distance.
All four of our children were born while we lived in that home.
It was a great place to live and raise a family and it will always hold incredible memories for both of us.
It was not, however, the house of our dreams.

"lake" is extracted as an entity by Wikifier, while "parks", "schools", and "house" are not, even though all of them have corresponding pages in Wikipedia.

In the online Wikifier demo, all of these words are shown in bold. Does that mean they were entity candidates that, for some reason, were not linked to Wikipedia?

This is the output of Curator for "lake", "parks" and "schools":

Term from text: 'lake'
Properties: 
RankerScore, 0.6546164451802008; 
IsLinked, true; 
SurfaceFormWikiCatAttribs, lake; 
TitleWikiCatAttribs, lake wetland; 
LinkerScore, 0.8995801379665003; 
----------------------
Term from text: 'parks'
Label: UNMAPPED
Properties: 
RankerScore, -999.0; 
IsLinked, false; 
SurfaceFormWikiCatAttribs, parks; 
TitleWikiCatAttribs, ; 
LinkerScore, -999.0; 
----------------------
Term from text: 'schools'
Label: UNMAPPED
Properties: 
RankerScore, -999.0; 
IsLinked, false; 
SurfaceFormWikiCatAttribs, school; 
TitleWikiCatAttribs, ; 
LinkerScore, -999.0; 
----------------------
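
To check how many terms are affected across a whole document, a dump like the one above can be scanned for entries with IsLinked set to false. The following is a minimal sketch that assumes the output has been saved as plain text in exactly the layout shown here; the function names `parse_curator_dump` and `unlinked_terms` are hypothetical helpers and not part of Curator or Wikifier.

```python
import re

# Shortened sample in the same layout as the dump above.
SAMPLE = """\
Term from text: 'lake'
Properties:
RankerScore, 0.6546164451802008;
IsLinked, true;
LinkerScore, 0.8995801379665003;
----------------------
Term from text: 'parks'
Label: UNMAPPED
Properties:
RankerScore, -999.0;
IsLinked, false;
TitleWikiCatAttribs, ;
LinkerScore, -999.0;
----------------------
"""


def parse_curator_dump(text):
    """Parse 'Term from text' blocks into a list of dictionaries (hypothetical helper)."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Term from text: '(.*)'$", line)
        if match:
            # A new term starts a new record.
            current = {"term": match.group(1), "label": None, "properties": {}}
            records.append(current)
        elif current is None or not line or line.startswith("---") or line == "Properties:":
            # Skip separators, blank lines, and the 'Properties:' header.
            continue
        elif line.startswith("Label:"):
            current["label"] = line.split(":", 1)[1].strip()
        elif "," in line:
            # Property lines look like 'Name, value;' (the value may be empty).
            key, value = line.rstrip(";").split(",", 1)
            current["properties"][key.strip()] = value.strip()
    return records


def unlinked_terms(records):
    """Return the terms whose IsLinked property is 'false'."""
    return [r["term"] for r in records if r["properties"].get("IsLinked") == "false"]


if __name__ == "__main__":
    print(unlinked_terms(parse_curator_dump(SAMPLE)))
```

Property values are kept as strings, so scores such as -999.0 would need an explicit float() conversion before any numeric comparison.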

Reading the paper did not help me understand why these nouns were not mapped to Wikipedia. I would appreciate any clarification.

Thanks,
Bahar


