
illinois-ml-nlp-users - Re: [Illinois-ml-nlp-users] Missing out on many potential entities using Curator / Wikifier

illinois-ml-nlp-users AT lists.cs.illinois.edu

Subject: Support for users of CCG software closed 7-27-20

  • From: Lev-Arie Ratinov <arie.ratinov AT gmail.com>
  • To: Bahareh Sarrafzadeh <bsarrafz AT uwaterloo.ca>
  • Cc: illinois-ml-nlp-users <illinois-ml-nlp-users AT cs.uiuc.edu>
  • Subject: Re: [Illinois-ml-nlp-users] Missing out on many potential entities using Curator / Wikifier
  • Date: Mon, 28 Jan 2013 12:17:54 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/illinois-ml-nlp-users/>
  • List-id: Support for users of CCG software <illinois-ml-nlp-users.cs.uiuc.edu>

Hi Bahareh. Thanks for your interest. Some entities were filtered out of the index based on how often they are mentioned in Wikipedia, their linkability, and so on. Additionally, plurals are treated differently: 'parks' does not refer to a specific park. The bold words (e.g. 'parks') indicate the category of the noun. These categories were used in my paper on co-reference with Wikipedia, where you can read more details. If you hover over linked terms in my Wikifier demo, the categories will be displayed.

Best

Peace&Love


On Sun, Jan 27, 2013 at 7:54 PM, Bahareh Sarrafzadeh <bsarrafz AT uwaterloo.ca> wrote:
Hi,

We are using Curator to extract entities by combining the outputs of Illinois NER and Wikifier. Looking through the annotated text, I see that many NPs were not tagged as entities even though they have a corresponding page in Wikipedia.

For example for the following text:

House Of Our Dreams .
We had purchased our first home while we were engaged.
It was a fantastic two story - four bedroom subdivision home in a great neighbourhood with lots of kids.
It was situated close to all kinds of convenient amenities, a lake, beautiful parks and great schools - all within walking distance.
All four of our children were born while we lived in that home.
It was a great place to live and raise a family and it will always hold incredible memories for both of us.
It was not, however, the house of our dreams.

"lake" is extracted as an entity by Wikifier, while "parks", "schools" and "house" are not, even though all of them have corresponding pages in Wikipedia.

In the online Wikifier demo, all of these words are shown in bold. Were they entity candidates that for some reason were not linked to Wikipedia?

This is the output of Curator for "lake", "parks" and "schools":

Term from text: 'lake'
Properties: 
RankerScore, 0.6546164451802008; 
IsLinked, true; 
SurfaceFormWikiCatAttribs, lake; 
TitleWikiCatAttribs, lake wetland; 
LinkerScore, 0.8995801379665003; 
----------------------
Term from text: 'parks'
Label: UNMAPPED
Properties: 
RankerScore, -999.0; 
IsLinked, false; 
SurfaceFormWikiCatAttribs, parks; 
TitleWikiCatAttribs, ; 
LinkerScore, -999.0; 
----------------------
Term from text: 'schools'
Label: UNMAPPED
Properties: 
RankerScore, -999.0; 
IsLinked, false; 
SurfaceFormWikiCatAttribs, school; 
TitleWikiCatAttribs, ; 
LinkerScore, -999.0; 
----------------------
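
For readers who want to separate linked from unmapped terms in a dump like the one above, here is a minimal Python sketch. It assumes only the text layout printed in this message (a "Term from text" line, an optional "Label" line, "key, value;" property lines, and a dashed separator). It does not use the Curator or Wikifier API, and the names `parse_dump` and `is_linked` are hypothetical helpers introduced for illustration.

```python
import re

# Sample copied from the output printed in this message.
SAMPLE = """\
Term from text: 'lake'
Properties:
RankerScore, 0.6546164451802008;
IsLinked, true;
SurfaceFormWikiCatAttribs, lake;
TitleWikiCatAttribs, lake wetland;
LinkerScore, 0.8995801379665003;
----------------------
Term from text: 'parks'
Label: UNMAPPED
Properties:
RankerScore, -999.0;
IsLinked, false;
SurfaceFormWikiCatAttribs, parks;
TitleWikiCatAttribs, ;
LinkerScore, -999.0;
----------------------
Term from text: 'schools'
Label: UNMAPPED
Properties:
RankerScore, -999.0;
IsLinked, false;
SurfaceFormWikiCatAttribs, school;
TitleWikiCatAttribs, ;
LinkerScore, -999.0;
----------------------
"""


def parse_dump(text):
    """Parse the printed dump into a list of dicts, one per term."""
    records = []
    # Records are separated by lines consisting only of dashes.
    for block in re.split(r"^-{5,}[ \t]*$", text, flags=re.MULTILINE):
        record = {}
        for line in block.splitlines():
            line = line.strip()
            match = re.match(r"Term from text: '(.*)'$", line)
            if match:
                record["term"] = match.group(1)
            elif line.startswith("Label:"):
                record["label"] = line.split(":", 1)[1].strip()
            elif line.endswith(";") and "," in line:
                # Property lines look like "Key, value;" (value may be empty).
                key, value = line[:-1].split(",", 1)
                record[key.strip()] = value.strip()
        if "term" in record:
            records.append(record)
    return records


def is_linked(record):
    """True when the dump reports IsLinked as 'true' for this term."""
    return record.get("IsLinked") == "true"


records = parse_dump(SAMPLE)
unmapped = [r["term"] for r in records if not is_linked(r)]
print(unmapped)
```

In the three records shown, the unlinked terms carry a RankerScore and LinkerScore of -999.0 and an empty TitleWikiCatAttribs. The sketch relies on the IsLinked field rather than on those values, since their meaning is not documented in this thread.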

Reading the paper didn't help me understand why these nouns were not mapped to Wikipedia. I would appreciate your clarification.

Thanks,
Bahar

_______________________________________________
illinois-ml-nlp-users mailing list
illinois-ml-nlp-users AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/illinois-ml-nlp-users




