Re: [Illinois-ml-nlp-users] Some questions regarding "Design Challenges and Misconceptions in Named Entity Recognition"

From: Lev Ratinov <arie.ratinov AT gmail.com>
To: Adrian van der Lek <adrian.vanderlek AT uzh.ch>, illinois-ml-nlp-users <illinois-ml-nlp-users AT cs.uiuc.edu>
Subject: Re: [Illinois-ml-nlp-users] Some questions regarding "Design Challenges and Misconceptions in Named Entity Recognition"
Date: Sun, 24 Apr 2016 15:21:40 -0400

Sorry, I left academia 5 years ago and I don't have time to answer these questions.

But I am forwarding this to illinois-ml-nlp-users AT cs.uiuc.edu; I'm sure they'll be able to direct you to the corpus and maybe answer the questions.

Best.

Peace&Love

On Sun, Apr 24, 2016 at 1:51 PM, Adrian van der Lek <adrian.vanderlek AT uzh.ch> wrote:

Dear Mr. Ratinov

I'm an undergraduate student of computational linguistics at the
University of Zurich and will be giving a presentation for an NER
seminar, specifically concerning methodological problems, where I'd like
to discuss the paper "Design Challenges and Misconceptions in Named
Entity Recognition" you co-authored. If you find the time, it would help
me if you could briefly elaborate on the following points:

- I attempted to download the webpages corpus, but the link on your
website at the University of Illinois
(https://cogcomp.cs.illinois.edu/page/resource_view/28) appears to be
dead. Is it perhaps available elsewhere? That said, I'm mainly interested
in the total number of tokens, to get an idea of the NE/non-NE ratio, so
I don't necessarily need the files.

- On p. 150, you explain why global inference over the HMM does not
capture non-local properties. However, I'm not sure I understand the
explanation. By "... NEs tend to be short chunks separated by multiple
"outside" tokens", do you mean NEs that are bracketed or subdivided by
outside tokens? In the case of the former, I'm not sure I understand why
this "breaks" the decision process. I admittedly do not have a very deep
understanding of the Viterbi algorithm, though.

- On p. 151, you detail the prediction history feature and it seems to me
that the feature is computed across predictions of the baseline NER
system discussed in section 5.2, but this is not mentioned explicitly.
Is my interpretation correct?

Many thanks for your response!

Kind Regards,

Adrian van der Lek
