
illinois-ml-nlp-users - Re: [Illinois-ml-nlp-users] chunker error

illinois-ml-nlp-users AT lists.cs.illinois.edu

Subject: Support for users of CCG software closed 7-27-20




  • From: "Sammons, Mark" <mssammon AT illinois.edu>
  • To: changwen yang <ychangwen AT hotmail.com>, "illinois-ml-nlp-users AT cs.uiuc.edu" <illinois-ml-nlp-users AT cs.uiuc.edu>
  • Subject: Re: [Illinois-ml-nlp-users] chunker error
  • Date: Wed, 15 Oct 2014 12:59:12 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/illinois-ml-nlp-users/>
  • List-id: Support for users of CCG software <illinois-ml-nlp-users.cs.uiuc.edu>

Hi, Changwen.

This behavior occurs because the Chunker is more or less a pure machine-learned model and will not predict a chunk unless the extracted features give it sufficient evidence. It can fail to predict one because of out-of-vocabulary words, unusual word sequences, or tokenization errors. The words "and", "but" and "not" behave like logical connectives, and it would be possible to add heuristics to handle them; we do have a team working on improving our Chunker, and I will pass your observations on to them.
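
As a rough illustration of the kind of heuristic mentioned above, the sketch below gives unchunked connectives a label of their own after chunking. It is not part of the Illinois Chunker: the function name `label_connectives`, the `CONJ` label, and the `(token, label)` input format are all assumptions made for this example.

```python
# Hypothetical post-processing heuristic; not the Illinois Chunker's API.
CONNECTIVES = {"and", "or", "but", "not"}

def label_connectives(tagged, label="CONJ"):
    """Give unchunked connectives their own label.

    tagged: list of (token, chunk_label) pairs, where chunk_label is None
    for tokens the chunker left outside every chunk (assumed format).
    Tokens that already have a label are returned unchanged.
    """
    return [
        (tok, label if lab is None and tok.lower() in CONNECTIVES else lab)
        for tok, lab in tagged
    ]

example = [("a", None), ("clockmaker", None), ("but", None),
           ("George", "NP"), ("followed", "VP")]
result = label_connectives(example)
print(result)
```

A rule like this only covers the connectives; tokens such as "a clockmaker" would still need separate handling, for example a rule based on part-of-speech tags.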

Thanks,

Mark


From: illinois-ml-nlp-users-bounces AT cs.uiuc.edu [illinois-ml-nlp-users-bounces AT cs.uiuc.edu] on behalf of changwen yang [ychangwen AT hotmail.com]
Sent: Tuesday, October 07, 2014 4:01 PM
To: illinois-ml-nlp-users AT cs.uiuc.edu
Subject: [Illinois-ml-nlp-users] chunker error

Dear Sir:
 
I tested your chunker and found that some words were not chunked, for example "a clockmaker but" in the output below. I also found that
the words "and", "or", "not" and "but" were never chunked. I would like to know why this happens and how to fix the problem.
 
Thanks and hope to hear from you soon.
 
Paul 
 
 [NP  His bother also called Daniel] , [VP  became]  a clockmaker but [NP  George]  [VP  followed]  [NP  his father trade] , [VP  taking]  [PP  over]  [NP  the firm]  [PP  on]  [NP  NP]  [NP  his death]  [PP  in]  [NP  1840]
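
To see exactly which tokens fall outside every chunk, output in the bracketed format above can be parsed with a short script. This is a minimal sketch that assumes each chunk is written as `[LABEL tok tok ...]` with no nesting; the helper names `parse_chunks` and `unchunked_tokens` are made up for this example.

```python
import re

# Matches either a bracketed chunk "[LABEL tok tok ...]" or one bare token.
CHUNK_OR_TOKEN = re.compile(r"\[(\w+)\s+([^\]]*)\]|(\S+)")

def parse_chunks(line):
    """Return (label, tokens) pairs; label is None for unchunked tokens."""
    spans = []
    for m in CHUNK_OR_TOKEN.finditer(line):
        if m.group(1) is not None:
            spans.append((m.group(1), m.group(2).split()))
        else:
            spans.append((None, [m.group(3)]))
    return spans

def unchunked_tokens(line):
    """List the tokens that are not inside any bracketed chunk."""
    return [toks[0] for label, toks in parse_chunks(line) if label is None]

output = ("[NP  His bother also called Daniel] , [VP  became]  a clockmaker but "
          "[NP  George]  [VP  followed]  [NP  his father trade] , [VP  taking]  "
          "[PP  over]  [NP  the firm]  [PP  on]  [NP  NP]  [NP  his death]  "
          "[PP  in]  [NP  1840]")
print(unchunked_tokens(output))  # [',', 'a', 'clockmaker', 'but', ',']
```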


