
illinois-ml-nlp-users - [Illinois-ml-nlp-users] Illinois NE Tagger - output format

illinois-ml-nlp-users AT lists.cs.illinois.edu

Subject: Support for users of CCG software closed 7-27-20

  • From: Nikos Papasarantopoulos <npapasa AT ilsp.gr>
  • To: illinois-ml-nlp-users AT cs.uiuc.edu
  • Subject: [Illinois-ml-nlp-users] Illinois NE Tagger - output format
  • Date: Wed, 7 May 2014 14:53:44 +0300
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/illinois-ml-nlp-users/>
  • List-id: Support for users of CCG software <illinois-ml-nlp-users.cs.uiuc.edu>

Hi all,

Is there any option or configuration that makes the NE server leave the input text intact in its output?
I have some files that I want to run NER on, but they contain a lot of whitespace and punctuation. In the NER output, the whitespace has been collapsed and spaces have been inserted around the punctuation marks.
For every recognized entity, I need its offset, that is, the number of characters from the beginning of the original document to the start of the entity. This is why the document layout must not change.
The online demo, for example, returns text with the same layout as the original.
Is there a configuration setting that would give me the same behavior?
Thank you in advance.

N.P.
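
Independently of any tagger configuration, one possible workaround is to map the tagger's tokens back onto the original text after the fact. The following is only a minimal sketch under stated assumptions: it assumes the tagged output has already been parsed into an ordered token list and a list of entity token ranges (the names `align_tokens`, `entity_offsets`, and the `(first, last, label)` tuple layout are hypothetical and not part of the Illinois NE Tagger), and it assumes the tokenizer only changes whitespace and never rewrites characters.

```python
from typing import List, Tuple


def align_tokens(original: str, tokens: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets in `original` for each token.

    Tokens are searched left to right, so repeated tokens map to
    successive occurrences rather than always to the first one.
    """
    spans = []
    cursor = 0
    for tok in tokens:
        start = original.find(tok, cursor)
        if start == -1:
            # Happens if the tokenizer rewrote characters (an assumption violation).
            raise ValueError(f"token {tok!r} not found after offset {cursor}")
        end = start + len(tok)
        spans.append((start, end))
        cursor = end
    return spans


def entity_offsets(
    original: str,
    tokens: List[str],
    entities: List[Tuple[int, int, str]],
) -> List[Tuple[int, int, str]]:
    """Convert (first_token, last_token, label) ranges to character offsets."""
    spans = align_tokens(original, tokens)
    return [(spans[first][0], spans[last][1], label) for first, last, label in entities]


# Hypothetical example: irregular whitespace and punctuation in the input.
original = "Nikos   visited  Athens ,Greece."
tokens = ["Nikos", "visited", "Athens", ",", "Greece", "."]
entities = [(0, 0, "PER"), (2, 2, "LOC"), (4, 4, "LOC")]

result = entity_offsets(original, tokens, entities)
for start, end, label in result:
    print(label, start, end, repr(original[start:end]))
```

Because the offsets index into the untouched original string, the document layout never needs to survive the tagger. If the tokenizer normalizes characters (quotation marks, for instance), the exact-match search raises an error and a more tolerant alignment would be needed.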


