illinois-ml-nlp-users AT lists.cs.illinois.edu

Subject: Support for users of CCG software closed 7-27-20

Re: [Illinois-ml-nlp-users] Wikifier Configuration

From: Lev-Arie Ratinov <arie.ratinov AT gmail.com>
To: Dat NguyenBa <datnb.nguyen AT gmail.com>, Mark Sammons <mssammon AT illinois.edu>, illinois-ml-nlp-users <illinois-ml-nlp-users AT cs.uiuc.edu>
Subject: Re: [Illinois-ml-nlp-users] Wikifier Configuration
Date: Wed, 5 Feb 2014 11:02:58 -0500
List-archive: <http://lists.cs.uiuc.edu/pipermail/illinois-ml-nlp-users/>
List-id: Support for users of CCG software <illinois-ml-nlp-users.cs.uiuc.edu>

Sorry, I left the university over 2 years ago. That remote service was developed by someone else. I've CCed the UIUC ML-NLP mailing list and Mark Sammons, who's the main man there.

As to the results: please check whether the annotations themselves make sense. Compare what's returned by the service with what's returned by the online demo. Run all sorts of sanity tests on it. Then include your findings in the next email.
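
One way such a comparison could look, as a minimal sketch only: it assumes the annotations from the service and from the online demo have each been collected by hand into a map from mention character offset to the Wikipedia title that was returned. The class name, the method name, and the sample titles are all hypothetical and are not part of the Wikifier distribution.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for a sanity check; not part of the Wikifier code.
final class AnnotationComparison {

    /**
     * Returns the character offsets where the two outputs differ,
     * including offsets that only one of the two sides annotated.
     * Assumes neither map contains null titles.
     */
    static List<Integer> disagreements(Map<Integer, String> service,
                                       Map<Integer, String> demo) {
        List<Integer> diff = new ArrayList<Integer>();
        // Offsets annotated by the service: title differs or demo has none.
        for (Map.Entry<Integer, String> e : service.entrySet()) {
            if (!e.getValue().equals(demo.get(e.getKey()))) {
                diff.add(e.getKey());
            }
        }
        // Offsets annotated only by the demo.
        for (Integer offset : demo.keySet()) {
            if (!service.containsKey(offset)) {
                diff.add(offset);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        Map<Integer, String> service = new LinkedHashMap<Integer, String>();
        service.put(0, "Chicago");
        service.put(25, "Illinois");

        Map<Integer, String> demo = new LinkedHashMap<Integer, String>();
        demo.put(0, "Chicago");
        demo.put(25, "University_of_Illinois");
        demo.put(60, "Urbana,_Illinois");

        System.out.println("Offsets that disagree: " + disagreements(service, demo));
    }
}
```

Looking at the offsets that disagree, document by document, should show quickly whether the service is set up differently from the demo or whether the mentions themselves are misaligned.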

Please take me out of this thread.

Best.

Peace&Love

On Wed, Feb 5, 2014 at 3:57 AM, Dat NguyenBa <datnb.nguyen AT gmail.com> wrote:

Hi Lev,

Thanks a lot for your quick answer!

Could I please ask one more question? I tried to use Wikifier as a service from my code, as follows:

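import java.rmi.RemoteException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Vector;
// Imports for the Wikifier, Curator, and project-specific classes go here.

public class IllinoisWikifierService implements IllinoisWikifierServiceRemote {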

InferenceEngine inference;

public IllinoisWikifierService(){
    try {
      ParametersAndGlobalVariables.loadConfig("./Config/Demo_Config_Deployed");
      ReferenceAssistant
          .initCategoryAttributesData(ParametersAndGlobalVariables.pathToTitleCategoryKeywordsInfo);
//      inference = new InferenceEngine(false);
      System.out.println("Started Illinois Wikifier");
    } catch (Exception e) {
      e.printStackTrace();
    }
}

public Map<ResultMention, ResultEntity> disambiguate(String docId,
      String text, List<Mention> mentions) throws RemoteException {
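    // Declared before the try block so that the final return statement compiles.
    Map<ResultMention, ResultEntity> results = new HashMap<ResultMention, ResultEntity>();
    try {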
      DisambiguationProblem problem = null;
      Map<Integer, Mention> offset2mention = new HashMap<Integer, Mention>();
      TextAnnotation ta = ParametersAndGlobalVariables.curator.getTextAnnotation(text);


      Vector<ReferenceInstance> refs = new Vector<ReferenceInstance>();
      int i = 0;
      for (Mention m : mentions) { // extracted from a file in folder OriginalTextsWithAnnotations
        ReferenceInstance ref = new ReferenceInstance();
        ref.surfaceForm = m.getMention();
        ref.characterOffset = m.getCharOffset();
        ref.characterLength = m.getCharLength();
        refs.add(ref);
        offset2mention.put(m.getCharOffset(), m);

      }

      problem = new DisambiguationProblem(docId, ta, refs);
      inference = new InferenceEngine(false);
      inference.annotate(problem, null, false, false, 0);
      //        String wikificationString = problem.wikificationString(false);

      System.out.println("INFERENCE DONE: " + problem.components.size() + " mentions.");
      i = 0;
      for (WikifiableEntity e : problem.components) {
        // println takes a single argument, so the title and score are concatenated.
        System.out.println(
            e.topDisambiguation.wikiData.basicTitleInfo.getTitleSurfaceForm()
            + " " + e.linkerScore);
        Mention m = offset2mention.get(e.startOffsetCharsInText);
        if (m == null) {
          System.out.println("No mention for " + e);
        } else {
          System.out.println("adding annotation: "
              + e.topDisambiguation.wikiData.basicTitleInfo.getTitleId()
              + " linkerScore:" + e.linkerScore + " normLinkerScore:"
              + (((float) e.linkerScore + 3) / 7) + " rankerscore:"
              + e.topDisambiguation.rankerScore);
        }
      }
    } catch (Exception e) {
      e.printStackTrace();
    }

    return results;
}
}
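
The normalized linker score printed above, (linkerScore + 3) / 7, can be checked in isolation with a small sketch like the one below. It assumes the raw score lies roughly in [-3, 4], which is only what the formula itself implies and is not confirmed anywhere; the class name is hypothetical, and the clamping to [0, 1] is an extra that the service code does not do.

```java
// Hypothetical helper illustrating the normalization from the service code.
final class LinkerScoreNormalizer {

    /**
     * Maps a raw linker score to [0, 1] with (score + 3) / 7.
     * Assumption: raw scores fall roughly in [-3, 4]; values outside
     * that range are clamped, which the service code does not do.
     */
    static float normalize(double linkerScore) {
        float scaled = ((float) linkerScore + 3) / 7;
        return Math.max(0f, Math.min(1f, scaled));
    }

    public static void main(String[] args) {
        System.out.println(normalize(-3.0)); // lower end of the assumed range
        System.out.println(normalize(0.5));  // midpoint of the assumed range
        System.out.println(normalize(4.0));  // upper end of the assumed range
    }
}
```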

Basically, I followed the sample in the Readme file. However, the results on the test data are not very high, around 78%.

Did I do something wrong?

Best,
Dat

On Wed, Feb 5, 2014 at 1:39 AM, Lev-Arie Ratinov <arie.ratinov AT gmail.com> wrote:

Hi Dat. Thanks for the interest. I actually think that a config whose name contains "semantic and lexical" is the best. However, the demo config is almost as good. I think the demo uses only the lexical info.

Best.

Peace&Love

On Tue, Feb 4, 2014 at 11:28 AM, Dat NguyenBa <datnb.nguyen AT gmail.com> wrote:

Hi Lev,

I would like to ask a question about Wikifier's configuration. Could you tell me which config file is the best one to use? Is it the "Demo_Config_Deployed" file, as the Readme recommends?

Cheers,
Dat
