
illinois-ml-nlp-users - Re: [Illinois-ml-nlp-users] Wikifier Configuration

illinois-ml-nlp-users AT lists.cs.illinois.edu

Subject: Support for users of CCG software closed 7-27-20

  • From: Dat NguyenBa <datnb.nguyen AT gmail.com>
  • To: Xiao Cheng <cheng88 AT illinois.edu>
  • Cc: illinois-ml-nlp-users <illinois-ml-nlp-users AT cs.uiuc.edu>
  • Subject: Re: [Illinois-ml-nlp-users] Wikifier Configuration
  • Date: Wed, 5 Feb 2014 20:35:22 +0100
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/illinois-ml-nlp-users/>
  • List-id: Support for users of CCG software <illinois-ml-nlp-users.cs.uiuc.edu>

Hi Xiao,

Thanks a lot for your email and the clear explanations. I highly appreciate it!
Yes, please let me know when you have the new version.

Best,
Dat


On Wed, Feb 5, 2014 at 7:54 PM, Xiao Cheng <cheng88 AT illinois.edu> wrote:
Hi Dat,

I spotted two potential problems:

1. The demo configuration is designed for web use (where redirects do not matter). For evaluation, please copy your configuration file to a new one and make sure the new file has the following parameters:

pathToModels=/WikiData/Models/TitleMatchPlusLexicalPlusCoherence
pathToEvaluationRedirectsData=/WikiData/RedirectsForEvaluationData/RedirectsAug2010.txt

useLexicalFeaturesNaive=false
useLexicalFeaturesReweighted=true

If you still have problems, please attach the configuration file you were using.
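
The four settings in item 1 can be checked mechanically before an evaluation run. The following is a minimal sketch, assuming the configuration file is plain key=value text (as the lines above suggest) that java.util.Properties can parse. EvalConfigCheck is a hypothetical helper and is not part of the Wikifier distribution.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of the Wikifier distribution.
class EvalConfigCheck {

  // The four settings quoted in the email and the values recommended for evaluation.
  static Map<String, String> expected() {
    Map<String, String> m = new LinkedHashMap<String, String>();
    m.put("pathToModels", "/WikiData/Models/TitleMatchPlusLexicalPlusCoherence");
    m.put("pathToEvaluationRedirectsData",
        "/WikiData/RedirectsForEvaluationData/RedirectsAug2010.txt");
    m.put("useLexicalFeaturesNaive", "false");
    m.put("useLexicalFeaturesReweighted", "true");
    return m;
  }

  // Returns one message per setting that is missing or differs; an empty list means all four match.
  // Assumption: the config text is key=value, one setting per line.
  static List<String> check(String configText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(configText));
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    List<String> problems = new ArrayList<String>();
    for (Map.Entry<String, String> e : expected().entrySet()) {
      String actual = props.getProperty(e.getKey());
      if (actual == null) {
        problems.add("missing: " + e.getKey());
      } else if (!actual.trim().equals(e.getValue())) {
        problems.add("unexpected value for " + e.getKey() + ": " + actual.trim());
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    // A deliberately incomplete config, used only to show the kind of report produced.
    String config = "pathToModels=/WikiData/Models/TitleMatchPlusLexicalPlusCoherence\n"
        + "useLexicalFeaturesNaive=true\n";
    for (String problem : check(config)) {
      System.out.println(problem);
    }
  }
}
```

In practice the text passed to check would be read from the copied configuration file rather than built inline.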

2. The service code doesn't interpret the output correctly:
A. There is no reason to normalize the linker score as "(((float) e.linkerScore + 3) / 7)"
B. When e.linkerScore<0, you should discard the e.topDisambiguationCandidate and interpret the output as "This mention refers to something outside of Wikipedia". Therefore, please use the output from e.finalDisambiguationCandidate instead (it could be null).
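
The rule in point B can be spelled out as a small decision function. This is only a sketch under stated assumptions: LinkDecision and resolve are hypothetical names, the arguments stand in for the top candidate's title and e.linkerScore, and none of the real Wikifier classes are used. If e.finalDisambiguationCandidate already reflects this rule, as the explanation above suggests, the sketch just makes the intended interpretation explicit.

```java
// Hypothetical stand-in for the fields discussed above; the real Wikifier classes are not used here.
class LinkDecision {

  // Returns the title to link to, or null when the mention should be read as
  // "refers to something outside of Wikipedia" (no candidate, or a negative linker score).
  // The raw linker score is compared against 0 directly, with no rescaling.
  static String resolve(String topCandidateTitle, double linkerScore) {
    if (topCandidateTitle == null || linkerScore < 0) {
      return null;
    }
    return topCandidateTitle;
  }

  public static void main(String[] args) {
    // Illustrative values only; they are not real Wikifier output.
    System.out.println(resolve("Chicago", 1.25));
    System.out.println(resolve("Chicago", -0.4));
  }
}
```

An evaluation loop would then count a null result as a "not in Wikipedia" prediction rather than scoring the discarded top candidate.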

We are moving to a newer version sometime this month. If you can wait, I would recommend using the new version, as it will contain up-to-date Wikipedia information.

Thanks,

Xiao


On Wednesday, February 5, 2014, Lev-Arie Ratinov <arie.ratinov AT gmail.com> wrote:
Sorry, I left the university over 2 years ago. That remote service was developed by someone else. I've CCed the UIUC ML-NLP mailing list and Mark Sammons, who's the main man there.

As to results - pls take a look at whether the annotations themselves make sense. Compare what's returned by the service with what's returned by the online demo. Run all sorts of sanity tests on it. Then include your findings in the next email.

Pls take me out of this thread.

Best.



Peace&Love


On Wed, Feb 5, 2014 at 3:57 AM, Dat NguyenBa <datnb.nguyen AT gmail.com> wrote:
Hi Lev,

Thanks a lot for your quick answer!

Could I please ask one more question? I tried to use Wikifier as a service from my own code, as follows:

public class IllinoisWikifierService implements IllinoisWikifierServiceRemote {

  InferenceEngine inference;

  public IllinoisWikifierService(){
    try {
      ParametersAndGlobalVariables.loadConfig("./Config/Demo_Config_Deployed");
      ReferenceAssistant
          .initCategoryAttributesData(ParametersAndGlobalVariables.pathToTitleCategoryKeywordsInfo);
//      inference = new InferenceEngine(false);
      System.out.println("Started Illinois Wikifier");
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  public Map<ResultMention, ResultEntity> disambiguate(String docId,
      String text, List<Mention> mentions) throws RemoteException {
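    // Declared outside the try block so that the return statement below can see it.
    Map<ResultMention, ResultEntity> results = new HashMap<ResultMention, ResultEntity>();
    try {
      DisambiguationProblem problem = null;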
      Map<Integer, Mention> offset2mention = new HashMap<Integer, Mention>();
      TextAnnotation ta = ParametersAndGlobalVariables.curator.getTextAnnotation(text);

     
      Vector<ReferenceInstance> refs = new Vector<ReferenceInstance>();
      int i = 0;
      for (Mention m : mentions) { // extracted from a file in folder OriginalTextsWithAnnotations
        ReferenceInstance ref = new ReferenceInstance();
        ref.surfaceForm = m.getMention();
        ref.characterOffset = m.getCharOffset();
        ref.characterLength = m.getCharLength();
        refs.add(ref);
        offset2mention.put(m.getCharOffset(), m);
       
      }

      problem = new DisambiguationProblem(docId, ta, refs);
      inference = new InferenceEngine(false);
      inference.annotate(problem, null, false, false, 0);
      //        String wikificationString = problem.wikificationString(false);

      System.out.println("INFERENCE DONE: " + problem.components.size() + " mentions.");
      i = 0;
      for (WikifiableEntity e : problem.components) {
        // println takes a single argument, so the title and score are concatenated.
        System.out.println(
            e.topDisambiguation.wikiData.basicTitleInfo.getTitleSurfaceForm()
            + " " + e.linkerScore);
        Mention m = offset2mention.get(e.startOffsetCharsInText);
        if (m == null) {
          System.out.println("No mention for " + e);
        } else {
           System.out.println("adding annotation: "
              + e.topDisambiguation.wikiData.basicTitleInfo.getTitleId()
              + " linkerScore:" + e.linkerScore + " normLinkerScore:"
              + (((float) e.linkerScore + 3) / 7) + " rankerscore:"
              + e.topDisambiguation.rankerScore);
        }
      }   
    } catch (Exception e) {
      e.printStackTrace();
    }
   
    return results;
  }
}

Basically, I followed the sample in the Readme file. However, the results on the test data are not very high, at around 78%.

Did I do something wrong?

Best,
Dat


On Wed, Feb 5, 2014 at 1:39 AM, Lev-Arie Ratinov <arie.ratinov AT gmail.com> wrote:
Hi Dat. Thanks for the interest. I actually think that something containing "semantic and lexical" is the best. However, the demo config is almost as good. I think the demo uses only the lexical info.

Best.

Peace&Love


On Tue, Feb 4, 2014 at 11:28 AM, Dat NguyenBa <datnb.nguyen AT gmail.com> wrote:
Hi Lev,

I would like to ask a question about Wikifier's configuration. Could you tell me which config file is the best one to use? Is it the "Demo_Config_Deployed" file, as the Readme recommends?

Cheers,
Dat






