com.textrazor.annotations
Class Entity

java.lang.Object
  extended by com.textrazor.annotations.Annotation
      extended by com.textrazor.annotations.Entity

public class Entity
extends Annotation


Constructor Summary
Entity()
           
 
Method Summary
 double getConfidenceScore()
           
 java.util.List<java.lang.String> getDBPediaTypes()
           
 int getEndingPos()
           
 java.lang.String getEntityEnglishId()
           
 java.lang.String getEntityId()
           
 java.lang.String getFreebaseId()
           
 java.util.List<java.lang.String> getFreebaseTypes()
           
 int getId()
           
 java.lang.String getMatchedText()
           
 java.util.List<java.lang.Integer> getMatchingTokens()
           
 java.util.List<Word> getMatchingWords()
           
 double getRelevanceScore()
           
 int getStartingPos()
           
 java.util.List<java.lang.String> getType()
           
 java.lang.String getWikiLink()
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Entity

public Entity()

Method Detail

getMatchingWords

public java.util.List<Word> getMatchingWords()
Returns:
List of the Word objects that make up this entity.

getId

public int getId()
Returns:
The ID of this annotation.

getEntityId

public java.lang.String getEntityId()
Returns:
the ID for this entity, or null if this entity could not be disambiguated. This ID is from the localized Wikipedia for this document's language.

getFreebaseId

public java.lang.String getFreebaseId()
Returns:
the disambiguated Freebase ID for this entity, or null if either this entity could not be disambiguated, or a Freebase link doesn't exist.

getWikiLink

public java.lang.String getWikiLink()
Returns:
the full canonical link to Wikipedia for this entity, or null if either this entity could not be disambiguated or a Wikipedia link doesn't exist.

getMatchedText

public java.lang.String getMatchedText()
Returns:
the source text string that matched this entity.

getStartingPos

public int getStartingPos()
Returns:
The start offset in the input text for this entity. Note that TextRazor treats each multi-byte UTF-8 character as a single position.

getEndingPos

public int getEndingPos()
Returns:
The end offset in the input text for this entity. Note that TextRazor treats each multi-byte UTF-8 character as a single position.
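
Because a Java String is indexed by UTF-16 code units, offsets that count each character as one position may not line up with String.substring indices once the text contains characters outside the Basic Multilingual Plane. The sketch below shows one way to slice the input text under two assumptions that this page does not confirm: that the offsets count Unicode code points, and that the end offset is exclusive. The class and method names are hypothetical.

```java
public class EntitySpanSketch {

    // Assumption: start and end are code point offsets into text,
    // with end exclusive. Neither detail is confirmed by the API docs.
    static String sliceByCodePoints(String text, int start, int end) {
        // Convert code point offsets into UTF-16 indices before slicing.
        int from = text.offsetByCodePoints(0, start);
        int to = text.offsetByCodePoints(from, end - start);
        return text.substring(from, to);
    }

    public static void main(String[] args) {
        // The emoji occupies two UTF-16 code units but one code point.
        String text = "I \uD83D\uDE00 Paris";
        // In real use the offsets would come from
        // entity.getStartingPos() and entity.getEndingPos().
        System.out.println(sliceByCodePoints(text, 4, 9));
    }
}
```

Comparing the slice against getMatchedText() on a few real responses is a quick way to check whether these assumptions hold for your data.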

getFreebaseTypes

public java.util.List<java.lang.String> getFreebaseTypes()
Returns:
List of Freebase types for this entity, or an empty list if there are none.

getDBPediaTypes

public java.util.List<java.lang.String> getDBPediaTypes()
Returns:
List of DBPedia types for this entity, or an empty list if there are none.

getRelevanceScore

public double getRelevanceScore()
Returns:
The relevance of this entity to the source text, as a value on a scale of 0 to 1, with 1 being the most relevant. Relevance is determined by the contextual similarity between the entity's context and facts in the TextRazor knowledge base.

getConfidenceScore

public double getConfidenceScore()
Returns:
The confidence that TextRazor is correct that this is a valid entity. TextRazor uses an ever-increasing number of signals to help spot valid entities, all of which contribute to this score. These include the contextual agreement between the words in the source text and our knowledge base, agreement between other entities in the text, agreement between the expected entity type and context, and prior probabilities of having seen this entity across Wikipedia and other web datasets. The score ranges from 0.5 to 10, with 10 representing the highest confidence that this is a valid entity.
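
The two scores use different scales (relevance 0 to 1, confidence 0.5 to 10), so a filter needs a separate threshold for each. The following is a minimal sketch of such a filter. ScoredEntity is a hypothetical stand-in that carries only the values returned by getMatchedText(), getRelevanceScore() and getConfidenceScore(), so the example runs without the TextRazor library. The thresholds and sample values are arbitrary illustrations, not recommended settings or real results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EntityScoreFilterSketch {

    // Hypothetical stand-in for Entity, holding only the fields used here.
    static final class ScoredEntity {
        final String matchedText;
        final double relevanceScore;   // documented scale: 0 to 1
        final double confidenceScore;  // documented scale: 0.5 to 10

        ScoredEntity(String matchedText, double relevanceScore, double confidenceScore) {
            this.matchedText = matchedText;
            this.relevanceScore = relevanceScore;
            this.confidenceScore = confidenceScore;
        }
    }

    // Keeps the matched text of entities that meet both thresholds.
    static List<String> keepStrong(List<ScoredEntity> entities,
                                   double minRelevance, double minConfidence) {
        List<String> kept = new ArrayList<>();
        for (ScoredEntity e : entities) {
            if (e.relevanceScore >= minRelevance && e.confidenceScore >= minConfidence) {
                kept.add(e.matchedText);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample values for illustration only.
        List<ScoredEntity> entities = Arrays.asList(
                new ScoredEntity("Paris", 0.82, 6.4),
                new ScoredEntity("Monday", 0.10, 0.9),
                new ScoredEntity("Seine", 0.55, 1.2));
        System.out.println(keepStrong(entities, 0.5, 2.0));
    }
}
```

With real Entity objects, the same comparison would read the values through getRelevanceScore() and getConfidenceScore().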

getMatchingTokens

public java.util.List<java.lang.Integer> getMatchingTokens()
Returns:
List of the word positions in the current sentence that make up this entity.

getType

public java.util.List<java.lang.String> getType()
Returns:
List of DBPedia types for this entity, or an empty list if there are none.

getEntityEnglishId

public java.lang.String getEntityEnglishId()
Returns:
The disambiguated entity ID in the English Wikipedia, where a link between the localized and English IDs could be found. Null if the entity could not be linked or no language link exists.
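
Since both getEntityEnglishId() and getEntityId() can return null, code that groups entities across documents in different languages needs a fallback order. The sketch below shows one possible order: the English ID first, then the localized ID, then the lowercased matched text. The method name, the key prefixes and the fallback order are all assumptions made for illustration and are not part of the TextRazor API.

```java
import java.util.Locale;

public class EntityKeySketch {

    // Hypothetical helper: builds a grouping key from values that would
    // come from getEntityEnglishId(), getEntityId() and getMatchedText().
    static String groupingKey(String englishId, String localizedId, String matchedText) {
        if (englishId != null) {
            return "en:" + englishId;        // shared across languages
        }
        if (localizedId != null) {
            return "local:" + localizedId;   // only comparable within one language
        }
        // The entity was not disambiguated, so fall back to the surface text.
        return "text:" + matchedText.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(groupingKey("London", "Londres", "Londres"));
        System.out.println(groupingKey(null, "Londres", "Londres"));
        System.out.println(groupingKey(null, null, "Londres"));
    }
}
```

The prefixes keep keys from different sources apart, so a surface string never collides with a Wikipedia ID that happens to have the same spelling.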