Entity Enrichment
TextRazor can enrich each entity instance it identifies with structured data from various linked data sources.
Sometimes your application needs more than just Entity IDs and labels from the mentions identified in your content.
We index billions of facts including place geolocation information, multilingual descriptions, birth dates and much more. The engine knows, for example, that London in England is the same concept as en.wikipedia.org/wiki/London, or /m/04jpl in Freebase. We can use this information to look up extra relevant, targeted data for you to use in your application. We take care of the hefty dataset cleaning, indexing and update process for you, allowing you to effortlessly build on real-world data.
The TextRazor API allows you to add simple queries to your requests to help extract only the specific information you need.
TextRazor's client SDKs make it easy to use Entity Enrichment in your application. The official TextRazor SDKs allow you to pass in multiple queries and handle parsing the response for you. Results from your queries will appear as part of the "data" field of each entity.
The REST API expects an array of entities.freebaseEnrichmentQueries strings. For more integration details, head over to the documentation for each SDK.
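As a rough illustration of the REST form, the sketch below builds a form-encoded request body in which entities.freebaseEnrichmentQueries is repeated once per query. It only constructs and decodes the body and does not contact the API. The `text` and `extractors` parameter names, and the use of a repeated parameter to express the array, are assumptions here; check the REST documentation before relying on them.

```python
from urllib.parse import urlencode, parse_qs


def build_request_body(text, enrichment_queries):
    """Form-encode a request body, repeating the query parameter per query.

    The "text" and "extractors" names are assumptions for illustration;
    only "entities.freebaseEnrichmentQueries" comes from the text above.
    """
    params = [
        ("text", text),              # assumed parameter name
        ("extractors", "entities"),  # assumed parameter name and value
    ]
    # Assumption: an array is sent by repeating the same parameter name.
    params += [
        ("entities.freebaseEnrichmentQueries", query)
        for query in enrichment_queries
    ]
    return urlencode(params)


queries = [
    "fbase:/location/location/geolocation>/location/geocode/latitude",
    "fbase:/location/location/geolocation>/location/geocode/longitude",
]
body = build_request_body("London is the capital of England.", queries)

# Decoding the body again shows both queries survive URL encoding intact.
decoded = parse_qs(body)
print(decoded["entities.freebaseEnrichmentQueries"])
```

Characters such as ':', '/' and '>' are percent-encoded in the body, so it is safer to let a URL library handle the encoding than to concatenate the strings by hand.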
Queries use the same format as found on the Freebase website or API. Each Freebase object has a number of links to other objects - for example on http://www.freebase.com/m/04jpl?links= you can see the "London" entity is linked to thousands of other facts by "predicates". TextRazor's enrichment queries allow you to specify several of these predicates to extract the information you need in your application.
Each query consists of a source prefix, fbase:, and one or more predicates separated by a '>'. Where multiple predicates are specified, TextRazor will follow the results of each subquery to reach the final answer. You may need to follow several links to get the exact data you need. For example, in Freebase geolocation information is stored in a separate object for each entity. The query fbase:/location/location/geolocation will return a Freebase MID. TextRazor can expand this to a full longitude with the query fbase:/location/location/geolocation>/location/geocode/longitude
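The query format described above can be sketched as a pair of small helpers: one joins a chain of predicates into a query string, the other splits a query back into its predicates. `build_query` and `split_query` are hypothetical names used for illustration and are not part of any TextRazor SDK.

```python
SOURCE_PREFIX = "fbase:"


def build_query(predicates):
    """Join one or more predicates into a query with the source prefix."""
    if not predicates:
        raise ValueError("at least one predicate is required")
    return SOURCE_PREFIX + ">".join(predicates)


def split_query(query):
    """Split a query string back into its ordered list of predicates."""
    if not query.startswith(SOURCE_PREFIX):
        raise ValueError("query must start with " + SOURCE_PREFIX)
    return query[len(SOURCE_PREFIX):].split(">")


# A single predicate: returns the linked geolocation object itself.
geolocation = build_query(["/location/location/geolocation"])

# Two predicates: follow the geolocation link, then read its longitude.
longitude = build_query([
    "/location/location/geolocation",
    "/location/geocode/longitude",
])

print(geolocation)
print(longitude)
print(split_query(longitude))
```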
Each query returns an array of results, since more than one result may match your query (for example, in the case of multilingual descriptions).
Latitude and Longitude for places | fbase:/location/location/geolocation>/location/geocode/latitude, fbase:/location/location/geolocation>/location/geocode/longitude |
Multilingual descriptions | fbase:/common/topic/description |
Entity Synonyms | fbase:/common/topic/alias |
Example images | fbase:/common/topic/image>/type/content/source>/type/content_import/uri |
Official websites | fbase:/common/topic/official_website |
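To show how the latitude and longitude queries from the table might be consumed, here is a minimal sketch that reads results out of an entity's "data" field. The dictionary shape (query string mapped to a list of result strings) and the sample values are assumptions written by hand for illustration, not real API output; `first_result` and `coordinates` are hypothetical helpers.

```python
LAT_QUERY = "fbase:/location/location/geolocation>/location/geocode/latitude"
LON_QUERY = "fbase:/location/location/geolocation>/location/geocode/longitude"


def first_result(entity, query):
    """Return the first result for a query, or None if there are none."""
    results = entity.get("data", {}).get(query, [])
    return results[0] if results else None


def coordinates(entity):
    """Return (latitude, longitude) as floats, or None if either is missing."""
    lat = first_result(entity, LAT_QUERY)
    lon = first_result(entity, LON_QUERY)
    if lat is None or lon is None:
        return None
    return float(lat), float(lon)


# Hand-written sample entity; the layout and values are assumptions.
sample_entity = {
    "entityId": "London",
    "data": {
        LAT_QUERY: ["51.5"],
        LON_QUERY: ["-0.12"],
    },
}

print(coordinates(sample_entity))
print(coordinates({"entityId": "Unknown place"}))
```

Because each query returns an array, the helper treats an empty or missing array as "no data" instead of raising, which keeps entities without geolocation from breaking the calling code.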
Feel free to get in touch to discuss the best way to get the exact data you need in your application.
TextRazor currently indexes Freebase, specifically the complete set of data available at http://www.freebase.com without the "/base/" and "/user/" bases. We find this provides the most coverage for common use cases, but if there's another linked data source that would help, please get in touch and we'll do our best to get it included.
Linked Data provided by the TextRazor API is made available under the same license as the original source. Freebase data is licensed under the Creative Commons Attribution 2.5 license.
TextRazor's enrichment database runs on a redundant SSD-backed DB cluster, ensuring queries add minimal latency to your requests. To help maintain the performance of your requests we impose several default limits:
If you require higher limits for your application, please contact support. In most cases we will happily increase the limit for you. TextRazor's indexes are frequently updated, but may not contain the latest changes from the source dataset.