TextRazor supports the automatic detection of 142 languages.
TextRazor also supports named entity recognition, entity disambiguation and linking, topic detection and taxonomy classification for the following languages:
TextRazor disambiguates and links all entities in all supported languages. Where an entity can be linked to an English equivalent both the localized and English Wikipedia IDs are returned.
TextRazor can automatically detect 142 languages using the contents of your text. The detected language is used for the appropriate processing logic, and returned to you in the TextRazor response. For short text (Tweets for example), there may not be enough context to accurately determine the language. If this is likely to be a problem you can pass "languageOverride" with the request to specify a processing language and skip the automatic detection stage.
Language | ISO-639-2 Code |
---|---|
English | eng |
Danish | dan |
Dutch | dut |
Finnish | fin |
French | fre |
German | ger |
Hebrew | heb |
Italian | ita |
Japanese | jpn |
Korean | kor |
Norwegian | nor |
Polish | pol |
Portuguese | por |
Russian | rus |
Spanish | spa |
Swedish | swe |
Chinese | chi |
Czech | cze |
Greek | gre |
Icelandic | ice |
Latvian | lav |
Lithuanian | lit |
Romanian | rum |
Hungarian | hun |
Estonian | est |
Bulgarian | bul |
Croatian | scr |
Serbian | scc |
Irish | gle |
Galician | glg |
Turkish | tur |
Ukrainian | ukr |
Hindi | hin |
Macedonian | mac |
Bengali | ben |
Indonesian | ind |
Latin | lat |
Malay | may |
Malayalam | mal |
Welsh | wel |
Nepali | nep |
Telugu | tel |
Albanian | alb |
Tamil | tam |
Belarusian | bel |
Javanese | jav |
Occitan | oci |
Urdu | urd |
Bihari | bih |
Gujarati | guj |
Thai | tha |
Arabic | ara |
Catalan | cat |
Esperanto | epo |
Basque | baq |
Interlingua | ina |
Kannada | kan |
Punjabi | pan |
Scots_Gaelic | gla |
Swahili | swa |
Slovenian | slv |
Marathi | mar |
Maltese | mlt |
Vietnamese | vie |
Frisian | fry |
Slovak | slo |
Faroese | fao |
Sundanese | sun |
Uzbek | uzb |
Amharic | amh |
Azerbaijani | aze |
Georgian | geo |
Tigrinya | tir |
Persian | per |
Bosnian | bos |
Sinhalese | sin |
Norwegian_N | nno |
Xhosa | xho |
Zulu | zul |
Guarani | grn |
Sesotho | sot |
Turkmen | tuk |
Kyrgyz | kir |
Breton | bre |
Twi | twi |
Yiddish | yid |
Somali | som |
Uighur | uig |
Kurdish | kur |
Mongolian | mon |
Armenian | arm |
Laothian | lao |
Sindhi | snd |
Rhaeto_Romance | roh |
Afrikaans | afr |
Luxembourgish | ltz |
Burmese | bur |
Khmer | khm |
Tibetan | tib |
Dhivehi | div |
Oriya | ori |
Assamese | asm |
Corsican | cos |
Interlingue | ine |
Kazakh | kaz |
Lingala | lin |
Moldavian | mol |
Pashto | pus |
Quechua | que |
Shona | sna |
Tajik | tgk |
Tatar | tat |
Tonga | tog |
Yoruba | yor |
Maori | mao |
Wolof | wol |
Abkhazian | abk |
Afar | aar |
Aymara | aym |
Bashkir | bak |
Bislama | bis |
Dzongkha | dzo |
Fijian | fij |
Greenlandic | kal |
Hausa | hau |
Inupiak | ipk |
Inuktitut | iku |
Kashmiri | kas |
Kinyarwanda | kin |
Malagasy | mlg |
Nauru | nau |
Oromo | orm |
Rundi | run |
Samoan | smo |
Sango | sag |
Sanskrit | san |
Siswant | ssw |
Tsonga | tso |
Tswana | tsn |
Volapuk | vol |
Zhuang | zha |
Ganda | lug |
Manx | glv |