Extract Machine-Readable Text Using Tesseract
Afrikaans
Amharic
Arabic
Assamese
Azerbaijani
Uzbek (Cyrillic)
Belarusian
Bengali
Tibetan Standard
Bosnian
Breton
Bulgarian
Catalan
Cebuano
Czech
Chinese - Simplified
Japanese (vertical) script
Chinese - Traditional
Cherokee
Corsican
Welsh
Danish
German
Divehi
Dzongkha
Greek
English
English, Middle (1100-1500)
Esperanto
Estonian
Basque
Faroese
Persian
Filipino
Finnish
French
German (Fraktur)
French, Middle (ca.1400-1600)
Frisian (Western)
Gaelic (Scots)
Irish
Galician
Greek, Ancient (to 1453)
Gujarati
Haitian
Hebrew
Hindi
Croatian
Hungarian
Armenian
Inuktitut
Indonesian
Icelandic
Italian
Spanish, Castilian - Old
Javanese
Japanese
Kannada
Georgian
Kazakh
Khmer
Kyrgyz
Kurmanji (Latin)
Korean
Lao
Latin
Latvian
Lithuanian
Luxembourgish
Malayalam
Marathi
Macedonian
Maltese
Mongolian
Maori
Malay
Burmese
Nepali
Dutch
Norwegian
Occitan (post 1500)
Oriya
Orientation and script detection (OSD)
Punjabi
Polish
Portuguese
Pashto
Quechua
Romanian
Russian
Sanskrit
Arabic script
Armenian script
Bengali script
Canadian Aboriginal script
Cherokee script
Devanagari script
Ethiopic script
Fraktur script
Georgian script
Greek script
Gujarati script
Gurmukhi script
Hangul script
Han - Simplified script
Han - Traditional script
Hebrew script
Japanese script
Khmer script
Kannada script
Lao script
Serbian (Latin)
Malayalam script
Myanmar script
Oriya (Odia) script
Sinhala script
Syriac script
Tamil script
Telugu script
Thaana script
Thai script
Tibetan script
Vietnamese script
Sinhala
Slovakian
Slovenian
Sindhi
Spanish
Albanian
Serbian
Sundanese
Swahili
Swedish
Syriac
Tamil
Tatar
Telugu
Tajik
Thai
Tigrinya
Tonga
Turkish
Uyghur
Ukrainian
Urdu
Uzbek
Vietnamese
Yiddish
Yoruba
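
The menu above selects the recognition language that is handed to Tesseract. As a minimal sketch of how such a choice could be passed to the `tesseract` command-line tool, the Python below maps a few display names to language codes and builds the command. The `LANGUAGE_CODES` table, the function names, and the file name `scan.png` are assumptions made for illustration, and the table covers only a small subset of the list. The codes themselves (`eng`, `deu`, `fra`, `chi_sim`) are standard Tesseract traineddata names, and the matching language data must be installed for recognition to work.

```python
import shutil
import subprocess

# Hypothetical subset of the menu above, mapped to Tesseract language codes.
LANGUAGE_CODES = {
    "English": "eng",
    "German": "deu",
    "French": "fra",
    "Chinese - Simplified": "chi_sim",
}


def build_command(image_path, languages):
    """Return the tesseract argument list that prints recognized text to stdout."""
    codes = [LANGUAGE_CODES[name] for name in languages]
    # Tesseract accepts several languages joined with "+", for example "eng+deu".
    return ["tesseract", image_path, "stdout", "-l", "+".join(codes)]


def extract_text(image_path, languages):
    """Run tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For a page that mixes English and German, `extract_text("scan.png", ["English", "German"])` would run Tesseract with `-l eng+deu`.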