KonText – an open-source corpus search engine for the Polish language


KonText is an advanced, highly customizable web service that provides a corpus query interface built on top of the core libraries of an open-source corpus search engine. Thanks to its built-in tools, users can upload and search their own text corpora, which are automatically annotated at the morphosyntactic layer.

Language corpora, which lie at the intersection of corpus linguistics and computer technology, are large collections of texts used in corpus research, applied linguistics and lexicography. They are designed so that information about the formal structure and the content of a language can be efficiently extracted, classified and verified. Applying corpus methods with appropriate tools and digital databases lets users significantly extend the scope of their research and eliminate the time-consuming processes of manual annotation and manual compilation of statistics. Examples of applications of corpus analysis include measuring the frequency of words, phrases and collocations; exploring the most common contexts in which a word or phrase occurs; examining language change over time using historical text corpora; and studying the actual use of language by its users (domain-specific corpora, foreign language corpora).
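The kinds of analysis listed above can be illustrated with a minimal sketch in plain Python. It does not use KonText or its underlying engine; the function names (`tokenize`, `word_frequencies`, `bigram_frequencies`, `kwic`) and the sample sentence are hypothetical and chosen only for illustration. The sketch counts word and two-word phrase frequencies and lists each occurrence of a keyword with its surrounding context.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def word_frequencies(tokens):
    """Count how often each word occurs."""
    return Counter(tokens)


def bigram_frequencies(tokens):
    """Count how often each pair of adjacent words occurs."""
    return Counter(zip(tokens, tokens[1:]))


def kwic(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for every hit."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


# Toy corpus, an assumption made for this example only.
sample = "The cat sat on the mat. The dog sat on the rug."
tokens = tokenize(sample)
freqs = word_frequencies(tokens)
bigrams = bigram_frequencies(tokens)
lines = kwic(tokens, "sat")

print(freqs.most_common(3))
print(bigrams.most_common(3))
for left, kw, right in lines:
    print(f"{left:>10} [{kw}] {right}")
```

A real corpus search engine adds indexing, morphosyntactic annotation and a query language on top of these basic ideas, which is what makes searches over very large collections practical.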

Bibliographic reference for the main publication (if you use KonText, please cite this publication):

Auxiliary materials:

Link to the manual

Examples of applications