Topic – a tool for topic modelling


Topic is a tool for the topic analysis of text corpora. It extracts a user-defined number of topics from a corpus. The corpus can be supplied as a .zip archive or as a link to an archive uploaded to the Clarin Cloud platform or another data repository. The results of the analysis can be viewed on the website as visualisations and downloaded as data files (json, xlsx). Visualisations of individual topics can also be downloaded as word cloud files.
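Preparing the .zip archive can be scripted. The sketch below is only an illustration: it assumes the corpus is a folder of plain-text .txt files, which the description above does not confirm, and the function name pack_corpus is hypothetical. The manual should be checked for the file formats and archive layout the tool actually accepts.

```python
import zipfile
from pathlib import Path


def pack_corpus(source_dir, archive_path):
    """Write every .txt file under source_dir into a .zip archive.

    Assumption: the corpus consists of plain-text .txt files. Returns the
    names stored in the archive, relative to source_dir.
    """
    source = Path(source_dir)
    stored = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Sort the paths so the archive contents are reproducible.
        for path in sorted(source.rglob("*.txt")):
            name = path.relative_to(source).as_posix()
            archive.write(path, arcname=name)
            stored.append(name)
    return stored
```

Calling pack_corpus("my_corpus", "my_corpus.zip") would produce an archive that can then be uploaded or placed in a repository and referenced by link.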

What is topic analysis? It is a complex operation on a collection of texts in which the programme identifies characteristic groups of words that tend to occur together across the whole set of texts. These groups are often semantically coherent, in which case we can speak of isolating a topic or theme in the classical sense of the term.
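To make the idea concrete, here is a minimal sketch of one widely used topic-modelling technique, Latent Dirichlet Allocation (LDA) fitted by collapsed Gibbs sampling. The description above does not say which algorithm Topic uses, so this stands in as an illustration only. The function names and the default values for alpha, beta and the number of iterations are assumptions chosen for the example.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of tokenised documents (lists of words).
    Returns one Counter of word counts per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # token count per topic
    doc_topic = [[0] * num_topics for _ in docs]         # topic counts per document
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            z = rng.randrange(num_topics)
            doc_assignments.append(z)
            topic_word[z][word] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                topic_word[z][word] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                # Weight each topic by how common it is in this document
                # and how strongly it is associated with this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                topic_word[z][word] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return topic_word


def top_words(word_counts, n=5):
    """Return up to n most frequent words of a topic, skipping zero counts."""
    return [word for word, _ in (+word_counts).most_common(n)]
```

Given a tokenised corpus, lda_gibbs(docs, num_topics=2) returns two word-count tables, and top_words applied to each gives the group of words that characterises that topic. Whether such a group reads as a coherent theme depends on the corpus and on the number of topics requested.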

Auxiliary materials:

Link to the manual

Examples of applications

Grouping of texts according to similarity, similarity analysis, semantic analysis of texts, genre (genological) analysis