Cat

Name

Cat – a simple text classification tool

Description

The tool classifies texts according to one of the following criteria:

(1) thematic classification using a machine-learning model based on five Wikipedia categories,
(2) thematic classification using a machine-learning model based on press topics,
(3) classification by similarity of grammatical style to that of a well-known Polish writer of the 19th or 20th century,
(4) for multilingual corpora, detection of the share of a particular language in the whole corpus.

The analysis can be run on any corpus packed as a .zip archive. Advanced classification with other models, or of large amounts of text, is also possible; for this purpose, please contact webserwisy@clarin-pl.eu.
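
Since the input has to be a .zip archive, a corpus kept as a folder of text files needs to be packed first. The following is a minimal sketch in Python using only the standard library. It assumes the corpus consists of plain-text .txt files; the page does not specify the expected file format or archive layout, so that choice, the function name pack_corpus, and the example paths are all hypothetical.

```python
import zipfile
from pathlib import Path


def pack_corpus(corpus_dir: str, archive_path: str) -> list[str]:
    """Pack every .txt file under corpus_dir into a .zip archive.

    Assumption: the corpus is a folder of plain-text .txt files.
    Returns the archive member names in sorted order.
    """
    root = Path(corpus_dir)
    members = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # rglob also picks up files in subfolders; sorting keeps the order stable.
        for path in sorted(root.rglob("*.txt")):
            # Store paths relative to the corpus folder, with forward slashes.
            arcname = path.relative_to(root).as_posix()
            zf.write(path, arcname=arcname)
            members.append(arcname)
    return members
```

For example, pack_corpus("my_corpus", "my_corpus.zip") would write my_corpus.zip next to the script and return the list of packed file names, which can be checked before uploading the archive.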

Auxiliary materials:

Link to the manual

Examples of applications