dc.description TaKIPI is a tagger for Polish: a tool that assigns morpho-syntactic tags to the words of a text. The tagger uses the morpho-syntactic descriptions of the IPI PAN Corpus tagset. Contextual disambiguation is carried out by a small set of hand-written rules and by a larger set of rules extracted automatically with the C4.5 decision tree induction algorithm. During both training and tagging, the context of each word occurrence in the text is represented as a fixed-length feature vector. This vector is computed by hand-written functional expressions in the JOSKIPI formalism, which refer to morpho-syntactic properties of the context.
Wrocław University of Technology
TaKIPI
