TermoPL

Name

TermoPL – a tool for extracting terminology

Description

TermoPL is a tool for extracting terminology from domain corpora in Polish. The program can also be used to extract multi-word expressions. Terminology extraction is useful, for instance, when preparing specialised dictionaries, building resources that support translation and summarisation, developing the ontology of a given domain, annotating documents, or supporting question answering.

The program extracts noun phrases that can be considered candidate terms characteristic of a chosen domain. To do this, TermoPL uses a simple grammar that users can adjust to their needs. It supports various tagsets and can therefore be used to extract terminology from texts in many languages. The program accepts data that has already been annotated with part-of-speech information or other morphological tags. For Polish, it can also work with unannotated text corpora; in this case, the input text is first annotated with the Concraft tool.
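To illustrate the general idea of grammar-based candidate extraction over tagged text, here is a minimal Python sketch. It is not TermoPL's grammar or implementation: the tag names `NOUN` and `ADJ`, the pattern "zero or more adjectives followed by one or more nouns", the function `extract_candidates`, and the English sample tokens are all assumptions made for the example. A real grammar for Polish would need different patterns and a full morphological tagset.

```python
from collections import Counter

# Hypothetical, simplified tags; real tagsets are much richer.
NOUN = "NOUN"
ADJ = "ADJ"


def extract_candidates(tagged_tokens):
    """Return maximal token runs matching ADJ* NOUN+ as lowercased phrases.

    tagged_tokens is an iterable of (word, tag) pairs.
    """
    candidates = []
    current = []        # words of the phrase being built
    seen_noun = False   # True once the current phrase contains a noun
    for word, tag in tagged_tokens:
        if tag == NOUN:
            current.append(word.lower())
            seen_noun = True
        elif tag == ADJ and not seen_noun:
            current.append(word.lower())
        else:
            # The pattern is broken: keep the phrase only if it had a noun.
            if seen_noun:
                candidates.append(" ".join(current))
            # An adjective that follows a noun starts a new phrase.
            current = [word.lower()] if tag == ADJ else []
            seen_noun = False
    if seen_noun:
        candidates.append(" ".join(current))
    return candidates


# Illustrative, hand-tagged input (assumed, not produced by a real tagger).
sample = [
    ("The", "DET"), ("neural", ADJ), ("network", NOUN), ("uses", "VERB"),
    ("gradient", NOUN), ("descent", NOUN), ("and", "CONJ"),
    ("neural", ADJ), ("network", NOUN),
]

# Counting how often each candidate occurs gives a crude ranking signal.
counts = Counter(extract_candidates(sample))
```

Changing the pattern inside `extract_candidates` plays the role that adjusting the grammar plays in the tool: the tagged input stays the same, while the set of phrases accepted as candidates changes.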