@PhilosTEI TICCLING PHILOSOPHY: A TEI CORPUS-BUILDING WORKFLOW TOWARDS A NEW COMPUTATIONAL METHODOLOGY FOR PHILOSOPHY

Description

The move to e-research in philosophy depends on the availability of high-quality, easily accessible corpora in a sustainable format, composed of multi-language, multi-script books from different historical periods. Corpora that meet these needs are at present virtually non-existent. This project addresses the corpus-building problem by developing and making available an open-source, web-based, user-friendly workflow from digital images of text to TEI, based on an OCRopus/Tesseract web service and a multilingual version of the OCR post-correction web service TICCLops. We shall demonstrate the tool on a multilingual, multi-script corpus of important 18th- to 20th-century European philosophical texts. These texts are of fundamental importance for understanding the development of key scientific concepts, such as explanation and truth, in Europe during that period. The tool will be of general interest and value for building CLARIN-compliant corpora.

Tool type

Processing flow

Tool task

corpus building

Key words

service, multi-lingual

Research domain

Philosophy

Language

Country

Netherlands

CLARIN centre

Huygens ING

Contact person

Dr. Arianna Betti

URL

http://www.clarin.nl/node/1404
