Integration of Historical Digital Fulltexts (15th-19th century) into the CLARIN-D Infrastructure


The curation project aims to improve the current state of German corpora from the 15th to the 19th century. To this end, digital full-text resources (and the corresponding scans of the original prints) will be identified, catalogued, characterized, and evaluated against selected quality criteria. The resources will then be gradually edited and integrated into a mirrored repository structure at BBAW, HAB, and IDS. Publishing the resources at a relatively early stage will allow them to be used and commented on soon afterwards. Subsequently, IDS and BBAW will gradually integrate the resources into the CLARIN infrastructure.
The project is expected to contribute substantially to a corpus of Early New High German and historical New High German (15th-19th century). This structured and lasting corpus will fundamentally improve the conditions for research, e.g. in historical linguistics.
In addition, implementing a sustainable infrastructure for integrating new text resources might initiate a major shift in the research community: the project seeks to establish a culture of sharing and collaborative work on text resources. The infrastructure will also serve as a long-term central repository for historical text resources.

Tool type


Tool task

corpus exploration

Key words

data, corpus

Research domain

History, Historical Linguistics, Linguistics





CLARIN centre

Berlin-Brandenburgische Akademie der Wissenschaften 

Contact person

Christian Thomas


Similar to