The main idea is to identify all geographical names in a literary text (or a corpus) and place them on a geographical map. The task goes beyond Named Entity Recognition (NER), because NER must be combined with geo-location. We use the geo-location service provided by Google, but the location proper names (PNs) recognised in the text must still be grouped into expressions that Google can resolve, in a way that allows them to be located with good accuracy. We proposed to extend the initial idea by recognising semantic relations that link non-spatial PNs in the text with location PNs, and by visualising those links on the map as well. Recognition of temporal expressions could further enrich the application. Two usage scenarios are considered: fully automated and bootstrapping. In the first, users process whole corpora of literary texts and can then analyse the collected statistical data or browse the mappings of individual texts. However, because of the limited accuracy of the system as a whole, the second scenario is more likely in research: the system serves as a supporting tool during the annotation of a corpus with mappings, and the annotations it proposes are then corrected by researchers.
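The grouping step described above can be illustrated with a minimal sketch. It is not the project's implementation: the `Mention` type, the `max_gap` heuristic, the function names, and the sample sentence are all assumptions made for illustration. The geocoder is passed in as a plain callable, so no real Google API call is shown or implied. The idea is that location names standing next to each other in the text (for example a street followed by a city) are merged into a single query, which a geo-location service is more likely to resolve unambiguously than either name alone.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    """A proper name found by NER (hypothetical structure)."""
    text: str   # surface form of the name
    kind: str   # "LOC" for location names, any other label otherwise
    start: int  # index of the first token
    end: int    # index one past the last token


def group_location_mentions(mentions, max_gap=1):
    """Merge location mentions at most `max_gap` tokens apart into one query string."""
    groups, current = [], []
    for m in sorted(mentions, key=lambda m: m.start):
        if m.kind != "LOC":
            continue  # non-spatial names are not sent to the geocoder
        if current and m.start - current[-1].end <= max_gap:
            current.append(m)  # close enough to the previous location: same group
        else:
            if current:
                groups.append(current)
            current = [m]
    if current:
        groups.append(current)
    return [", ".join(m.text for m in g) for g in groups]


def geocode_all(queries, geocode):
    """Apply a geocoder callable to each query; keep only the resolved ones."""
    resolved = {}
    for q in queries:
        coords = geocode(q)
        if coords is not None:
            resolved[q] = coords
    return resolved


# Invented sample: "He walked down Nowy Świat , Warszawa , with Wokulski
# before leaving for Paris" (token indices are assumed, not computed).
sample = [
    Mention("Nowy Świat", "LOC", 3, 5),
    Mention("Warszawa", "LOC", 6, 7),
    Mention("Wokulski", "PER", 9, 10),
    Mention("Paris", "LOC", 14, 15),
]
queries = group_location_mentions(sample)
```

Here `queries` holds one combined query for the street and city and a separate one for the distant mention. In the bootstrapping scenario, queries that the geocoder fails to resolve would simply be absent from the result of `geocode_all` and could be left for the annotator to handle manually.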
search, NE recognition, visualisation
Cultural Sciences, Literary Studies
Wrocław University of Technology, Instytut Badań Literackich Państwowej Akademii Nauk
Wojciech Gaweł, Maciej Maryl, Aleksandra Wójtowicz