The following list shows selected examples of applications, tools and resources that are available or under development within CLARIN, the pan-European research infrastructure for language technology. It includes numerous services of different types that support a variety of applications in the humanities and social sciences. The similarities and analogies indicated between applications, tools and resources reveal the potential for cooperation in creating cross-lingual services and in combining existing services and applications into more comprehensive solutions. The services are grouped into the following categories:
1. Genuine applications – online applications that require no installation or technical knowledge and that answer a specific research question;
2. Processing flows – processing chains that users build by selecting from the available tools and supplying their own research material;
3. Tools for creating own tools and resources – tools that usually require installation and varying levels of technical knowledge;
4. Resources – corpora, dictionaries, databases and wordnets that provide research material, sometimes with built-in tools for quantitative and qualitative analysis of varying complexity;
5. Tree bank tools.
Genuine applications
- ADEPT: Assaying Differences via Edit-Distance of Pronunciation Transcriptions – Gabmap
- CiNaViz: New web application for visualization of European city names
- COAVA: Cognition, Acquisition and Variation Tool
- Česílko
- GetClasS: Generalised Text Classification for Sociology
- Literary Map
- MIGMAP: Detailed interactive mapping of migration in The Netherlands in the 20th century
- Moses Web Demo
- NameScape: Mapping the Landscape of Names in Modern Dutch Literature
- Polimedia: Interlinking multimedia for the analysis of media coverage of political debates
- STYX
- System for multiword units extraction
- WAHSP/BILAND: web application for (bilingual) historical sentiment mining in public media
Processing flows
- @PhilosTEI
- TiCCLops: Text-Induced Corpus Clean-up online processing system
- TTNWW – TST Tools voor het Nederlands als Webservices in een Workflow
- WebLicht
Tools for creating own tools and resources
- AAM-LR: Automatic Annotation of Multi-modal Language Resources
- Adelheid: A Distributed Lemmatizer for Historical Dutch
- ANVIL: The video annotation research tool
- Blog-Reader
- EXMARaLDA: Extensible Markup Language for Discourse Annotation
- Feature-based tagger
- Grafon
- HMM tagger
- INPOLDER: Integrated Parser and Lemmatizer Dutch in Retrospect
- jusText
- Korektor
- Lexical Annotation Workbench (LAW)
- LEXUS
- Liner2 (NER)
- MeED
- MORČE
- MORFO
- MorphoDiTa: Morphological Dictionary and Tagger
- NameTag
- Person name recognizer
- Polish Aligner
- Polish ASR
- Polish G2P
- Polish Keyword Detection
- POS-tagger moot
- QuaMeRDES
- Saper
- Speaker Diarization
- Tagger (WCRFT2)
- TQE: Transcription Quality Evaluation
- Treex::Web
- VAD
- Vector Extractor
- Victor
- Victoria
- W2C – Web to Corpus – tool
- WebMaus: Automatic Segmentation and Labelling of Speech Signals over the Web
Resources
- Arthurian Fiction
- C4
- C-DSD: Curating the Dutch Song Database
- CKCC – Scholarly Letters
- CLAVAS: CLARIN Vocabulary Access Service
- COBWWWEB: Connections Between Women and Writings Within European Borders
- Cornetto: Combinatorial and Relational Network as Toolkit for Dutch Language Technology
- Corpus of Old Bulgarian
- COSMAS II
- Dialogy.Org
- DictGate
- DiscAN: Towards a Discourse Annotation system for Dutch language corpora
- D-LUCEA: Database of the Longitudinal Utrecht Collection of English Accents
- DSS: Dutch Ships and Seamen
- DUELME: Dutch Electronic Lexicon of Multiword Expressions
- e-BNM+: Linked Data on Middle Dutch Sources Kept Worldwide
- ElixirFM
- EMIT-X: Early-Modern Image and Text eXchange
- Fast lexical and phonetic search in the MALACH archive
- FESLI: Functional elements in Specific Language Impairment
- Gesta Danorum
- GrNe: Greek-Dutch dictionary
- Integration of Historical Digital Fulltexts (15th-19th century) into the CLARIN-D Infrastructure
- INTER-VIEWs: Curation of Interview Data
- IPROSLA: Integrating and Publishing Resources on Sign Language Acquisition
- Keeleveeb Query
- KonText Web Demo
- LAISEANG: Language Archive of Insular South East Asia and West New Guinea
- Mecmua
- MIMORE: Microcomparative Morphosyntax Research Tool
- Multilingual data
- NEHOL: Negerhollands Database
- OpenSoNaR: Online Personal Exploration and Navigation of SoNaR
- Paralela: parallel data search engine
- PILNAR: Pilgrimage Narratives – a corpus for studying the profile of the modern pilgrim
- plSentiwordnet
- plWordNet
- RemBench: A Digital Workbench for Rembrandt Research
- ROMi Multimodal Corpus of Czech as a Second Language
- SaCoCo: Saarbrücken Cookbook Corpus
- SHEBANQ System for HEBrew Text: ANnotations for Queries and Markup
- Spokes: conversational data search and exploration
- The Glossa corpus search system
- The Internet Language Reference Book
- The Place Names Database (KNAB)
- VALID – Vulnerability in Acquisition: Language Impairments in Dutch. Curating five valuable data sets
- vicav: Vienna Corpus of Arabic Varieties
- VK: Verrijkt Koninkrijk (Enriched Kingdom)
- VU-DNC: VU Diachronic Newspaper Corpus
- Vystadial
- WFT-GTB: Integrating the Wurdboek fan ˈe Fryske Taal into the Geïntegreerde TaalBank
- WIP: War in Parliament
- WordTies
Tree bank tools
- GrETEL Search for Syntactic Constructions
- LASSY Word Relations Search Web Application
- Netgraph
- PML Tree Query
- Searching complex treebanks: the PML-TQ search engine and interface
- Tree Editor TrEd
Other