dc.contributor.author Oleksy, Marcin
dc.contributor.author Dominiak, Daria
dc.contributor.author Wróż, Anita
dc.contributor.author Kobylińska, Wioleta
dc.contributor.author Kałkus, Dagmara
dc.contributor.author Zielińska, Kamila
dc.contributor.author Fikus, Dominika
dc.contributor.author Walentynowicz, Wiktor
dc.date.accessioned 2019-04-03T11:54:15Z
dc.date.available 2019-04-03T11:54:15Z
dc.date.issued 2019-04-03
dc.identifier.uri http://hdl.handle.net/11321/637
dc.description The Corpus of the Colloquial Polish Language (CCPL) is a corpus of user-generated content (UGC) tagged with morpho-syntactic features by a team of professional linguists from the Wrocław University of Technology. It consists of 400 000 tagged segments and has been used to train the UGC tagger, which is also available in the CLARIN repository. Main resources: corpus files (NCP tagset): CCPL - anonimizacja_xml_out_ver(3.05).zip; manual annotation guidelines: Specification for morphosyntactic tagging of UGC texts.pdf; corpus files (UD tagset): corpus_petrov_tags.zip
dc.language.iso pol
dc.publisher SentiOne
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label CC
dc.source.uri https://sentione.com/knowledge/eu-research-project
dc.subject corpus
dc.subject user-generated content
dc.subject colloquial style
dc.subject morpho-syntactic tagging
dc.title Corpus of the Colloquial Polish Language
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN-PL
contact.person Michał Brzezicki michal@sentione.com SentiOne
sponsor ERDF POIR.01.01.01-00-0806/16 Senti Cognitive Services euFunds
files.size 31533400
files.count 5


 Files in this item

This item is distributed under Creative Commons and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0). Attribution required.
Name: CCPL - anonimizacja_xml_out_ver(3.05).zip
Size: 7.05 MB
Format: application/zip
Description: Corpus of the Colloquial Polish Language

Name: Specification for morphosyntactic tagging of UGC texts.pdf
Size: 157.46 KB
Format: PDF
Description: Annotation guidelines

Name: anonimizacja_xml_out_ver(3.04).zip
Size: 7.49 MB
Format: application/zip
Description: Colloquial language corpus for Polish

Name: CCPL_petrov.zip
Size: 6.18 MB
Format: application/zip

Name: CCPL_rev_2.zip
Size: 9.2 MB
Format: application/zip
