
 
dc.contributor.author Kocoń, Jan
dc.contributor.author Miłkowski, Piotr
dc.contributor.author Kanclerz, Kamil
dc.date.accessioned 2021-04-05T11:04:34Z
dc.date.available 2021-04-05T11:04:34Z
dc.date.issued 2021-03-01
dc.identifier.uri http://hdl.handle.net/11321/798
dc.description MultiEmo is a new benchmark dataset for multilingual sentiment analysis covering 11 languages. The collection contains consumer reviews from four domains: medicine, hotels, products and university. The original Polish reviews comprise 8,216 documents consisting of 57,466 sentences. The reviews were manually annotated with sentiment at the level of the whole document and at the level of the sentence (3 annotators per element). We achieved a high Positive Specific Agreement value of 0.91 for texts and 0.88 for sentences. The collection was then translated automatically into English, Chinese, Italian, Japanese, Russian, German, Spanish, French, Dutch and Portuguese. MultiEmo is publicly available under a Creative Commons Attribution 4.0 International Licence.
More information: https://github.com/CLARIN-PL/multiemo
Citation:
@inproceedings{kocon2021multiemo,
  title={Multiemo: Multilingual, multilevel, multidomain sentiment analysis corpus of consumer reviews},
  author={Koco{\'n}, Jan and Mi{\l}kowski, Piotr and Kanclerz, Kamil},
  booktitle={International Conference on Computational Science},
  pages={297--312},
  year={2021},
  organization={Springer}
}
dc.language.iso pol
dc.language.iso eng
dc.language.iso zho
dc.language.iso ita
dc.language.iso jpn
dc.language.iso rus
dc.language.iso deu
dc.language.iso spa
dc.language.iso fra
dc.language.iso nld
dc.language.iso por
dc.publisher Wrocław University of Science and Technology
dc.rights The MIT License
dc.rights.uri https://opensource.org/licenses/MIT
dc.rights.label PUB
dc.source.uri https://github.com/CLARIN-PL/multiemo
dc.subject MultiEmo
dc.subject sentiment analysis
dc.subject multilingual
dc.subject benchmark dataset
dc.subject dataset
dc.subject corpus
dc.subject multidomain
dc.subject multilevel
dc.title MultiEmo: Multilingual, Multilevel, Multidomain Sentiment Analysis Corpus of Consumer Reviews
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
hidden false
hasMetadata false
has.files yes
branding CLARIN-PL
demo.uri http://ws.clarin-pl.eu/multiemo
contact.person Jan Kocoń jan.kocon@pwr.edu.pl Wrocław University of Science and Technology
sponsor Ministry of Science and Higher Education (Poland) N/A CLARIN-PL nationalFunds
size.info 82160 texts
size.info 782 MB
size.info 506 files
files.size 422783119
files.count 2


 Files in this item
 Download all files in item (403.2 MB)

This item is Publicly Available and licensed under: The MIT License
Name          Size       Format           Description
multiemo.7z   143.43 MB  Unknown          7z version
multiemo.zip  259.77 MB  application/zip  zip version
