PolEmo 1.0 + MultiEmo-Test 1.0 Multilingual Sentiment Analysis Dataset for KES2020

Kocoń, Jan; Kanclerz, Kamil; Miłkowski, Piotr; Bojanowski, Bartosz; Zaśko-Zielińska, Monika

dc.contributor.author	Kocoń, Jan
dc.contributor.author	Kanclerz, Kamil
dc.contributor.author	Miłkowski, Piotr
dc.contributor.author	Bojanowski, Bartosz
dc.contributor.author	Zaśko-Zielińska, Monika
dc.date.accessioned	2020-04-02T14:11:13Z
dc.date.available	2020-04-02T14:11:13Z
dc.date.issued	2020-04-02
dc.identifier.uri	http://hdl.handle.net/11321/737
dc.description	PolEmo 1.0 + MultiEmo-Test 1.0: Corpus of Multi-Domain Consumer Reviews. The test dataset from PolEmo 1.0 was translated into eight languages: Dutch, English, French, German, Italian, Portuguese, Russian and Spanish. Citation:
@article{KANCLERZ2020128,
  title = {Cross-lingual deep neural transfer learning in sentiment analysis},
  journal = {Procedia Computer Science},
  volume = {176},
  pages = {128-137},
  year = {2020},
  note = {Knowledge-Based and Intelligent Information & Engineering Systems: Proceedings of the 24th International Conference KES2020},
  issn = {1877-0509},
  doi = {https://doi.org/10.1016/j.procs.2020.08.014},
  url = {https://www.sciencedirect.com/science/article/pii/S187705092031838X},
  author = {Kamil Kanclerz and Piotr Miłkowski and Jan Kocoń},
  keywords = {natural language processing, sentiment analysis, polarity recognition, transfer learning, deep learning, multilingual approach},
  abstract = {In this article, we present a novel technique for the use of language-agnostic sentence representations to adapt the model trained on texts in Polish (as a low-resource language) to recognize polarity in texts in other (high-resource) languages. The first model focuses on the creation of a language-agnostic representation of each sentence. The second one aims to predict the sentiment of the text based on these sentence representations. Besides models evaluation on PolEmo 1.0 Sentiment Corpus, we also conduct a proof of concept for using a deep neural network model trained only on language-agnostic embeddings of texts in Polish to predict the sentiment of the texts in MultiEmo-Test 1.0 Sentiment Corpus, containing PolEmo 1.0 test datasets translated into eight different languages: Dutch, English, French, German, Italian, Portuguese, Russian and Spanish. Both corpora are publicly available under a Creative Commons copyright license.}
}
dc.language.iso	pol
dc.language.iso	eng
dc.language.iso	nld
dc.language.iso	fra
dc.language.iso	deu
dc.language.iso	ita
dc.language.iso	por
dc.language.iso	rus
dc.language.iso	spa
dc.publisher	Wrocław University of Science and Technology
dc.rights	Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.rights.label	CC
dc.subject	sentiment
dc.subject	sentiment analysis
dc.subject	transfer learning
dc.subject	corpus
dc.subject	multilingual
dc.title	PolEmo 1.0 + MultiEmo-Test 1.0 Multilingual Sentiment Analysis Dataset for KES2020
dc.type	corpus
metashare.ResourceInfo#ContentInfo.mediaType	text
has.files	yes
branding	CLARIN-PL
contact.person	Jan Kocoń jan.kocon@pwr.edu.pl Wrocław University of Science and Technology
sponsor	Ministry of Science and Higher Education (Poland) N/A CLARIN-PL nationalFunds
size.info	60 files
size.info	32134 entries
files.size	14978081
files.count	2
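
The record above repeats some keys (for example `dc.language.iso` and `dc.subject`), so anyone reading it programmatically has to collect values into lists rather than overwrite them. The following is a minimal sketch, assuming the fields are separated by a single tab as they appear here; the `parse_record` helper and the `RECORD` sample are hypothetical names introduced for illustration and are not part of any repository API.

```python
from collections import defaultdict

# A few lines copied from the record above; key and value are assumed
# to be separated by one tab character.
RECORD = "\n".join([
    "dc.language.iso\tpol",
    "dc.language.iso\teng",
    "dc.subject\tsentiment",
    "files.size\t14978081",
    "files.count\t2",
])


def parse_record(text):
    """Group values by key; repeated keys accumulate into a list."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank lines and lines that are not key/value fields
        key, value = line.split("\t", 1)
        fields[key].append(value.strip())
    return dict(fields)


record = parse_record(RECORD)

# files.size is given in bytes; dividing by 2**20 gives the size in the
# unit the download link appears to use.
size_mib = int(record["files.size"][0]) / 2**20

print(record["dc.language.iso"])  # ['pol', 'eng']
print(f"{size_mib:.2f} MB")       # 14.28 MB
```

The computed size agrees with the 14.28 MB shown for the combined download, which suggests that `files.size` is a byte count and that the page reports sizes in units of 1024 × 1024 bytes.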

Files in this item

Download all files in item (14.28 MB)

This item is distributed under Creative Commons and licensed under:
Creative Commons - Attribution 4.0 International (CC BY 4.0)

Name: MultiEmo 1.0.7z
Size: 4.68 MB
Format: Unknown
Description: Unknown

Name: MultiEmo 1.0.zip
Size: 9.61 MB
Format: application/zip
Description: Unknown
