dc.contributor.author |
Kędzia, Paweł |
dc.contributor.author |
Czachor, Gabriela |
dc.contributor.author |
Piasecki, Maciej |
dc.contributor.author |
Kocoń, Jan |
dc.date.accessioned |
2016-11-07T10:34:36Z |
dc.date.available |
2016-11-07T10:34:36Z |
dc.date.issued |
2016-11-07 |
dc.identifier.uri |
http://hdl.handle.net/11321/327 |
dc.description |
Skip-gram model with vectors of length 100, trained on KGR10, a corpus of over 4 billion tokens. Data preprocessing involved segmentation, lemmatization, and morphosyntactic disambiguation with MWE annotation. |
dc.language.iso |
pol |
dc.publisher |
Wrocław University of Technology |
dc.rights |
GNU LGPL 3.0 |
dc.rights.uri |
http://www.gnu.org/licenses/lgpl.html |
dc.rights.label |
PUB |
dc.subject |
Vector space |
dc.subject |
Word2Vec |
dc.title |
Vector representations of Polish words (Word2Vec method) |
dc.type |
lexicalConceptualResource |
metashare.ResourceInfo#ContentInfo.detailedType |
wordList |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
hasMetadata |
false |
has.files |
yes |
branding |
CLARIN-PL |
contact.person |
Paweł Kędzia Pawel.Kedzia@pwr.edu.pl Wrocław University of Technology |
size.info |
4000000000 tokens |
files.size |
920249571 |
files.count |
1 |