Vector representations of Polish words (Word2Vec method)
CLARIN-PL
Authors
Kędzia, Paweł; Czachor, Gabriela; Piasecki, Maciej; Kocoń, Jan
Date issued
2016-11-07
Type
lexicalConceptualResource
Size
4000000000 tokens
Language(s)
Polish
Description
Skip-gram model with vectors of length 100, trained on kgr 10, a corpus of over 4 billion tokens. Data preprocessing involved segmentation, lemmatization, and morphosyntactic disambiguation with multiword expression (MWE) annotation.
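The record does not state how the vectors are stored inside the archive. As a minimal sketch, assuming the extracted file uses the common word2vec text format (a header line with vocabulary size and vector length, then one token and its values per line), the vectors could be read and compared with the standard library alone. The function names and the toy three-dimensional sample below are illustrative and not taken from the resource; the real vectors would have length 100.

```python
import io
import math


def read_word2vec_text(stream):
    """Parse the word2vec text format (an assumed format for this resource).

    First line: '<vocab_size> <vector_length>'.
    Following lines: '<token> <v1> ... <vN>', separated by single spaces.
    """
    header = stream.readline().split()
    declared_count, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip lines that do not match the declared length
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return declared_count, dim, vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy sample standing in for the real file (hypothetical values).
sample = io.StringIO(
    "3 3\n"
    "kot 1.0 0.0 0.0\n"
    "pies 0.8 0.6 0.0\n"
    "dom 0.0 0.0 1.0\n"
)
declared_count, dim, vectors = read_word2vec_text(sample)
sim_kot_pies = cosine(vectors["kot"], vectors["pies"])
sim_kot_dom = cosine(vectors["kot"], vectors["dom"])
```

For the actual resource, the in-memory sample would be replaced by the extracted file opened in text mode with UTF-8 encoding. Because the corpus was lemmatized, lookups would presumably need lemmas rather than inflected forms, and how MWE tokens are written in the file would need to be checked against the data.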
Publisher
Wrocław University of Technology
Subject(s)
Vector space
Word2Vec
Collection(s)
CLARIN-PL
Files in this item
This item is publicly available and licensed under GNU LGPL 3.0.
Name
skipgram_v100.zip
Size
877.62 MB
Format
application/zip