PolLinguaTech Knowledge Centre, created by CLARIN-PL LTC, has obtained the CLARIN K Centre certificate

PolLinguaTech was founded to provide researchers with essential knowledge about applications of natural language processing (NLP), with special emphasis on Polish language processing. The Centre provides indispensable documentation (technical documents, user manuals), organises training and offers help and advice at every stage of research.

PolLinguaTech operates within the CLARIN-PL Language Technology Centre (LTC), one of the leading language technology research and development centres focusing on the Polish language. PolLinguaTech's offer is identical to the current offer of LTC. Obtaining the CLARIN K Centre certificate will make it possible to better coordinate activities that promote knowledge about language technologies and to provide user assistance in an institutionalised form.

We encourage you to take advantage of the offer of LTC (CTJ) and PolLinguaTech and to contact our staff.

CTJ CLARIN-PL LTC webpage: LINK

PolLinguaTech – Knowledge webpage: LINK

CLARIN K Centre Certificate: LINK

The 6th CLARIN Annual Conference – 2nd CALL FOR PAPERS (Budapest, 18th-20th September 2017)

2nd CALL FOR PAPERS

CLARIN ERIC is happy to announce the 6th CLARIN Annual Conference and invites submission of papers.

LOCATION

The 6th CLARIN Annual Conference will be held in Budapest, Hungary.

IMPORTANT DATES

  • 1st February, 2017 First call published and submission system open
  • 8th May, 2017 Submission deadline
  • 24th June, 2017 Notification of acceptance
  • 1st September, 2017 Final version of extended abstracts due
  • 18th–20th September 2017 CLARIN Annual Conference

CONFERENCE AIMS

The CLARIN Annual Conference is organised for the Humanities and Social Sciences community in order to exchange ideas and experiences on the CLARIN infrastructure. This includes its design, construction and operation, the data and services that it contains or should contain, its actual use by researchers, its relation to other infrastructures and projects, and the CLARIN Knowledge Sharing Infrastructure.

CONFERENCE TOPICS

Operation and use of the CLARIN infrastructure, e.g.

  • Use of the CLARIN infrastructure in humanities and social sciences research, including needs for updated and new functionality
  • Usability studies and evaluations of CLARIN services
  • Analysis of the CLARIN infrastructure usage, identification of user audience and impact studies
  • Showcases, demonstrators and research in humanities and social sciences that is relevant to CLARIN
  • Models for the sustainability of the infrastructure, including issues in curation, migration, financing and cooperation
  • Legal and ethical issues in operating the infrastructure

Design and construction of the CLARIN infrastructure, e.g.

  • Metadata and concept registries, cataloguing and browsing
  • Persistent identifiers
  • Access, including Single Sign On Authentication and Authorisation
  • Search, including Federated Content Search
  • Web applications, web services, workflows and use of the infrastructure
  • Standards and solutions for interoperability of language resources, tools and services

CLARIN Knowledge Infrastructure and Dissemination, e.g.

  • User assistance (helpdesks, user manuals, FAQs)
  • CLARIN portals and outreach to users
  • Videos, screen casts, recorded lectures
  • Researcher training activities
  • Knowledge infrastructure centres

CLARIN in relation with other infrastructures and projects, e.g.

  • Relations with other SSH research infrastructures such as DARIAH, CESSDA, etc.
  • Relations with meta-infrastructure projects such as EUDAT and RDA
  • Relations with national and regional initiatives

THEMATIC SESSION: Multilingual Processing for Humanities and Social Sciences

The Humanities and Social Sciences (H&SS) have formulated research questions pertaining to different languages. However, the number of research tasks in H&SS in which Language Technology has been applied to cross language barriers and analyse the same phenomena in material expressed in different languages is relatively small. The situation is better in the case of genuine linguistic research, but multilingual research applications in H&SS are mostly based on a kind of ‘bag of words’ model, and very rarely utilise more advanced multilingual Language Technology methods.

The general aim of this thematic session is to present examples of multilingual approaches in H&SS research relevant to CLARIN, and to discuss infrastructural solutions to the problem of multilingual interoperability of the Language Technology that are necessary for more advanced research in H&SS. We expect to organise presentations and discussions during the session on the following aspects:

  1. Examples of applications of Language Technology to multilingual processing for the needs of research in H&SS.
  2. Research tasks and ongoing projects in H&SS on the basis of multilingual material and application of Language Technology.
  3. Interoperability of language resources and tools for the needs of multilingual applications in H&SS: models for linking, standards and formats, mapping and linking algorithms, complex processing methods, architectures and platforms.

We invite submissions describing CLARIN-relevant work addressing these aspects. Submissions (for oral presentations, posters, or demos) intended for the thematic session should be marked as such, and will be evaluated with respect to their appropriateness for the theme, in addition to the general acceptance criteria listed below.

PROGRAM

The scientific program of both the general sessions and the thematic session will include oral presentations, posters, and demos. There is no difference in quality between oral and poster presentations; only the appropriateness of the type of communication (more or less interactive) to the content of the paper will be considered.

SUBMISSIONS

Proposals for oral presentations, poster presentations and/or demos must be submitted as extended abstracts (length: up to four A4 pages including references) in PDF format, in accordance with the template provided on the website.

It is not required that the authors are or have been directly involved in national or international CLARIN projects, but their work must be clearly relevant to the CLARIN activities, resources, tools or services.

Extended abstracts must be submitted through the EasyChair submission system (link) and will be reviewed by the program committee.

All proposals will be reviewed on the basis of both individual criteria and global criteria. The latter include thematic, linguistic and geographical spread. Individual acceptance criteria are the following:

  • Appropriateness: the contribution must pertain to the CLARIN infrastructure or be relevant for it (e.g. use CLARIN, contribute to the CLARIN design, construction, operation, exploitation, illustrate possible applications etc.). In addition, submissions to the thematic session will be selected on the basis of their appropriateness to the theme.
  • Soundness and correctness: the content must be technically and factually correct and methods must be scientifically sound, according to best practice, and preferably evaluated.
  • Meaningful comparison: the abstract must indicate that the author is aware of alternative approaches, if any, and highlight relevant differences.
  • Substance: concrete work and experiences will be preferred over ideas and plans.
  • Impact: contributions with a higher impact on the research community and society at large will be preferred over papers with lower impact.
  • Clarity: the extended abstract must be informative, clear and understandable for the CLARIN audience.
  • Timeliness and novelty: the work must convey relevant new knowledge to the audience at this event.

ATTENDANCE

One author of each accepted paper will be granted free participation in the conference and meals. Selected authors will be granted reimbursement of travel (up to 220 Euros) and accommodation costs.

PROCEEDINGS

If the submission is accepted, it will be published (possibly in revised form) in the conference Book of Abstracts. After the conference, the author(s) will be invited to submit a full paper (max. 12 pages) to be reviewed according to the same criteria as the abstracts. Accepted full papers will be digitally published in a conference proceedings volume at Linköping University Electronic Press within about 6 months after the conference.

CONFERENCE PROGRAM COMMITTEE

The program committee for the conference consists of the following members:

  • Jan Theo Bakker, Dutch Language Union, The Netherlands/Flanders
  • Lars Borin, University of Gothenburg, Sweden
  • António Branco, University of Lisbon, Portugal
  • Koenraad De Smedt, University of Bergen, Norway
  • Tomaž Erjavec, Jožef Stefan Institute, Slovenia
  • Eva Hajičová, Charles University Prague, Czech Republic
  • Erhard Hinrichs, University of Tübingen, Germany
  • Krister Lindén, University of Helsinki, Finland
  • Bente Maegaard, University of Copenhagen, Denmark
  • Monica Monachini, Institute for Computational Linguistics «A. Zampolli», Italy
  • Karlheinz Mörth, Austrian Academy of Sciences, Austria
  • Jan Odijk, Utrecht University, the Netherlands
  • Maciej Piasecki, Wrocław University of Science and Technology, Poland (chair)
  • Stelios Piperidis, ILSP, Athena Research Center, Greece
  • Kiril Simov, IICT, Bulgarian Academy of Sciences, Bulgaria
  • Inguna Skadiņa, University of Latvia, Latvia
  • Jurgita Vaičenonienė, Vytautas Magnus University, Lithuania
  • Tamás Váradi, Research Institute for Linguistics, Hungarian Academy of Sciences, Hungary
  • Kadri Vider, University of Tartu, Estonia
  • Martin Wynne, University of Oxford, UK


Challenges for Wordnets. A workshop co-located with LDK 2017

Challenges for Wordnets

A workshop co-located with LDK 2017:
The first conference on Language, Data and Knowledge

18 June, Galway, Ireland

PROGRAM

Wordnets

Wordnets are increasingly widely used to model word meaning in natural language processing tasks. However, there are still many challenges in accurately describing word meanings and making these descriptions useful for both human and machine users. This workshop aims to identify, discuss and start to solve existing challenges.

CALL FOR PAPERS

Much research on wordnets focuses either on their construction or their use in some application. Few papers bridge the gap by discussing how different wordnet models and construction methods affect their effectiveness in use, or how different applications require different parts of language to be modeled.

For this reason, we are experimenting with a new workshop series, Challenges for Wordnets, to give wordnet users and developers a chance to share experiences, both good and bad. It will be co-located with the First Conference on Language, Data and Knowledge (LDK 2017) in Galway, Ireland. The workshop will start with some short presentations and then finish with an extended discussion based on the challenges presented. We welcome position statements on the following (or related) topics:

  • Issues for Modeling Languages
    – missing parts-of-speech in wordnet (e.g. prepositions, conjunctions)
    – incomplete representation (e.g. semantics of adverbs)
    – links to examples/corpora
    – what is a wordnet
    – basic building blocks of a wordnet
    – language/dialect differences
  • Issues of Compatibility
    – integration with other resources
    – wordnets vs ontologies
    – licenses
  • Application Issues
    – consistent coverage
    – named entities
    – scaling up
    – wordnet services: WSD, similarity…
    – maintenance
    – versioning and updating
  • Evaluation
    – quality measures
    – experts vs crowds
    – translation vs monolingual construction

 

Please submit papers of 6 to 10 pages, excluding references, formatted according to the Springer Lecture Notes in Artificial Intelligence guidelines. Submissions should be anonymous. Each submission will be reviewed by at least 3 reviewers and will be made available online prior to the workshop.

Authors of good submissions will be invited to submit extended versions for a special issue of the Cognitive Studies | Études cognitives journal. The extended versions will be carefully peer reviewed, but the scope of this special issue will be set in advance.

Papers should be submitted via EasyChair

The workshop is supported by the CLARIN-PL research infrastructure.

The LDK 2017 conference received support from the Global WordNet Association Board.

KEYNOTE SPEAKER

Prof. Eduard Hovy, Carnegie Mellon University, Language Technologies Institute

IMPORTANT DATES

Paper submission: 10 April
Notification of Acceptance: 30 April
Workshop Date: 18 June 14:00-18:00

ORGANIZING COMMITTEE

Maciej Piasecki – Wroclaw University of Technology
Francis Bond – Nanyang Technological University
John P. McCrae – Insight Centre for Data Analytics, National University of Ireland, Galway
Jan Wieczorek – Wroclaw University of Technology

PROGRAM COMMITTEE

Eneko Agirre – University of the Basque Country
Sonja Bosch – University of South Africa, Pretoria
Christiane D. Fellbaum – Princeton University
Darja Fišer – University of Ljubljana
Antoni Oliver Gonzalez – Open University of Catalonia
Shu-Kai Hsieh – National Taiwan University
John P. McCrae – Insight Centre for Data Analytics, National University of Ireland, Galway
Verginica Mititelu – Romanian Academy
Monica Monachini – National Research Council of Italy
Adam Pease – Articulate Software
Bolette Sandford Pedersen – University of Copenhagen
Maciej Piasecki – Wroclaw University of Technology (chair)
German Rigau – Polytechnic University of Catalonia
Ewa Rudnicka – Wroclaw University of Technology
Shikhar Kr. Sarma – Gauhati University
Stanisław Szpakowicz – Emeritus Professor, University of Ottawa
Veronika Vincze – University of Szeged
Piek Th. J. M. Vossen – VU University Amsterdam

 

Contact:

Address any questions to Maciej Piasecki and Francis Bond at <clarin-pl@pwr.edu.pl>.

We have released a word vector model trained with neural networks

The published model represents natural-language words as multidimensional vectors. The vector components correspond to hidden properties of words that emerge from the layers of the neural network. The model can be used to measure the degree of similarity between the words that given vectors represent, and thus to generate word frequency lists.

The model is available in our repository on the DSpace platform: https://clarin-pl.eu/dspace/handle/11321/327
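
Word similarity in a vector model of this kind is commonly measured as the cosine of the angle between two word vectors. The sketch below illustrates that idea only; the three-dimensional toy vectors, the example words and the function names are assumptions made for illustration and are not taken from the published model, whose file format and dimensionality are not described here.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector carries no direction
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, top_n=3):
    """Rank all other words by cosine similarity to `word`, best first."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy vectors; a real model has far more dimensions.
toy_vectors = {
    "kot": [0.9, 0.1, 0.0],       # "cat"
    "pies": [0.8, 0.2, 0.1],      # "dog"
    "samochód": [0.0, 0.1, 0.9],  # "car"
}

ranking = most_similar("kot", toy_vectors)
for other, score in ranking:
    print(f"{other}\t{score:.3f}")
```

With real data, the toy dictionary would be replaced by vectors loaded from the downloaded model file.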

Privacy policy

Name of the service

CLARIN-PL/CLARIN repository and services

Description of the service

Repository and services of the CLARIN-PL/CLARIN project at Wrocław University of Technology

Data controller and contact person

Institute of Computer Science
Wyb. Wyspiańskiego 27
50-370 Wroclaw
Poland

Technical contact:
Marcin Pol
macin.pol at pwr.edu.pl

Jurisdiction

PL, Poland

Personal data processed

The following personal information is fetched from the Identity Provider server of your home organisation every time you log in to the service:

  • name
  • email address
  • unique ID to identify the user

Purpose of the processing of personal data

The repository and services process personal data and log files for:

  • user authentication
  • user identification when signing licenses
  • user authorisation in special cases
  • automated sending of email messages necessary for use of the services (password reset, submission information etc.)
  • monitoring the load and capacity of the underlying server and the network
  • working out technical problems and investigating abuse of the service
  • statistics and development of the service

Third parties to whom personal data is disclosed

The personal data are not disclosed to anyone outside of the CLARIN-PL/CLARIN team.

How to access, rectify and delete the personal data

The personal information is visible in your profile. If there is no profile, no personal data is directly stored. Use the technical contact above for any requests. To rectify the data released by your Home Organisation, contact your Home Organisation's IT helpdesk.

Data retention

Personal data is deleted on request of the user or if the user hasn't used the service for five years.

Data Protection Code of Conduct

Your personal data will be protected according to the Code of Conduct for Service Providers, a common standard for the research and higher education sector to protect your privacy.

Tagger

The service performs tokenization and morpho-syntactic tagging for the Polish language using the WCRF Tagger tool.

Before analysis:

Ala ma kota.

After analysis:

<?xml version="1.0" encoding="UTF-8"?> 
<!DOCTYPE chunkList SYSTEM "ccl.dtd"> 
<chunkList> 
 <chunk id="ch1" type="p"> 
  <sentence id="s1"> 
   <tok> 
    <orth>Ala</orth> 
    <lex disamb="1"><base>Ala</base><ctag>subst:sg:nom:f</ctag></lex> 
   </tok> 
   <tok> 
    <orth>ma</orth> 
    <lex disamb="1"><base>mieć</base><ctag>fin:sg:ter:imperf</ctag></lex> 
   </tok> 
   <tok> 
    <orth>kota</orth> 
    <lex disamb="1"><base>kot</base><ctag>subst:sg:acc:m2</ctag></lex> 
   </tok> 
   <ns/> 
   <tok> 
    <orth>.</orth> 
    <lex disamb="1"><base>.</base><ctag>interp</ctag></lex> 
   </tok> 
  </sentence> 
 </chunk> 
</chunkList>
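
The XML output above can be read with a standard XML parser. The following is a minimal sketch using Python's built-in `xml.etree.ElementTree`, assuming the output always has the structure of the sample (`tok` elements containing `orth` and `lex`, with `base` and `ctag` inside `lex`). The function name `read_tokens` is illustrative and is not part of the service.

```python
import xml.etree.ElementTree as ET

# The sample output shown above, embedded so the sketch is self-contained.
CCL_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chunkList SYSTEM "ccl.dtd">
<chunkList>
 <chunk id="ch1" type="p">
  <sentence id="s1">
   <tok>
    <orth>Ala</orth>
    <lex disamb="1"><base>Ala</base><ctag>subst:sg:nom:f</ctag></lex>
   </tok>
   <tok>
    <orth>ma</orth>
    <lex disamb="1"><base>mieć</base><ctag>fin:sg:ter:imperf</ctag></lex>
   </tok>
   <tok>
    <orth>kota</orth>
    <lex disamb="1"><base>kot</base><ctag>subst:sg:acc:m2</ctag></lex>
   </tok>
   <ns/>
   <tok>
    <orth>.</orth>
    <lex disamb="1"><base>.</base><ctag>interp</ctag></lex>
   </tok>
  </sentence>
 </chunk>
</chunkList>
"""


def read_tokens(ccl_bytes):
    """Return (orth, base, ctag) for every token in a CCL document."""
    root = ET.fromstring(ccl_bytes)
    tokens = []
    for tok in root.iter("tok"):
        orth = tok.findtext("orth")
        # Prefer the interpretation marked as disambiguated, if any.
        lex = tok.find("lex[@disamb='1']")
        if lex is None:
            lex = tok.find("lex")
        base = lex.findtext("base") if lex is not None else None
        ctag = lex.findtext("ctag") if lex is not None else None
        tokens.append((orth, base, ctag))
    return tokens


tokens = read_tokens(CCL_SAMPLE.encode("utf-8"))
for orth, base, ctag in tokens:
    print(f"{orth}\t{base}\t{ctag}")
```

The parser does not fetch the external `ccl.dtd` file, so the sketch runs without it.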

								
							

This is a web client for the WCRF Tagger service.

If you need automated processing on a larger scale, you can connect to our service via the SOAP or REST protocol. The service is available at this link: WSDL


Authors of the tool: Adam Radziszewski, Radosław Warzocha