Project database

This page lists the national and international projects in which FER participates or has participated as project coordinator or partner.


Projects

Project

Acronym:
SenseHive 
Name:
SenseHive: Dynamic Crowdsourcing Models for Incremental Construction of Lexico-Semantic Resources 
Project status:
From: 2015-10-01 To: 2018-09-30 (Completed)
Type (Programme):
HRZZ 

Croatian partner

Organisation name:
Contact person name:
doc. dr. sc. Jan Šnajder
Contact person tel:

Short description of project

Lexico-semantic resources play an essential role in natural language processing and related applications such as information retrieval. Unfortunately, their construction is extremely costly and rarely guided by practical considerations, which poses a problem especially for less-resourced languages. One possible solution is to crowdsource lexico-semantic resources. Although crowdsourcing has proven to be a viable option for reducing the overall costs, there is still no comprehensive crowdsourcing methodology for the incremental construction of large-scale lexico-semantic resources. This project aims to fill this gap by investigating computational models and methods for incremental and cost-efficient crowdsourcing of lexico-semantic resources. The research will combine dynamic crowdsourcing, corpus-based models of semantics (distributional semantics and topic models), and active machine learning methods into a comprehensible and language-independent crowdsourcing framework, SenseHive. SenseHive consists of a flexible, graph-based representation of senses and lexico-semantic relations (SenseGraph), coupled with an incremental construction methodology. In SenseGraph, senses are dynamically split and merged based on the analysis of human judgments on corpus-extracted data. In the first phase, we will implement a prototype of the SenseHive framework and use it for focused evaluation experiments on Croatian, Slovene, and English data to answer the relevant research questions. In the second phase, as a proof of concept, we will use SenseHive to construct a medium-sized lexico-semantic resource for Croatian by enlarging and enriching existing lexico-semantic resources. The proposed research will advance the state of the art in computational lexical semantics and semi-automated construction of linguistic resources, and will yield a lexico-semantic resource for Croatian of great practical value.