Proposed Project

Cornetto will build a lexical semantic database for Dutch, covering 40K entries, including the most generic and central part of the language and a specialized database for the domain of financial law. The databases will contain:
– vertical semantic relations, i.e. hyponym and synonym relations;
– horizontal semantic relations, such as roles, part-whole relations, causal relations;
– combinatorial relations, such as lexical functions, selectional restrictions, collocations, syntactico-semantic frames;
– a top-level ontology;
– an ontological typing of the lexical units;
– a domain ontology (WordNet Domains);
– a domain labelling of the lexical units;
– an equivalence mapping to synsets in WordNet 2.x;
– morpho-syntactic information, including syntactic complementation frames.
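
To make the list above more concrete, the following is a minimal sketch of how such records might be modelled in code. It is not the Cornetto database schema: every class name, field name, relation label and identifier below is a hypothetical choice made for illustration, and the three toy entries are invented sample data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LexicalUnit:
    """One word sense. All field names are illustrative assumptions."""
    lemma: str
    pos: str
    domain_labels: List[str] = field(default_factory=list)
    ontology_types: List[str] = field(default_factory=list)
    complementation_frames: List[str] = field(default_factory=list)
    collocations: List[str] = field(default_factory=list)


@dataclass
class Synset:
    """A set of synonymous lexical units plus its links to other synsets."""
    synset_id: str
    members: List[LexicalUnit] = field(default_factory=list)
    # Relation label -> target synset ids, e.g. "hypernym", "part_of", "causes".
    relations: Dict[str, List[str]] = field(default_factory=dict)
    # Ids of equivalent WordNet synsets (format left open here).
    wordnet_equivalents: List[str] = field(default_factory=list)


def hypernym_chain(synsets: Dict[str, Synset], start_id: str) -> List[str]:
    """Walk upward along the first 'hypernym' link of each synset.

    'hypernym' is the inverse of the hyponym relation; the label is assumed.
    Stops at a synset without hypernyms, at an unknown id, or on a cycle.
    """
    chain: List[str] = []
    seen = set()
    current = start_id
    while current in synsets and current not in seen:
        seen.add(current)
        chain.append(current)
        targets = synsets[current].relations.get("hypernym", [])
        if not targets:
            break
        current = targets[0]
    return chain


# Invented sample data: bank -> instelling -> organisatie.
demo = {
    "s1": Synset("s1", [LexicalUnit("bank", "noun", domain_labels=["finance"])],
                 relations={"hypernym": ["s2"]}),
    "s2": Synset("s2", [LexicalUnit("instelling", "noun")],
                 relations={"hypernym": ["s3"]}),
    "s3": Synset("s3", [LexicalUnit("organisatie", "noun")]),
}
```

Calling hypernym_chain(demo, "s1") follows the vertical relations from the most specific synset to the most general one. Horizontal and combinatorial relations would be stored under other labels or fields in the same way.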

Furthermore, Cornetto will include:
– an open-source and public database system with editor, import/export functions and API;
– the methodology and toolkit for acquiring new concepts and relations from corpora;
– the methodology and toolkit for tuning and customizing to a specific domain.

Cornetto will be validated in Information Retrieval and Question Answering applications. The domain database will be evaluated by a user group of companies that have specifically requested such a domain-tuned database.
________________________________________
Last update: 22 October, 2008, p.vossen (at) let.vu.nl
