Idiom Tagger


The idiomTagger is a tool developed within the DSC to automatically detect idiomatic uses of words. The main class is IdiomClass, and the main method for deciding whether a particular use of a word is idiomatic is IdiomClass.isIdiomStr(…). This method takes four parameters: the lemma, the part of speech (pos), the context, and the string used to enclose the lemma within that context.

An example of how to use the tagger:



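lemma = 'drempel'
pos = 'n'  # assumed value: the original example does not show how pos is set
enclosedBy = '###'  # marker that surrounds the lemma inside the context
context = ' blijft de prijs voor een aantal opdrachtgevers een ###drempel### vormen . Dit is vooral het geval'

objIdioms = IdiomClass(pos)
data = objIdioms.isIdiomStr(lemma, pos, context, enclosedBy)
print(data)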
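
The context string has to contain the target word wrapped in the enclosing marker. A minimal sketch of building such a string from a tokenised sentence might look as follows; the helper name mark_target and its parameters are hypothetical and are not part of the idiomTagger itself.

```python
def mark_target(tokens, index, enclosed_by="###"):
    """Return the sentence as one string with tokens[index] wrapped in the marker."""
    marked = list(tokens)  # copy so the caller's list is left untouched
    marked[index] = enclosed_by + marked[index] + enclosed_by
    return " ".join(marked)


# Tokens taken from the example sentence above.
tokens = "blijft de prijs voor een aantal opdrachtgevers een drempel vormen .".split()
context = mark_target(tokens, tokens.index("drempel"))
print(context)
```

The resulting string can then be passed as the context argument, with the same marker given as the fourth parameter.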

The returned data is a tuple with the following fields:

+ 1 True/False: whether the usage is an idiomatic expression

+ 2 A string containing the canonical form of the idiom in this case

+ 3 The context of the idiomatic expression

+ 4 The lexical unit identifier of the idiom
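
Because the result is a plain tuple, it can help to give the four fields names before using them. The sketch below assumes the field order listed above; the names IdiomResult and describe are hypothetical, and the sample tuple is an invented placeholder, not real tagger output.

```python
from collections import namedtuple

# Field names are illustrative; the order follows the list above.
IdiomResult = namedtuple(
    "IdiomResult",
    ["is_idiom", "canonical_form", "context", "lexical_unit_id"],
)


def describe(data):
    """Turn the 4-field tuple into a short human-readable summary."""
    result = IdiomResult(*data)
    if result.is_idiom:
        return "idiomatic: {} (lexical unit {})".format(
            result.canonical_form, result.lexical_unit_id
        )
    return "not idiomatic"


# Invented placeholder values, standing in for a tuple returned by isIdiomStr.
sample = (True, "een drempel vormen", "een ###drempel### vormen", "lu-placeholder")
print(describe(sample))
```

Named fields avoid relying on positional indexes such as data[3] when only the lexical unit identifier is needed.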


