Machine idioms

Machine Idioms is a DSC-XML file containing all idioms that were detected automatically in the SONAR corpus. For each detected idiom, the following information is stored:

+ form: the canonical form of the idiom

+ idiom_id: the identifier of the idiom in Cornetto

+ lemma: the lemma of the example

+ pos: the part-of-speech

+ token_id: the token identifier of the example in the SONAR corpus
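
As a rough illustration of how the five fields above might be read, here is a minimal Python sketch. The DSC-XML layout itself is not shown on this page, so the `idioms`/`idiom` element names, the choice to store each field as an XML attribute, and every value in the sample are assumptions made up for the example. Only the field names (`form`, `idiom_id`, `lemma`, `pos`, `token_id`) come from the list above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample record. The element names, the attribute encoding and
# all values are invented for illustration; the real DSC-XML file may differ.
SAMPLE = """<idioms>
  <idiom form="de kat uit de boom kijken" idiom_id="idiom-0001"
         lemma="kat" pos="noun" token_id="doc-0001.p.1.s.1.w.3"/>
</idioms>"""

# The five fields listed in the description of the resource.
FIELDS = ("form", "idiom_id", "lemma", "pos", "token_id")


@dataclass
class IdiomRecord:
    form: str      # canonical form of the idiom
    idiom_id: str  # identifier of the idiom in Cornetto
    lemma: str     # lemma of the example
    pos: str       # part-of-speech
    token_id: str  # token identifier of the example in the SONAR corpus


def parse_idioms(xml_text):
    """Return one IdiomRecord per (assumed) <idiom> element."""
    root = ET.fromstring(xml_text)
    records = []
    for elem in root.iter("idiom"):
        # Missing attributes become empty strings instead of raising an error.
        values = {name: elem.get(name, "") for name in FIELDS}
        records.append(IdiomRecord(**values))
    return records


records = parse_idioms(SAMPLE)
for record in records:
    print(record.idiom_id, record.form, record.token_id)
```

If the downloaded file stores the fields as child elements rather than attributes, `elem.get(name, "")` would need to be replaced by something like `elem.findtext(name, "")`.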


(Download in DSC-XML)
