Automatically enriching a thesaurus with information from dictionaries



Because information in broad-coverage knowledge bases, such as thesauri, is usually incomplete, merging information from different sources is a good way to increase coverage. We propose a method for enriching a thesaurus with information acquired automatically from dictionaries: pairs of synonyms are assigned to candidate synsets, and the pairs whose elements are not in the thesaurus are clustered to identify new synsets. This method was used to enrich a Brazilian Portuguese thesaurus with synonyms from a European Portuguese dictionary, resulting in a larger and broader thesaurus with new words and new concepts. The assignments and the obtained synsets were manually evaluated and yielded correctness scores higher than 71% and 85%, respectively.
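The two steps named in the abstract (assigning synonym pairs to candidate synsets, then clustering the pairs that have no word in the thesaurus) can be illustrated with a minimal sketch. The abstract does not state how a candidate synset is chosen or which clustering algorithm is used, so everything below is an assumption made for illustration: the function name `enrich_thesaurus`, the overlap-based choice of synset, and the use of connected components as the clustering step are all hypothetical stand-ins, not the authors' method.

```python
from collections import defaultdict


def enrich_thesaurus(synsets, pairs):
    """Illustrative only: synsets is a list of word sets, pairs a list of (a, b) tuples."""
    original = [frozenset(s) for s in synsets]
    enriched = [set(s) for s in original]
    vocabulary = set().union(*original)

    # Dictionary neighbourhood of each word, built from the synonym pairs.
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    leftover = []
    for a, b in pairs:
        if a not in vocabulary and b not in vocabulary:
            # Neither word is in the thesaurus: keep the pair for clustering.
            leftover.append((a, b))
            continue
        # Assumed scoring: pick the synset sharing the most words with the
        # pair and its dictionary neighbours (ties go to the first synset).
        context = {a, b} | neighbours[a] | neighbours[b]
        best = max(range(len(original)), key=lambda i: len(context & original[i]))
        enriched[best].update((a, b))

    # Assumed clustering: connected components of the leftover synonym graph.
    graph = defaultdict(set)
    for a, b in leftover:
        graph[a].add(b)
        graph[b].add(a)
    seen, new_synsets = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word in component:
                continue
            component.add(word)
            stack.extend(graph[word] - component)
        seen |= component
        new_synsets.append(component)
    return enriched, new_synsets


# Toy data, invented for this example.
synsets = [{"carro", "auto"}, {"casa", "lar"}]
pairs = [("carro", "viatura"), ("lar", "moradia"), ("zanga", "ira"), ("ira", "raiva")]
enriched, new_synsets = enrich_thesaurus(synsets, pairs)
```

On this toy input, "viatura" and "moradia" are added to the existing synsets, while "zanga", "ira" and "raiva" form one new synset. A real system would need a more careful similarity measure and a threshold below which a pair is left unassigned.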


thesaurus, wordnet, enrichment, synonymy


Natural Language Processing



15th Portuguese Conference on Artificial Intelligence (EPIA 2011), Lisbon, Portugal 2011
