Towards the automatic enrichment of a thesaurus with information in dictionaries



Because information in broad-coverage knowledge bases, such as thesauri, is usually incomplete, merging information from different sources is one way to extend coverage. We propose a method for enriching a thesaurus with information acquired automatically from dictionaries. First, synonymy pairs are extracted. Then, these pairs are assigned to the most similar candidate synsets. Finally, the remaining pairs are clustered to identify new synsets. After selecting suitable experimental settings, this method was applied to enrich a Portuguese thesaurus with synonyms extracted from three dictionaries, which resulted in TRIP, a larger and broader thesaurus with new words and concepts. The steps towards the creation of this new thesaurus and its evaluation are described here.
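
The three steps in the abstract (pair extraction, assignment to candidate synsets, clustering of the remainder) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `enrich_thesaurus` and `cluster_pairs`, the Jaccard similarity, the `threshold` value, and the use of connected components as the clustering step are all assumptions made for illustration, and the pairs are taken as already extracted.

```python
from collections import defaultdict


def cluster_pairs(pairs):
    """Group words into connected components of the synonymy graph.

    Connected components stand in here for the clustering step; this is
    an assumed simplification, not the method used in the paper.
    """
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word not in component:
                component.add(word)
                stack.extend(graph[word] - component)
        seen |= component
        clusters.append(component)
    return clusters


def jaccard(a, b):
    """Jaccard overlap between two sets of words (assumed similarity)."""
    return len(a & b) / len(a | b)


def enrich_thesaurus(synsets, pairs, threshold=0.5):
    """Return (enriched synsets, new synsets) for a list of synonymy pairs.

    synsets: list of sets of words (the existing thesaurus).
    pairs: list of (word_a, word_b) pairs extracted from dictionaries.
    threshold: hypothetical minimum similarity for assigning a pair.
    """
    # Dictionary synonyms of each word, used as context for similarity.
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    enriched = [set(s) for s in synsets]
    leftovers = []
    for a, b in pairs:
        context = {a, b} | neighbours[a] | neighbours[b]
        # Candidate synsets already contain at least one word of the pair.
        best_index, best_sim = None, 0.0
        for i, synset in enumerate(synsets):
            if a in synset or b in synset:
                sim = jaccard(context, synset)
                if sim > best_sim:
                    best_index, best_sim = i, sim
        if best_index is not None and best_sim >= threshold:
            enriched[best_index] |= {a, b}
        else:
            leftovers.append((a, b))

    # Pairs that matched no synset are clustered into new synsets.
    return enriched, cluster_pairs(leftovers)
```

Similarity is computed against the original synsets rather than the partially enriched ones, so in this sketch the outcome does not depend on the order in which pairs are processed.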


thesaurus, enrichment, synonymy, words, ontologies


Natural Language Processing




Expert Systems, Vol. 30, #4, pp. 320-332, Jon G. Hall, May 2013


Cited by

Year 2017 : 1 citation

 Hetsevich, Y. and Reentovich, I. (2016). Linguistic analysis for the Belarusian corpus with the application of natural language processing and machine learning techniques. (Informatics), 56(4):64–69.