Remarks on Ontology Learning and Evaluation
This post contains some random remarks on ontology learning and evaluation:
- terms versus concepts: concepts are formed by grouping terms with the same meaning
- collocation = co-occurrence of a sequence of words (significant phrase detection)
- kinds of similarity: (i) paradigmatic similarity - terms are substitutable for each other in a given context (e.g. "carrot" and "onion" as different kinds of vegetables); (ii) syntagmatic similarity - the terms are associated with each other (relatedTo; e.g. "car" and "drive"), which can be detected by statistical means (e.g. co-occurrence)
- limited paradigmatic modifiability for detecting collocations (Wermter & Hahn 2005)
  - words in a collocation candidate are easy to substitute with other words (= high paradigmatic modifiability) => the candidate is unlikely to be a true collocation
  - limited modifiability => high probability that the candidate is a true collocation
- task-based evaluation
- corpus-based evaluation - compare ontology concepts with corpus concepts
- criteria-based evaluation - based on criteria suggested by the domain experts (e.g. how many terms were aggregated to form a concept, interconnectivity, ...)
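
The limited-modifiability remark above can be illustrated with a small sketch. This is a simplified toy version of the idea, not the measure defined by Wermter & Hahn: for each word slot of a candidate, it checks how often the remaining context occurs with exactly the candidate's word rather than with any substitute. The function name `modifiability_score`, the averaging over slots, and the toy bigram list are all assumptions made for illustration.

```python
from collections import Counter

# Toy "corpus" of bigrams (hypothetical data, for illustration only).
bigrams = [
    ("hot", "dog"), ("hot", "dog"), ("hot", "dog"),
    ("red", "car"), ("blue", "car"), ("red", "house"), ("red", "door"),
]
counts = Counter(bigrams)


def modifiability_score(candidate, counts):
    """Return a value in [0, 1]; higher means the candidate is harder to modify.

    For every slot, compute the share of n-grams matching the candidate in all
    other slots that also carry the candidate's own word in this slot, then
    average the shares over all slots. This is a simplification, not the
    published measure.
    """
    n = len(candidate)
    shares = []
    for slot in range(n):
        # Frequency of all n-grams that agree with the candidate outside `slot`.
        pattern_freq = sum(
            freq
            for gram, freq in counts.items()
            if len(gram) == n
            and all(gram[i] == candidate[i] for i in range(n) if i != slot)
        )
        if pattern_freq == 0:
            return 0.0  # candidate never observed in the corpus
        shares.append(counts[candidate] / pattern_freq)
    return sum(shares) / n


fixed = modifiability_score(("hot", "dog"), counts)
loose = modifiability_score(("red", "car"), counts)
print(fixed, loose)
```

In this toy data "hot dog" never appears with a substituted word, so it gets the maximum score of 1.0, while "red car" scores lower because both of its slots are filled by other words elsewhere ("blue car", "red house", "red door"). A real system would work on corpus-scale counts and would likely combine such a score with frequency information.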