Digital Intuition: Applying Common Sense Using Dimensionality Reduction
Havasi, C., Pustejovsky, J., Speer, R., & Lieberman, H. (2009). Digital Intuition: Applying Common Sense Using Dimensionality Reduction. IEEE Intelligent Systems, 24(4), 24–35. doi:10.1109/MIS.2009.72
This article discusses
- the Open Mind Common Sense (OMCS) project, an initiative for collecting common sense data, and
- the use of singular value decomposition (SVD) for
  - predicting the value of unspecified entries, and
  - combining these data with other knowledge sources through a process called blending.
The Open Mind Common Sense Project
The OMCS project collects common sense knowledge such as "cars have wheels" or "humans breathe air" from free text and from templates that users fill in. ConceptNet consists of concepts and a set of 20 standardized relations (e.g., IsA, UsedFor, HasA, CapableOf) between these concepts. In addition, it contains information on the relations' frequency and confidence scores.
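The article does not show ConceptNet's storage format; the following is a minimal sketch, assuming a simple tuple-like representation, of how such scored assertions could be held in Python. The field layout and the example frequency/score values are illustrative assumptions; only the relation names come from the article.

```python
from dataclasses import dataclass

# Sketch of a scored ConceptNet-style assertion (field layout is assumed).
@dataclass
class Assertion:
    concept1: str
    relation: str
    concept2: str
    frequency: int   # how often the assertion was entered or confirmed
    score: float     # confidence score

assertions = [
    Assertion("car", "HasA", "wheel", frequency=4, score=1.5),
    Assertion("human", "CapableOf", "breathe air", frequency=2, score=1.0),
    Assertion("wheel", "PartOf", "car", frequency=3, score=1.2),
]
```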
Combining Common Sense Knowledge
The techniques presented in this paper operate on matrices (concept × feature) that pair concepts with their corresponding features. Therefore, relations of the form (concept1, relation, concept2) must be transformed into features for this matrix representation. For instance, the relation (wheel, PartOf, car) yields the feature "is part of car" for the concept "wheel", as sketched below.
The authors use singular value decomposition (SVD), a technique also used in methods such as LSA, to compute a low-rank approximation of the original matrix that retains its principal components. Projecting the data onto the dimensions yielded by SVD helps suggest related concepts and features. It is important to note that, for blending to work, a certain overlap between the data sources is required.
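To make the transformation concrete, here is a minimal sketch in plain NumPy (not the authors' toolkit) that turns relation triples into a concept × feature matrix. The triples and the feature-naming scheme are illustrative assumptions.

```python
import numpy as np

# Illustrative (concept1, relation, concept2) triples.
triples = [
    ("wheel", "PartOf", "car"),
    ("car", "HasA", "wheel"),
    ("human", "CapableOf", "breathe air"),
]

# Each triple yields a (concept, feature) pair for both sides of the
# relation, e.g. ("wheel", "PartOf car") ~ "wheel / is part of car".
pairs = []
for c1, rel, c2 in triples:
    pairs.append((c1, f"{rel} {c2}"))   # right-hand feature for concept1
    pairs.append((c2, f"{c1} {rel}"))   # left-hand feature for concept2

concepts = sorted({c for c, _ in pairs})
features = sorted({f for _, f in pairs})
row = {c: i for i, c in enumerate(concepts)}
col = {f: j for j, f in enumerate(features)}

A = np.zeros((len(concepts), len(features)))
for c, f in pairs:
    A[row[c], col[f]] += 1.0  # a frequency/confidence score could be used instead
```

Blending would then take a weighted linear combination of this matrix with a second knowledge source whose row or column labels partly overlap, before running SVD on the combined matrix; the shared labels are what lets information flow between the sources.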
Background: Singular Value Decomposition
Singular value decomposition factors a matrix A into two orthonormal matrices U and V that are connected by a diagonal matrix Σ:
A = UΣV^T
The singular values in Σ correspond to the importance of their dimensions. Ordering these values from largest to smallest and keeping only the first (most important) k components therefore yields a smaller matrix A_k that is the closest rank-k approximation to A in the least-squares sense.
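As a worked example of this truncation, the following sketch reconstructs a rank-2 approximation with NumPy and reads off a predicted value for an unspecified entry. The toy matrix and its row/column labels are assumptions, not data from the paper.

```python
import numpy as np

# Toy concept x feature matrix; rows: car, truck, sailboat;
# columns: "has wheel", "used for transport", "is a vehicle".
# The (car, "is a vehicle") entry is unspecified (zero).
A = np.array([
    [1.0, 1.0, 0.0],   # car
    [1.0, 1.0, 1.0],   # truck
    [0.0, 1.0, 1.0],   # sailboat
])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # keep the k largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The reconstructed entry is a small positive value (about 0.1),
# suggesting the previously unspecified assertion "car is a vehicle".
print(A_k[0, 2])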