Redundancy-based information extraction

Redundancy-based information extraction exploits the fact that much of the information on the Web is redundant. One consequence is that

  • it is sufficient to focus on simple sentences (because data "hidden" in complicated constructs is expected to reappear in simpler form)
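The idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy corpus, the single "X is the capital of Y" pattern, and the helper names extract_facts and most_supported. Counting how often each candidate fact occurs is one plausible way to use redundancy, not a method prescribed by the text above.

```python
import re
from collections import Counter

# Hypothetical pattern: matches only short sentences of the form
# "X is the capital of Y." and ignores anything more complicated.
SIMPLE_PATTERN = re.compile(r"^([A-Z][a-z]+) is the capital of ([A-Z][a-z]+)\.$")


def extract_facts(sentences):
    """Count (city, country) pairs found in sentences matching the simple pattern."""
    counts = Counter()
    for sentence in sentences:
        match = SIMPLE_PATTERN.match(sentence.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def most_supported(counts):
    """For each country, keep the candidate city seen most often (assumed voting step)."""
    best = {}
    for (city, country), n in counts.items():
        if country not in best or n > best[country][1]:
            best[country] = (city, n)
    return {country: city for country, (city, _) in best.items()}


# Toy corpus, invented for illustration only.
SENTENCES = [
    "Paris is the capital of France.",
    # Complicated construct: skipped by the simple pattern.
    "Paris, which many tourists visit every year, has long been the capital of France.",
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    # Incorrect statement: outvoted by the repeated correct one.
    "Lyon is the capital of France.",
]

facts = extract_facts(SENTENCES)
capitals = most_supported(facts)
print(capitals)
```

The complicated second sentence is never parsed, yet the fact it contains is still recovered because the same information appears elsewhere in simple form.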

