This book covers the following topics:
- evolution of web structure and content, presenting papers on
- a) the size of the web,
- b) methods for the mining of web communities,
- c) theory of random networks, and
- d) web dynamics, structure and page quality.
- a) navigating the World Wide Web,
- b) crawling the Web,
- c) the combination of link and content information in Web Search, and
- d) how search engines cope with changes in the Web (site updates, etc.)
Detailed description of relevant papers:
Methods for Mining Web Communities (part I) - Bibliometrics, Spectral, and Flow
This paper gives a good overview of methods for mining web communities:
- Bibliographic metrics (bibliographic coupling and co-citation coupling) and basic concepts such as bipartite cores are explained.
- Spectral methods such as HITS (Hyperlink-Induced Topic Search) and PageRank are presented, along with common interpretations of both.
- Finally, maximum-flow methods for identifying communities are explained.
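To make the first two families of methods concrete, the following is a minimal sketch, not the paper's own implementation. The toy link graph, the page names and the function names are assumptions chosen for illustration. Bibliographic coupling counts the pages two pages both link to, co-citation counts the pages that link to both of them, and HITS is shown as a plain power iteration with L2 normalization.

```python
# Hypothetical toy link graph: page -> set of pages it links to.
links = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"D"},
    "D": set(),
}

def bibliographic_coupling(links, p, q):
    """Number of pages that both p and q link to."""
    return len(links[p] & links[q])

def cocitation(links, p, q):
    """Number of pages that link to both p and q."""
    return sum(1 for out in links.values() if p in out and q in out)

def hits(links, iterations=50):
    """Plain HITS power iteration; returns (hub, authority) score dicts."""
    hub = {p: 1.0 for p in links}
    auth = {p: 1.0 for p in links}
    for _ in range(iterations):
        # Authority of p: sum of the hub scores of pages linking to p.
        auth = {p: sum(hub[q] for q in links if p in links[q]) for p in links}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub of p: sum of the authority scores of pages p links to.
        hub = {p: sum(auth[q] for q in links[p]) for p in links}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth

hub, auth = hits(links)
```

In this toy graph, A and B share both of their outgoing links, so they are strongly coupled, while C and D are co-cited by A and B. Maximum-flow community identification is not sketched here because it needs a flow solver and a choice of seed pages.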
Web Dynamics, Structure and Page Quality (part I)
This paper presents a model of Web page change built on the basic operations "creation", "update" and "deletion" of web pages. It then introduces methods for measuring and estimating the rate of change and investigates the relationship between the Web's structure (components: MAIN, IN, OUT, and other-class sites) and Web site age. Finally, the correlation between link-based ranking (measures: HITS and PageRank) and age is presented.
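As a rough illustration of what estimating a rate of change can look like, here is a naive sketch. It is an assumption for illustration and not the estimator used in the paper: a page is fetched at equally spaced visits, consecutive snapshots are compared by content hash, and the number of detected changes is divided by the observed time span. The function name and parameters are hypothetical.

```python
import hashlib

def estimate_change_rate(snapshots, interval_days):
    """Naive estimate: detected changes per day across equally spaced visits."""
    if len(snapshots) < 2:
        raise ValueError("need at least two snapshots")
    # Compare content digests rather than full page bodies.
    digests = [hashlib.sha256(s.encode("utf-8")).hexdigest() for s in snapshots]
    changes = sum(1 for a, b in zip(digests, digests[1:]) if a != b)
    observed_days = interval_days * (len(snapshots) - 1)
    return changes / observed_days

# Five weekly visits to one page; the content differs on two of the four revisits.
rate = estimate_change_rate(["v1", "v1", "v2", "v2", "v3"], interval_days=7)
```

Because several updates between two visits are detected as a single change, this count tends to underestimate the true rate for pages that change faster than they are visited.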