Comparison of the papers:

[FFF99] discusses power-law relationships of the Internet (i.e. rank, degree, number of pairs within h hops, and eigenvalues), while [BKMR00] focuses on the World Wide Web (in-degree, out-degree, and the sizes of components follow a power-law distribution, or a combination of a power law and a Poisson distribution for points close to the origin, e.g. nodes with small in-degree). The second paper tries to go one step further and presents the actual topology: one big strongly connected component (25% of the nodes) and the relationships between the other nodes and this component (a sketch of how such a decomposition can be computed appears at the end of these notes).

Discussion:

[both papers]
- Why a power-law distribution? What is so specific about it? Maybe it has something to do with hierarchical structure or self-similarity. The authors of [FFF99] try to answer a similar question (why any distribution at all? see section 5.1) on an intuitive level: adding a new node triggers a set of changes until a stabilized state is reached. We are a bit skeptical about this explanation, and further investigation in both cases would shed some light on the process.
- Could we take advantage of power-law distributions, e.g. design algorithms based on knowledge of the distribution? Some such algorithms already exist; a search algorithm that always moves to the neighbor of highest degree should be discussed in the next class (see the sketch at the end of these notes).

[FFF99]
- Critique:
  - How exactly are the eigenvalues related to the rest of the study, apart from being another parameter characterizing graph families? In what sense is knowledge of the eigenvalues helpful for applications? We don't know...
  - The definition of the neighborhood (equation (1)) is off by a factor of 1/2.

[BKMR00]
- Would other crawling techniques possibly give a different outcome? We decided that the outcome should be the same (or similar). However, the intersection of the outcomes of different crawling techniques is allegedly surprisingly small.
- What are the reasons for using BFS instead of DFS, when DFS gives stronger results faster? We don't know; the answer might be related to the use of AltaVista's crawl, or perhaps to some unknown memory constraints.
- Critique:
  - It is not clear how they deal with sub-pages, e.g. is cnn.com one web page or a huge number of web pages?

Possible future directions:
- Study the coefficients of the power-law distributions (and how they change over time); a sketch of how such exponents can be estimated appears at the end of these notes.
- See [both papers], discussion point #1 (read papers [1], [2] suggested by Matei in the evaluations section?).
- See [both papers], discussion point #2: other algorithms?
- Other Internet parameters to study and test for power-law (or other) distributions.
- Relationships between the studied parameters: does a power-law distribution of degrees imply power laws for other parameters?
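
Code sketches:

For the future direction about power-law coefficients, a minimal sketch of how a power-law exponent is typically estimated: sort the observed values, pair them with their ranks, and fit a straight line in log-log scale, as in the rank/degree plots of [FFF99]. The Python/NumPy code and the synthetic degree sequence below are our own illustration, not taken from either paper.

    import numpy as np

    def fit_power_law_exponent(values):
        # Sort the values in decreasing order, assign ranks 1..n, and fit a
        # straight line to the (log rank, log value) points; the slope is the
        # estimated power-law exponent.
        values = np.sort(np.asarray(values, dtype=float))[::-1]
        ranks = np.arange(1, len(values) + 1)
        slope, intercept = np.polyfit(np.log(ranks), np.log(values), 1)
        return slope

    # Synthetic example (not real Internet data): a Pareto-like degree sequence.
    rng = np.random.default_rng(0)
    degrees = np.floor(rng.pareto(1.5, size=10000)) + 1
    print(fit_power_law_exponent(degrees))

Repeating the fit on snapshots of the graph taken at different times would show whether the exponent stays roughly constant, which is what that future direction asks.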
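
For discussion point #2, a minimal sketch of the "always move to the neighbor of highest degree" search heuristic mentioned above. The function name, the use of the networkx library, and the simple no-backtracking policy are our own assumptions for illustration, not the algorithm to be presented in class.

    import networkx as nx

    def high_degree_search(G, start, target, max_steps=1000):
        # Greedy walk that always steps to the unvisited neighbor with the
        # highest degree; returns the path if the target is reached, else None.
        path, visited, node = [start], {start}, start
        for _ in range(max_steps):
            if node == target:
                return path
            candidates = [n for n in G.neighbors(node) if n not in visited]
            if not candidates:
                return None  # dead end; a real algorithm would backtrack
            node = max(candidates, key=G.degree)
            visited.add(node)
            path.append(node)
        return None

The presumed intuition is that in a power-law graph the few very high-degree nodes are reached quickly and touch a large fraction of the edges, so a greedy walk toward them tends to make fast progress.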
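
For the topology described in [BKMR00], a minimal sketch of how the giant strongly connected component and the sets of nodes leading into it and out of it can be computed for a directed graph. Again, the use of networkx and the function name are our own assumptions; the paper worked on a far larger crawl with specialized machinery.

    import networkx as nx

    def bow_tie_sizes(G):
        # Largest strongly connected component (the core), plus IN (nodes that
        # can reach the core) and OUT (nodes reachable from the core).
        core = max(nx.strongly_connected_components(G), key=len)
        rep = next(iter(core))               # any representative node of the core
        out_set = nx.descendants(G, rep) - core
        in_set = nx.ancestors(G, rep) - core
        return len(core), len(in_set), len(out_set)

According to the comparison above, the core found in [BKMR00] contains about 25% of the nodes; the remaining nodes are characterized by how they relate to this core.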