Date: Tue, 16 Apr 2002 16:12:16 -0500 (CDT)
From: Catalin Lucian Dumitrescu <catalind@cs.uchicago.edu>


On Power-Law Relationships of the Internet Topology
---------------------------------------------------

1. State the main contribution (or focus?)

 The paper presents several observations about the Internet topology, and
 introduces the analysis of its properties by means of power law
 distributions of outdegree of nodes, frequency of an outdegree, and on
 the number of hopes between pairs of nodes. From these measurements
 several formulas for number of edges (N, rank exponent), or effective
 diameter (N, E) were derived. 


2. Critique the main contribution

  a. Rate signifiance
    - 3

  b. Rate the methodology 
    - 4


3. Three strongest points
  - the tree data sets collected over a year
  - derived formulas 


5. Three questions for the author
  - I am not sure if there cannot be better approximation


6. Detail an interesting future work 
  - 


------------------------------------------------------------------------------

Graph Structure in the Web
--------------------------

1. State the main contribution (or focus?)

 Analysis of the structure of the web by constructing its graph and
 classifying pages in different classes (core, in, out, tendrils). Three
 algorithms (BFS, WCC, SCC) are used to detect connected components, weak
 cc, strong cc and to measure different characteristics.


2. Critique the main contribution

  a. Rate signifiance
   - 2

  b. Rate the methodology 
   - 2


3. Three strongest points
  - 

5. Three questions for the author
  - 


#################################################
From: "Adriana Iamnitchi" <anda@cs.uchicago.edu>
To: "Adriana Iamnitchi" <anda@cs.uchicago.edu>
Subject: evaluations for Wed.
Date: Mon, 15 Apr 2002 18:02:31 -0500


[FFF99] Michalis Faloutsos, Petros Faloutsos, Christos Faloutsos, On 
Power-Law Relationships of the Internet Topology. (PDF) ACM SIGCOMM'99
1. State the main contribution of the paper
The paper evaluates the Internet graph as of 1998 and shows its 
power-law characteristics at different moments in time (Nov. 97, April 
98 and Dec 98) with respect to various graph parameters (node outdegree, 
outdegree rank and frequency)

2. Critique the main contribution. 
a. 4 (significant contribution). This paper's contribution is 
significant in many respects:20
    i.    It highlights the erroneous and (until then common) practice 
of modeling Internet topology based on hierarchies and uniform node 
degree
    ii.   It proposes a more accurate model for Internet topology
    iii.  It proposes some graph metrics that better incorporate graph 
density

b. The methodology is convincing: 3 snapshots of the network in a period 
of significance growth of the Internet are sufficient to show the 
characteristics of the Internet during that period of time. The authors 
are careful in stressing that while they have no reasons to believe 
these characteristics will be maintained in the future, they also stress 
the fact that currently (as of '99) there are no reasons to believe 
these characteristics will change.

c. What is the most important limitation of the approach?
Perhaps not very significant, but the '95 snapshot at the routers' level 
may not be very relevant: the Internet grew significantly after '95, 
with various structural changes (e.g., NSFNet). I have no reasons to 
believe the results from '95 are obsolete now (i.e., there is no 
power-law at the routers' level), but neither that they are still true. 
A more recent snapshot would have been perhaps more convincing.

3. I've heard of the power-law result from other papers that cited this
one, so for me the three most interesting ideas when I actually read the
paper were:
    (1). The alternative graph metrics that approximate better the 
Internet than the traditional metrics of average node degree.
    (2). The effective diameter formula that encapsulates information 
on, for example, the optimum TTL for flooding the network efficiently.
    (3). The authors' explanation for the power-law as capturing the 
equilibrium of the tradeoffs between the gain and cost of adding a new 
edge from financial and functional perspective.

4. What are the three most striking weaknesses in the paper?
    (1). I did not understand the motivation for excluding the small 
percentage (apparently < 2%) of nodes with higher outdegree and 
frequency 1. This trick contributed to the "perfection" of the power-law 
plots, but for what accuracy cost?
    (2). I suspect some of the results are interrelated and I would have 
liked to understand in what measure: e.g., outdegree vs. rank and 
frequency vs. outdegree seem to be strongly related (or aren't they?)
    (3). I did not understand what was the analysis of the Internet that 
"suits" the graph generator models, and from here the significance (or 
even the meaning) of the results that show that 40-50% of nodes are in 
trees -- what trees? This explanation would have added more information 
than Table 1 and/or 2...


7. Optional comments on the paper that you'd like to see discussed in 
class.
    * Do we have any reasons to believe their result as of 98 is 
outdated? (i.e., the Internet is not a power-law anymore, or the 
exponents changed significantly) Note that a result by Barabasi & all 
shows that dynamic networks  show the power-law characteristics in the 
presence of 2 simultaneous conditions: a) growth; b) preferential 
attachments
    * The authors claim an estimation of the number of edges within 
9-20% of the real number: isn't 20% too far?
    * Are the predictions in Table 4 or Table 5 correct?


[BKMR00] Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, 
Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, Janet Wiener. 2000 
Graph structure in the web Comput. Netw. 33 30920
  1.. State the main contribution of the paper: the paper looks at the 
web graph and studies a set of characteristics, like the node degree 
distribution (confirming it's a power-law) and reachability and distance 
between nodes. The paper sketches a map of the web that reveals 
interesting characteristics (e.g., a significant portion of the graph is 
disconnected)
  2.. Critique the main contribution. 
    1.. 4 (significant contribution): the paper contradicts an earlier 
result by Barabasi & others that showed the web is more connected than 
this paper reveals to be. Knowing the approximate map of the web is 
relevant to web crawlers, designing new algorithms, predicting the 
evolution of the web, sociology.20
    2.. Rate how convincing the methodology is: the methodology is 
convincing. However, although the graph studied is larger than in the 
previous studies, it would have been more interesting to see an even 
larger snapshot of the Internet, by merging, for example, Altavista data 
with some other web crawlers' (Google already existed in 2000...) I 
think my worry is: is it anything in the Altavista crawler algorithm 
that favors some parts of the network or subjects, etc., that would lead 
to a biased snapshot of the web?20
    3.. Most important limitation of the approach: See above.
  3.. What are the three strongest and/or most interesting ideas in the 
paper?20
    1.. An interesting result is due to studying the directed graph (the 
previous results treated the web as an undirected graph, I think?)
    2.. A significant fraction of the web is disconnected.
    3.. The map of the web looks more like an organism than like a 
convex polygon (as I was expecting....)
  4.. What are the three most striking weaknesses in the paper? The 
results are too straightforward (although somewhat surprising) so I 
can't seem to find striking weaknesses. I'm rather surprised that the 
original paper that showed the web is a small world based its results 
only on the local graph, within nd.edu, which is hardly a representative 
snapshot of the Internet -- due to small size and particularity of the 
domain (educational).


##########################################################

Date: Tue, 16 Apr 2002 20:49:34 -0500 (CDT)
From: Rahul Santhanam <rahul@cs.uchicago.edu>
To: Anda Iamnitchi <anda@cs.uchicago.edu>
Subject: Feedback for both papers!

Faloutsos-Faloutsos-Faloutsos:

1. The authors study the topology of the internet. They show that the
inter-domain graph is characterised by power laws expressing (1) the
relationship between the degree of a node and its rank (2) the
relationship between degree and the frequency with which this degree
occurs in the graph (3) The relationship between the number of pairs of
nodes within distance $d$ and $d$ (4) The relationship between eigenvalues
of the graph and the ranks of eigenvalues. The power laws are shown to
hold for measurements at three different times over the course of a year,
and the claim is made that power laws are therefore fundamental properties
of Internet topology.

2.a. 4 - significant contribution. The paper contains new and important
insights about the structure of the Internet.

  b. The methodology is convincing. The power-law plots fit the data very
closely. The fact that the power-law exponents change gradually over time
is further evidence that something fundamental is being measured in the
paper.

  c. One limitation of the approach is that it does not help in predicting
the structure of the Internet in the future. Perhaps more measurements at
different times will reveal structure in the evolution of power
exponents..

3. The most interesting idea in the paper is that the distribution of so
many different parameters of the Internet topology can be described very
accurately by power laws - thus the Internet displays an unexpected
degree of structure. Another interesting idea is that graph metrics
are highly skewed and hence should be described using the power law
exponents rather than averages.
 
4. Weaknesses - 1. It's not clear how the parameter $P(h)$ is an adequate
measure of the density of the graph, especially since it is estimated only
for very small values of $h$. 2. The authors do not provide a convincing
explanation of how the power-law behavior arises 3. Rather than supplying
data toward fine-tuning parameters for graph generators that presumably do
not generate power-law graphs, the authors could have used their insights
to design a new one.

6. One possible extension of the work is to analyze power-law graphs in a
theoretical setting and show correspondences between the various power
laws.


Broder-Kumar-Maghoul-Raghavan et al. :

1. The authors study the structure of the World Wide Web as a graph. They
confirm previous findings of power law relationships between various
parameters of the graph but suggest that "small world" phenomena do not
actually hold, in the sense that the diameter of the graph is large even
if the average diameter is small. They identify and analyse components of
various types and sizes in the graph (depending on whether the graph is
taken as directed or undirected).

2. a. 3 - modest contribution. The authors provide much data but little
insight about the structure of the Web as a graph.

b. The experiment is fairly well-designed but the authors do not properly
identify sources of bias.

c. The most important limitation of the approach is that the
experiments are done over a limited time period. The results are unlikely
to be instructive because of the dynamic nature of the web. (Also, all
data is provided by Altavista - it would be nice to have other data to
compare this to).

3. Strength - The experimental data is painstakingly analysed. The
persistence of the power law exponent (also, its scale-invariance) for
indegrees of vertices in the graph is rightly identified as a compelling
phenomenon.

4. Weaknesses - 1. The authors provide vast amounts of data,
especially w.r.t the components of the web graph, but do not interpret it
sufficiently 2. The numbers are given to a degree of accuracy that is
meaningless considering the scope of the experiment 3. Sometimes the
authors are not precise about their methodology - for eg., which crawl is
used for running the WCC and SCC algorithms?

6. Possible extension - To consider alternative sources of data; also to
analyse other parameters of the graph such as density.

Rahul.

################################################

Date: Tue, 16 Apr 2002 22:14:14 -0500 (CDT)
From: Xinghua Shi <shi@cs.uchicago.edu>
To: Adriana Iamnitchi <anda@cs.uchicago.edu>
Subject: Summary5

Paper1: On Power-law Relationships of the Internet Topology.

Contribution: The authors discovered some very simple power-laws of the
Internet topology and used the exponents of the power -laws to characterize
the Internet. This is really a novel way to study the Internet topology which
finds out some order and principle from chaos. They study the Internet
topology both at the router level and inter-domain level which covers the whole
network in general. This paper is somewhat well-founded  because ,it is based
on sound mathematical analysis. Besides the study in the theoretical
analysis, the paper also presents a number of practical applications of the
power-laws which is useful for further research.

Rate:
a. 4 in significance: This paper has significant contribution because it
provides a new way in the research of the Internet topology.
b. 4 in convincing of methodology: The argument is based on sound
mathematical definition and proof. The claims and conclusions follow from the
experiments mentioned. The experiments are well designed but the assumptions are
based on the data in the experiments, which may be not correct as the
Internet evolves.
c. The most important limitation of the approach is that there're too many
definitions and terminology in the proof.
The strongest and/or most interesting ideas in the paper:
a. The three power-laws that characterize the inter-domain Internet
topology
b. The power-laws hold for three Internet instances with high correlation
coefficients
c. The power-laws can be used to characterize graph topologies
The most striking weaknesses in the paper:
a. Too many concepts and terminology
b. Many proofs begin with the observation from intuition which may be
misleading.

Questions:
Are the power-laws really correct when the Internet becomes more instable
or expands too quickly?


Paper2: Graph structure in the web.

Contribution: The authors generated results and analysis based on their
three sets of experiments on web crawls. All these show a macroscopic
structure of the Internet. They studied both directed and undirected connected
components of the web in the second set of experiments. This is somewhat covers
the whole web.

Disadvantages:
Since the conclusions may be misleading if they're drawn from the study of
the local structure of the Internet. We can imagine that as the Internet
becomes much huge than at present, the current results the author present are
probably incorrect or at least biased.

Rate:
a. 3 in significance: This paper has moderate contribution because it
helps the understanding of the Web topology but is too straightforward and
trivial.
b. 3 in convincing of methodology: The argument is based on the results of
experiments. But the argument is merely based on experiental data and
results, which lack formal proof.
c. The most important limitation of the approach is that there're too many
definitions and terminology in the proof.
The strongest and/or most interesting ideas in the paper:
a. The connectivity of the web graph as an undirected graph is extremely
resilient and does not depend on the existence of nodes of high in-degree.
b. The nodes mentioned above with high PageRank or nodes that are
considered good hubs and authorities are embedded in a graph that is well connected
without them.
c. The structure of SCC,IN, OUT and the analysis on this
The most striking weaknesses in the paper:
a. There're some trivial parts such as the brief primer on graphs and
terminology.
b. The argument lacks sound proof or analysis based on mathematical model.
Questions:
How the current results will be as the development of the Internet? Are
their conclusions still hold or will become incorrect?How much the  scope of
the Internet have on the conclusions? Are there some thresholds or
limitations of the scope which confine the conclusions?
Are the IN, OUT sets reasonable or actually correct in the experiment they
preformed? What's the relationship between these two sets and the real
network world? Does the basic structure of SCC, IN, OUT remain stable over time?
Is there further and detailed research following its way by far?

####################################################

Date: Tue, 16 Apr 2002 23:01:01 -0500 (CDT)
From: Yu Hu <yuhu@cs.uchicago.edu>


On Powe_Law Relationships of the Internet Topology
[Michalis Faloutsos,Petros Faloutsos,Christos Faloutsos]

Main contributeion of the paper:
It showed to us a novel way to study such random internet topology, which
is power_laws. These laws will help us to learn more about the skewed
distributioncharaters of the internet "graph" and produce us some simple
parameters to analyze,compare and predict the internet topology. It opened
a new interesting field or method of the internet topology analyzing. 

Critique the main contribution:
Significance :  4( significant contribution)
Methodology:   Most conclusion, I think, can be deduced from these
experiments.However, some are not, such as: Based on different rank
exponents of the inter_domain graphs and intra_domain graph, author said
that rank exponent was a powerful metric for charaterizing famililies of
graphs. I suspect. 

Most interesting ideas:
In such chaos of internet,authors found such simple power_laws and using
few parameters can depict main properties of internet topology.

Weakness&Questions:
1.  I find in the paper authors sometimes drawed conclusion based on some
proofs or experiments' results, which I can't trust absolutely .
2.  In discussion about figure 7,8, authors got his conclusions. Does it
make sense to get conclusion from just 3 or 4 points ?
3.  Can author give some application of the eigenvalue?

Interesting Extension:
Try to find some basic charaters of internet( or other similar
system) that lead to the power_laws.

Comments:
After reading this paper,the first feeling of me is surprising,it is
really novel that so random internet has such simple power_laws. The
second feeling is encouraging, if all these are true, it will produce
powerful tools to put much deeper insight into the real topology of
internet.


Graph Structure in the web 
[Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar]

Main contributeion of the paper:
The paper gave us a mathematical graph view of internet topology. Authors
used graph theory methods to show us the graph structure in the web.

Critique the main contribution:
Significance :  2(incremental contribution)
Methodology:  I can't gotten much information about his methods. Maybe
authors should provide us much more details the datasets and analyzing
methods which will definitely make these results trustful. 

Most interesting ideas:
Using BFS on internet topology and showing us a pure graph struture. 

Weakness&Questions:
The paper didn't give me a very clear graphic structure of internet after
reading. Maybe authors should give more details on the analyzing whereas
not on someresults. Should the paper give some conclusions ?

Interesting Extension:
Can we just draw a graph of internet, after that, can we test on the graph
to get some prediction of the internet or its future?

Comments:
This is a mathematical computer science paper.The main idea of the paper
is pretty straighforward and right. Isupport the graphical method which
was used in this paper, however,the paper did't give me a very clear
conclusion of the "graph", maybe we can use the same method and
experiments to shine more light on topology of internet. So I just agree
this paper. 

###########################################################

From: "Matei Ripeanu" <matei@cs.uchicago.edu>
To: <anda@cs.uchicago.edu>
Subject: Paper evaluations - Wednesday April 17
Date: Tue, 16 Apr 2002 23:56:54 -0500


On Power Law Relationships on the Internet Topologies
Faloustos, Faloustos & Flaoustos

1. The Internet is not a completely random assembly: at both AS level and
router level the Internet is a power-law network.

Benefits from a better understanding of Internet topology: we can design (1)
protocols that take advantage of discovered topology and its derivate
properties, and (2) better simulators and topology generators.

2.a. Rate=3.5.  Provides credible evidence that the Internet is a power-law
network. This evidence had a significant impact on recent network topology
generators.

2.b. Rate=3.  Apart from gathering BGP data and plotting node degree
distributions (which is quite straightforward) I have some doubts.  See
point 4 below. Not sure the router level power-law still holds in 2002.

2.c. Like most "measurement" papers it would be nice if the paper tried to
give (and validate) and hypothesis that explains the data found.

3. Strongest ideas: There is some macroscopic order in the apparently random
Internet organization.  The Internet is (at both AS level and router level)
is a power law network. Also, the authors find that the eigenvalues of the
adjacency graph representing the topology are distributed according to a
power law too (they do not explore the significance of this fact further
though!).

4. Weaknesses:
	- it seems to me that distributions in Sect 4.1 and 4.2 imply one another!
(here is a paper that hints to this: "Zipf, Power-laws, and Pareto - a
ranking tutorial" by L. Adamic, http://www.hpl.hp.com/shl/papers/ranking/ )
	- a statistician would argue that, before doing regression analysis, they
should bin data in exponentially larger bins. (see same reference)
	- figures 7 and 8 have to few points to do regression analysis. Thus
conclusions in Sect 4.3 and on might look a bit far fetched.
	- they seem to call any distribution with and exponent a power-law: Zipf,
Pareto, proper power-laws, all go in here.

5. Questions to authors:
	- are the other network characteristics (neighborhood size distribution,
eigenvalues distribution) consequences of the power-law structure?
	- does these results hold today? if yes, what is the evolution in the power
constant for the power laws


6. Extensions:

Follow the evolution of the Internet at the AS level for a period of time
and try to come up with an evolutionary explanation for the observed
patterns. (Note: it is quite straightforward to collect data).

7. Comments to discuss in class:
	- one paper by Tangmunarunkit & all [1] that explains the emergent
power-law network structure as a result of the underlying AS size
distribution (as opposed to preferential attachment suggested by authors).
	- clean way to do data analysis?
	- I wish I have time to read [2] by the start of the class (I'm adding the
link here hopping that one of you might find time to do read it if I don't).

References:

[1] H. Tangmunarunkit, J. Doyle, R. Govindan, S. Jamin, S. Shenker, W.
Willinger, "Does AS Size Determine Degree in AS Topology?", To appear on
Computer Communication Review, October 2001.

[2] Q. Chen, H. Chang, R. Govindan, S. Jamin, S. Shenker, W. Willinger, The
the Origin of Power-Laws in Internet Topology Revisited, Proceedings of IEEE
Infocom, June 2002. (http://www.icsi.berkeley.edu/~ramesh/papers.html)