Date: Tue, 30 Apr 2002 22:26:02 -0500 (CDT)
From: Rahul Santhanam

Ratnasamy-Francis-Handley-Karp-Shenker:

1. The paper introduces the notion of a content-addressable network (CAN), a scalable, fault-tolerant, low-latency distributed scheme for indexing keys to values.

2. a. 4 - significant contribution. The paper introduces and analyzes an important new idea.
b. The simulations are well-designed and the conclusions are valid. The authors could perhaps have run the simulations over a wider range of design choices, to determine which combination is likely to be optimal in practice.
c. The most important limitation of the approach is that latency grows as a power of the number of nodes rather than as its logarithm.

3. (1) The paper is well-written. (2) The authors consider a variety of design choices. (3) The experimental results are impressive.

4. (1) The authors do not include any analytical justification for their claims that latency is low and that the partitioning of the coordinate space is typically more or less uniform. (2) The discussion of landmark orderings is naive. (3) The authors do not analyze how caching and replication perform in combination with their design. (4) There is no discussion of the process by which keys enter and leave the system.

6. An interesting extension would be to design a CAN whose overlay graph is expander-like, in that each node has a constant-size neighborhood and latency grows logarithmically with the number of nodes. Of course, it is important that the CAN routing and maintenance mechanisms not become too complex.

Ratnasamy-Handley-Karp-Shenker:

1. The paper describes a simple, scalable scheme to infer network proximity information and applies this scheme to the problems of topologically-aware overlay construction and server selection.

2. a. 3 - modest contribution. The idea is interesting but seems to have limited applicability.
b. The authors perform measurements on many different systems, all of which support their claims for the efficacy of their binning scheme.

3. (1) The simplicity of the binning scheme is compelling. Also, the measurements used for determining a node's bin are non-intrusive. (2) The results showing that binning performs almost as well as nearest-neighbor clustering are impressive.

4. (1) As the authors themselves remark, the system does not scale when one considers the load on the landmark nodes. Perhaps there is a feasible scheme that is even more distributed? (2) The authors do not sufficiently address the question of where the landmarks are positioned. This might have a significant impact on the scheme. (3) The authors do not discuss how the number of bins affects performance.

6. Theoretically analyze the binning scheme used.

Rahul.

Date: Tue, 30 Apr 2002 23:11:53 -0500 (CDT)
From: Xinghua Shi

Paper 1: Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications

Contribution: The paper presents a distributed lookup protocol called Chord that addresses the problem of efficiently locating the node that stores a particular data item. Its striking contribution is that Chord is a scalable lookup protocol for dynamic peer-to-peer systems with frequent node arrivals and departures.

Rate: a. 4 in significance: This paper makes a significant contribution because Chord is a genuinely nice protocol. It is valuable for peer-to-peer, large-scale distributed applications.
b. 5 in convincingness of methodology: The argument is well organized, and the ideas and applications are clearly explained. The pseudocode and proofs of theorems, along with the simulation and analysis of results, offer a sound basis for the arguments.

The strongest and/or most interesting ideas in the paper: a. Chord improves the scalability of consistent hashing by avoiding the requirement that every node know about every other node. b. The finger scheme.
c. With high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N). d. The Chord simulator can be run in either an iterative or a recursive style, which is plausible and feasible in engineering terms.

The most striking weaknesses in the paper: a. The protocol is robust most of the time, but it is vulnerable to malicious attack by hackers and others.

Questions: a. How does the consistent hashing in this paper compare with that in the previous paper? b. How might Chord behave under malicious attack? How should errors in finger tables be handled?

Paper 2: A Measurement Study of Peer-to-Peer File Sharing Systems

Contribution: The paper offers a detailed measurement study of two popular peer-to-peer file sharing systems, Gnutella and Napster. It seeks to precisely characterize the population of end-user hosts that participate in these two systems. The study and its results provide a close look at the two systems and thus a practical understanding of peer-to-peer systems in general.

Rate: a. 4 in significance: This paper makes a significant contribution because it gives a detailed study of two typical real-world peer-to-peer systems. Some interesting observations emerge from this thorough study.
b. 4 in convincingness of methodology: The authors explain the issues clearly and organize the paper well. The results come from reasonable datasets and analyses, though they could go further with more representative datasets.

The strongest and/or most interesting ideas in the paper: a. The fact that, despite claims that every peer is both a server and a client, Gnutella has an inherently large percentage of free-riders.

The most striking weaknesses in the paper: a. The authors claim to have captured enough data points to believe that their results and conclusions are representative of the entire Napster population, but they do not explain in detail why we should believe this.
Questions: a. How would the results change with different datasets; that is, how much effect does the choice of dataset have on the results? b. Have the drawbacks of Gnutella and Napster been addressed in similar P2P systems since then?

From: Ivona Bezakova
To: Anda Iamnitchi

A Scalable Content-Addressable Network [RFHKS01]

1. State the main contribution of the paper:
A new concept for a distributed (decentralized) file-sharing system is proposed: nodes correspond to subsections of a d-dimensional space (the mapping is random) and keep track of their neighbors (neighboring subsections). Keys are hashed to the appropriate nodes, and a Manhattan-like metric is used for routing. Subsections can be merged or divided to accommodate a new or failed node. Many improvements are suggested, e.g. minimizing latency by assigning geographically close nodes to nearby subsections; a replication strategy (using several hash functions); reducing failure errors by using realities (multiple coordinate spaces); etc.

2. Critique the main contribution.
a. Rate the significance: 3-4. Nice concept, although I don't think it is a new idea. If it is, then 4. (Please don't compare with my previous ratings; I've become harsher since the start of the quarter.)
b. Rate how convincing: 3-4. See (4).

3. What are the three strongest and/or most interesting ideas in the paper?
- Simple idea, easy to implement.
- The design improvements (and their sheer number).

4. What are the three most striking weaknesses in the paper?
- Better CAN routing metrics: is delivery of the message always guaranteed?
- The comparison of design improvements: the authors always fix the other parameters (or don't assume the other improvements) and vary just one, finding its best value. For example, increasing realities reduces path lengths, but what happens if we also use, e.g., overloading of coordinate zones at the same time?

5. Name three questions that you would like to ask the authors?
- Overloading coordinate zones: what happens if a neighbor I am pointing to fails? (How do I find its peer?)

6. Detail an interesting extension to the work not mentioned in the future work section.
- What are the theoretical guarantees (e.g. deviation)?
- How should design improvements be compared?

7. Optional comments on the paper that you'd like to see discussed in class.
- Are the numbers used for the simulations reasonable / do they correspond to real life?
- Comparison to Chord: which would you prefer?

Topologically-Aware Overlay Construction and Server Selection [RHKS02]

1. State the main contribution of the paper:
This is a follow-up paper to [RFHKS01]. The authors detail possible uses of the binning method for finding geographically close neighbors. They argue that this is a local, distributed, and scalable approach that could be used both for overlay network construction and for server selection.

2. Critique the main contribution.
a. Rate the significance: 2-3. The binning idea was presented in [RFHKS01]. We are provided with more experiments supporting the idea, and the server selection application is a new contribution.
b. Rate how convincing: 3-4. I did not find the paper very interesting after reading the first one.

3. What are the three strongest and/or most interesting ideas in the paper?
- Application of the binning idea to the server selection problem.

4. What are the three most striking weaknesses in the paper?
- See (5).

5. Name three questions that you would like to ask the authors?
- I am able to compute my bin number by myself, but how do I know who else is in my bin?

Date: Wed, 01 May 2002 00:51:35 -0500
From: Oleg Pashko

Topologically-Aware Overlay Construction and Server Selection.

1. A binning scheme is proposed such that close nodes are partitioned into local bins (nodes within one bin are assumed to be close to one another in terms of topological latency). The scheme is applied to improve strategies for overlay network construction and server selection.

2a. 3.
An interesting approach to an important problem - mapping the existing network onto the underlying IP-level topology.
2b. The experiments seem fair, but not as numerous and extensive as I would expect.
3. To me the most interesting thing in the paper was the posed problem itself - an attempt to reflect the topology of nodes in distributed systems.
4-7. It is not clear whether the gathered topological information will be very reliable in real, dynamic peer-to-peer applications. It would be interesting to quantitatively compare this approach with other attempts to incorporate knowledge of the underlying topology into the network construction/server selection scheme. The described binning is not the only possible approach to node clustering. Perhaps the problem can be solved in a more general setting - by defining some metric/distance between the nodes (which does not have to be physical distance) and applying standard classification algorithms (a nearest-neighbor classifier, for example).

A Scalable Content-Addressable Network.

1. The traditional hash table scheme (now called a Content-Addressable Network) is extended to large-scale distributed systems in the hope of improving the scalability (in particular, the indexing) of P2P systems.
2a-7. 2.5. They consider an important problem of P2P scalability and apply an efficient tool - hash tables. Yet the idea does not seem very new. Also, allocating nodes to zones at random does not make much sense, for it does not improve the global overlay network structure. A topologically-sensitive construction of the CAN overlay network may be a more interesting direction. The security of the CAN system against denial-of-service attacks is an open problem, too. Testing the approach on real file-sharing applications would provide more insight into the effectiveness of CAN systems.
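The CAN mechanics that several of the reviews above describe - keys hashed to points in a d-dimensional space, nodes owning zones of that space, and messages forwarded greedily toward the key's point under a Manhattan-like metric on the torus - can be sketched roughly as follows. This is a toy in-memory model, not the paper's implementation: it uses an even grid of zones instead of CAN's incremental zone splits, and all names (key_to_point, owner, route, GRID) are illustrative.

```python
import hashlib

D = 2     # dimensions of the coordinate space
GRID = 4  # toy assumption: an even GRID x GRID partition instead of CAN's splits

def key_to_point(key):
    """Hash a key to a point in the unit d-torus (here d = 2)."""
    h = hashlib.sha1(key.encode()).digest()
    return tuple(int.from_bytes(h[4*i:4*i+4], 'big') / 2**32 for i in range(D))

def owner(point):
    """The grid cell (node) whose zone contains the point."""
    return tuple(min(int(c * GRID), GRID - 1) for c in point)

def neighbors(cell):
    """Zones adjacent along one axis, wrapping around the torus."""
    out = []
    for axis in range(D):
        for step in (-1, 1):
            n = list(cell)
            n[axis] = (n[axis] + step) % GRID
            out.append(tuple(n))
    return out

def torus_dist(cell, point):
    """Manhattan-like distance from a zone's center to the target, on the torus."""
    total = 0.0
    for i in range(D):
        center = (cell[i] + 0.5) / GRID
        d = abs(center - point[i])
        total += min(d, 1 - d)
    return total

def route(start_cell, key):
    """Greedy forwarding: hand off to the neighbor closest to the key's point."""
    target = key_to_point(key)
    path, cell = [start_cell], start_cell
    while cell != owner(target):
        cell = min(neighbors(cell), key=lambda n: torus_dist(n, target))
        path.append(cell)
    return path
```

On this uniform grid each greedy hop strictly shrinks the remaining torus distance, so delivery always terminates - which sidesteps, rather than answers, Ivona's question of whether delivery is guaranteed once zones are unevenly split and improved metrics are in play.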
Can the binning and hash table approaches be combined into a more efficient network whose topology is more congruent with the IP structure?
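The landmark binning scheme the reviewers discuss can be sketched as follows: each node measures its round-trip time to a small fixed set of landmark machines, and its bin is the ordering of the landmarks by latency, optionally refined by quantizing each latency into a coarse level. This is a minimal sketch under stated assumptions - the threshold values and the function name bin_of are illustrative, not the paper's parameters.

```python
def bin_of(rtts, levels=(20, 100)):
    """Compute a node's bin from its RTTs (ms) to the landmarks.

    The bin is (landmark indices sorted by RTT, quantized level per landmark);
    the thresholds in `levels` are illustrative. Nodes that land in the same
    bin are assumed to be topologically close to one another.
    """
    order = tuple(sorted(range(len(rtts)), key=lambda i: rtts[i]))
    level = tuple(sum(r > t for t in levels) for r in rtts)
    return order, level
```

For example, two nodes measuring [12, 45, 230] ms and [15, 60, 280] ms against three landmarks fall into the same bin, so an overlay-construction or server-selection scheme would treat them as nearby. This also makes Ivona's question concrete: computing your own bin is purely local, but discovering who else shares it still requires some rendezvous mechanism, and Rahul's scaling concern lands on the landmarks, which every node must probe.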