Date: Tue, 23 Apr 2002 22:36:50 -0500 (CDT)
From: Yu Hu

DNS Performance and the Effectiveness of Caching
[Jaeyeon Jung, Emil Sit, Hari Balakrishnan, and Robert Morris]

Main contribution of the paper:
This paper presents a detailed analysis of traces of DNS and associated TCP traffic collected on Internet links. DNS is an important component that strongly affects user applications and needs more analysis than it has received. The paper gives a clear picture of DNS and a detailed client-side analysis. The main contribution, I think, is leading us to a deeper and more detailed understanding of DNS and its caching.

Critique the main contribution:
Significance: 4 (significant contribution)

Methodology:
The method used in this paper is novel compared with earlier work. The authors collected not only the DNS traffic but also the corresponding TCP traffic, which makes the results more convincing and useful. The datasets were collected at two places, MIT LCS and KAIST, which have different connection structures. So I support the method used by the authors and trust the results of these experiments.

The most important limitation of the approach:
I would like to know whether the datasets the authors collected are representative of real DNS traffic.

Most interesting ideas:
DNS performance does not depend mainly on caching, and lowering A-record TTLs would not hurt DNS performance much. DNS depends more on partitioning the name space and avoiding overloading any single name server on the Internet.

Weaknesses & Questions:
1. Can the authors give a more detailed and better-supported relationship between DNS packets and TCP packets?
2. Are the datasets collected by the authors representative of real DNS traffic?
3. I can find many power laws in these distributions; is there some fundamental rule underlying them?

Interesting Extension:
Can we improve DNS performance by using better mapping algorithms or better grouping rules for shared caches?

Comments:
This is a good paper. After reading it, I am clearer about DNS and its caching. The authors' argument is reasonable and their treatment is comprehensive. The paper gives the reader a detailed and deep insight into DNS.

Web Caching with Consistent Hashing
[David Karger, Alex Sherman, et al.]

Main contribution of the paper:
The paper introduces a new idea for implementing web caching, namely consistent hashing. By describing their implemented system, the authors argue that the idea is practical for the real web.

Critique the main contribution:
Significance: 3 (modest contribution)

Methodology:
The paper does not present much methodology; it sets up a system and a test, reports some results, and draws conclusions from them. I would have preferred to learn more details about the experiments, even though the experimental method is fairly simple.

Most interesting ideas:
In this paper the authors use DNS in a different way, which the experiments show to be reasonable.

Weaknesses & Questions:
I believe consistent hashing is helpful for web caching. However, the authors do not give a very clear view of the new idea or a detailed view of their implemented system. Another question: did the authors consider cost when they designed their system?

Interesting Extension:
Can we set up a mechanism to decide which pages are hot and which are not? Does such a mechanism exist?

Comments:
This paper presents a good idea, consistent hashing. Even though the authors do not analyze it in much detail, we can benefit from it a lot. It may lead us to design another system that improves web caching performance using the intuition and ideas given in this paper. Anyway, I support this paper. (A small sketch of the consistent hashing idea follows below.)
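To make the consistent hashing idea discussed in the review above (and in the reviews that follow) concrete, here is a minimal sketch of the scheme: cache names and URLs are both hashed onto a circle, and each URL is served by the first cache encountered clockwise from its hash point, so adding or removing a cache only remaps the URLs nearest to its points. The class, cache names, and replica count below are illustrative assumptions, not the authors' implementation.

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    """Minimal consistent hashing ring with virtual replicas per cache."""

    def __init__(self, caches, replicas=50):
        # Each cache gets several points on the ring so load spreads evenly.
        self.replicas = replicas
        self.ring = []            # sorted list of (point, cache)
        for cache in caches:
            self.add(cache)

    def _hash(self, key):
        # Map an arbitrary string to a point on the circle [0, 2**32).
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

    def add(self, cache):
        for i in range(self.replicas):
            self.ring.append((self._hash("%s#%d" % (cache, i)), cache))
        self.ring.sort()

    def remove(self, cache):
        self.ring = [(p, c) for (p, c) in self.ring if c != cache]

    def lookup(self, url):
        # The first ring point clockwise from the URL's hash owns the URL.
        point = self._hash(url)
        idx = bisect_right(self.ring, (point, ""))
        if idx == len(self.ring):
            idx = 0               # wrap around the circle
        return self.ring[idx][1]

# Adding or removing a cache only remaps the URLs adjacent to its points,
# which is the property that makes the scheme attractive for web caching.
ring = ConsistentHashRing(["cacheA", "cacheB", "cacheC"])
print(ring.lookup("http://www.example.com/index.html"))
```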
Date: Tue, 23 Apr 2002 22:56:46 -0500 (CDT)
From: Ivona Bezakova

DNS Performance and the Effectiveness of Caching
[Jung, Sit, Balakrishnan, Morris]

1. State the main contribution of the paper:
As the title suggests, the paper discusses DNS performance (i.e. number of requests sent, percentage of errors and lookups receiving no response, number of "hops" - recursive or iterative calls to other DNS servers - needed, etc.) and the usefulness of caching in this process. The study is based on three sets of data obtained at two academic locations in 2000 and 2001. The percentage of errors is surprisingly high - above 20%; the authors give some reasons for this behavior, including mistakes in mappings and DNS loops. The unanswered queries contribute overwhelmingly to the total traffic because the requests are retransmitted several times. According to the datasets, the number of retries could be significantly reduced while the success rate of finding the address would stay the same. The cache study suggests that caching is helpful (only) for the most popular sites, and for these a small TTL (on the order of minutes) suffices.

2. Critique the main contribution.
a. Rate the significance: 4. I am surprised that very few experiments have been performed in this area. Every new experiment shedding some light on the problem is significant.
b. Rate how convincing: 4. I like the self-critique the authors use, presenting not only results but also pointing out possible problems or misinterpretations.
c. What is the most important limitation of the approach? The study is based on only two sites, both of them academic. The datasets for these sites cannot really be compared since the servers/routers use different strategies.

3. What are the three strongest and/or most interesting ideas in the paper?
- A detailed study of the subject; the authors are aware of the strengths as well as the weaknesses of their approach.
- Practical suggestions for DNS or cache behavior - e.g. not caching "off-stream" sites, correcting DNS servers that don't implement negative caching, etc.
- Results/conclusions for varying TTL values (a small simulation sketch of this follows the review).

4. What are the three most striking weaknesses in the paper?
- The authors realize that their first MIT dataset was not as precise as the other one because they limited the captured packet size. Why didn't they recollect the data when they found out that memory was not a constraint?
- For the simulation algorithm, why did they decide to use client groups of a fixed size s? Wouldn't it correspond better to reality if the sizes were drawn according to some (power-law?) distribution?
- How do they deal with websites that reload themselves automatically? Such reloads are not user queries, and these pages are likely to be in the cache. Don't they affect the statistics too much (being likely quite popular sites, e.g. cnn.com)?

5. Name three questions that you would like to ask the authors.
- See 4.

6. Detail an interesting extension to the work not mentioned in the future work section.
- Wouldn't the results for commercial domains be very different? In academia it is expected that users browse approximately the same sites (within .edu plus a few others), whereas for clients of AOL I would expect the interests to be much more diverse.

7. Optional comments on the paper that you'd like to see discussed in class.
See 4. and 6.
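The TTL questions raised in the review above can be explored with a small trace-driven simulation in the spirit of the paper's methodology. This is only a sketch under simplifying assumptions (one shared cache, a tiny synthetic trace, a single fixed TTL per run); the function and trace below are illustrative, not the authors' code.

```python
def dns_hit_rate(trace, ttl):
    """Trace-driven hit rate for one shared DNS cache.

    trace: iterable of (timestamp_seconds, domain_name) lookups in time order.
    ttl:   seconds a cached answer stays valid.
    """
    expires = {}          # name -> time at which the cached record expires
    hits = total = 0
    for t, name in trace:
        total += 1
        if expires.get(name, 0.0) > t:
            hits += 1                 # answer still cached
        else:
            expires[name] = t + ttl   # miss: fetch and cache the record
    return hits / total if total else 0.0

# Synthetic example: only the frequently looked-up name benefits from caching,
# echoing the finding that hit rates are driven by a few very popular names.
trace = [(0, "a.com"), (5, "a.com"), (20, "b.com"), (400, "a.com"), (410, "a.com")]
for ttl in (10, 60, 300, 3600):
    print(ttl, round(dns_hit_rate(trace, ttl), 2))
```

The same loop, run once per client group instead of once over the whole trace, is one simple way to ask the group-size question raised in item 4 above.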
Web Caching with Consistent Hashing
[Karger, Sherman, Berkheimer, Bogstad, Dhanidina, Iwamoto, Kim, Matkins, Yerushalmi]

1. State the main contribution of the paper:
An experimental study of the theoretical model of consistent hashing (Karger et al., STOC '97) used as a web-caching strategy. Some comparisons are made to other caching methods, e.g. Common Mode, and consistent hashing seems to be better with respect to both latency and miss rate.

2. Critique the main contribution.
a. Rate the significance: 2-3 - a successful (? - read below) practical verification of a theoretical result.
b. Rate how convincing: 2-3 - it is not clear how the numbers for the test (e.g. 1000 names, three proxy servers) were picked and whether these numbers are statistically significant.
c. What is the most important limitation of the approach?

3. What are the three strongest and/or most interesting ideas in the paper?
- Verifying a theoretical result in practice.
- Use of DNS to bypass the browser-configuration problem.
- Stating the original STOC '97 results in rigorous form as well as giving an intuitive explanation.

4. What are the three most striking weaknesses in the paper?
- The paper mentions CARP, but no comparison to this protocol is made.
- What is c in their experiment? (See the theorem.) As I understand it, there are three (geographically distributed) proxy caches, and for each there are a few (c) sub-caches.
- I think that the amount of data analyzed is not sufficient for any reliable conclusions. (Although I don't have any statistical background and therefore good support for this opinion.)

5. Name three questions that you would like to ask the authors.

6. Detail an interesting extension to the work not mentioned in the future work section.
- Compare to CARP or, even more interestingly, try to theoretically characterize the features that make them different and see which one is better.

7. Optional comments on the paper that you'd like to see discussed in class.
- The authors of this paper assume finite caches; papers discussed earlier assume infinite caches. How do these views fit together? Which one is better in which context?

Date: Tue, 23 Apr 2002 23:11:52 -0500 (CDT)
From: Rahul Santhanam

Jung-Sit-Balakrishnan-Morris:

1. The paper analyzes DNS traffic on the Internet and assesses the impact of caching on this traffic. The authors measure various parameters of actual DNS traffic, such as latency, failures, and number of retransmissions. They then use trace-driven simulations to assess the impact of cache sharing and choice of TTL on caching efficiency. Some of the conclusions reached in the paper: (1) Caching NS-records substantially reduces DNS lookup latency. (2) Because of the Zipf-like distribution of domain name popularity, cache sharing is highly beneficial (a toy illustration follows this review). (3) Lower TTLs than customary will work for A-records but not for NS-records.

2. a. 4 - significant contribution.
b. The methodology is convincing. The authors go into great detail in their description of the experiments and of the sources of bias. The experiments are well designed.

3. The separate analysis of A-records and NS-records is interesting, and is justified by the fact that different conclusions are drawn from them.

4. The section on negative caching is weak - the experiment is unenlightening, and the authors provide no alternative explanation for the large number of NXDOMAIN responses.
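Conclusion (2) above, that a Zipf-like popularity distribution makes cache sharing attractive, can be illustrated with a toy synthetic workload. The parameters below (100 clients, 2000 names, alpha = 1) are arbitrary assumptions rather than the paper's traces, and TTLs are ignored so that only the effect of sharing is visible.

```python
import random
from itertools import accumulate

def zipf_shared_vs_private(clients=100, lookups_per_client=200,
                           names=2000, alpha=1.0, seed=1):
    """Hit rate of one cache shared by all clients vs. per-client caches,
    with lookups drawn from a Zipf-like popularity distribution."""
    rng = random.Random(seed)
    # Popularity of name i is proportional to 1 / i**alpha (Zipf-like).
    cum_weights = list(accumulate(1.0 / (i ** alpha) for i in range(1, names + 1)))
    shared = set()
    private = [set() for _ in range(clients)]
    shared_hits = private_hits = total = 0
    for c in range(clients):
        for _ in range(lookups_per_client):
            name = rng.choices(range(names), cum_weights=cum_weights, k=1)[0]
            total += 1
            shared_hits += name in shared     # hit in the shared cache?
            shared.add(name)
            private_hits += name in private[c]  # hit in this client's own cache?
            private[c].add(name)
    return shared_hits / total, private_hits / total

shared_rate, private_rate = zipf_shared_vs_private()
print("shared cache hit rate: %.2f  per-client hit rate: %.2f"
      % (shared_rate, private_rate))
```

Because a few names account for most lookups, the shared cache sees a hit rate well above that of the isolated per-client caches, which is the intuition behind the paper's cache-sharing result.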
Karger-Sherman-Berkheimer et al.:

1. The paper describes the implementation of a new web caching strategy based on consistent hashing, as well as experiments to support the claim that it improves performance. The consistent hashing approach is taken from a previous theoretical paper; the authors' main contribution is an implementation method that uses the Domain Name System in a significant way.

2. a. 3 - modest contribution. There are no new ideas in the paper.
b. The experiments would be more convincing if more caches and clients were used, approximating a real-life situation. Also, only the cache miss rate is considered; perhaps latency should also be a factor (especially since DNS is used).

3. For once, an interesting theoretical idea is also shown to be useful in practice!

4. (1) The paper is not well written. (2) The theoretical model does not take into account the fact that web pages are requested more or less according to Zipf's law. (3) In the early part of the paper, the authors do not mention that they are using DNS in the implementation; moreover, they are defensive about this strategy throughout the paper. The authors state that DNS resolution did not affect the performance of their system, but give no evidence.

6. An interesting extension would be to actually modify a browser to support consistent hashing, so that we can get a better idea of the advantages of this technique.

Rahul.

--------------------------------------------------------------------------------
Rahul Santhanam,                          Phone: (773) 324-2583
1369, E. Hyde Park Blvd.,                 E-mail: rahul@cs.uchicago.edu
Apt. 905, Chicago, Illinois - 60615.
--------------------------------------------------------------------------------