DSL Workshop 2006 -- DSLW 2006

 

Distributed Systems Laboratory

Department of Computer Science

University of Chicago

 

June 2nd, 2006 from 12PM - 5PM in RI480

 

The DSL Workshop is a venue for the DSL groups to present their research work.  The goal is to provide an opportunity for people in DSL at the University of Chicago, Department of Computer Science (DSL-UC) and DSL at Argonne National Laboratory, Mathematics and Computer Science Division (DSL-ANL) to learn what the two groups are working on.

The DSLW 2006 will be held in the Research Institute (RI), Room 480, 5640 South Ellis Ave., Chicago, IL, 60637.

The tentative schedule of events is:

Time

Presenter

Description

12:00PM - 12:45PM

Lunch

12:45PM - 1:00PM

Ian Foster

University of Chicago & Argonne National Laboratory

Opening Remarks

1:00PM - 1:30PM

Charlie Catlett

Argonne National Laboratory

Opportunities on the TeraGrid

1:30PM - 2:00PM

Matei Ripeanu

University of British Columbia

Distributed Snapshots with Virtual Machines

2:00PM - 2:30PM

Chuang Liu

University of Chicago

Computing Clusters in a Large Resource Pool

Abstract: Applications such as parallel computing, online games, and content distribution networks need to run on a set of resources with particular network connection characteristics to get good performance. We present an efficient algorithm for finding a set of Internet resources in which the network latency between every pair of resources is less (or more) than a given value. We evaluate this method in a large distributed Internet environment and show that it substantially improves query response time over current methods. We also show that the method scales to a large number of dynamic Internet resources.

2:30PM - 2:45PM

Coffee Break

2:45PM - 3:15PM

Ioan Raicu

University of Chicago

Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets

Abstract: The astronomy community has an abundance of imaging datasets at its disposal, which are essentially its “crown jewels.” However, these datasets are generally terabytes in size and contain hundreds of millions of objects separated into millions of files, factors that make many analyses impractical to perform on small computers. The key question we answer in this paper is: “How can we leverage Grid resources to make the analysis of large astronomy datasets a reality for the astronomy community?” Our answer is “AstroPortal,” a gateway to Grid resources tailored for the astronomy community. AstroPortal is a Web Services-based system that uses grid computing to federate large computing and storage resources for dynamic analysis of large datasets. Building on the Globus Toolkit 4, we have built an AstroPortal prototype and implemented a first analysis, “stacking,” that sums multiple regions of the sky, a function that can help both identify variable sources and detect faint objects. We have deployed AstroPortal on the TeraGrid distributed infrastructure and applied the stacking function to the Sloan Digital Sky Survey (SDSS), DR4, which comprises about 300 million objects dispersed over 1.3 million files, a total of 3 terabytes of compressed data, with promising results. AstroPortal gives the astronomy community a new tool to advance its research and opens doors to opportunities never before possible on such a large scale.

3:15PM - 3:45PM

Suman Nadella

Argonne National Laboratory

SPRUCE:  A System for Supporting Urgent Computation on HPC Grids

Abstract: High-performance modeling and simulation are playing a driving role in decision making and prediction. For time-critical emergency support applications such as severe weather prediction, flood modeling, and influenza modeling, late results can be useless. A specialized infrastructure is needed to provide computing resources quickly, automatically, and reliably. SPRUCE is a system to support urgent or event-driven computing on both traditional supercomputers and distributed Grids. Scientists are provided with transferable "right-of-way" tokens with varying urgency levels. During an emergency, a token has to be activated at the SPRUCE portal, and jobs can then request urgent access. Local policies dictate the response, which may include providing "next-to-run" status or immediately preempting other jobs. Additional components under development include a mechanism that periodically tests applications in "warm-standby" mode to ensure readiness, and an automated "advisor" that helps find the best resource to submit to, based on deadline, queue status, site policy, and warm-standby history.

3:45PM - 4:15PM

Borja Sotomayor-Basilio

University of Chicago

Resource Management for Virtual Clusters

Abstract: Resource providers and consumers on a Grid have conflicting requirements that are only partially reconciled with current Grid tooling. Virtual workspaces, VM-based execution environments that can be dynamically deployed on Grid resources, can help to eliminate these conflicts altogether. In this talk, we describe our current efforts to design and implement scheduling software capable of deploying aggregate workspaces (or "virtual clusters") in physical clusters on a Grid.

4:15PM - 4:45PM

Yong Zhao

University of Chicago

Virtual Data Language - A Typed Workflow Notation for Diverse Structured Scientific Data

Abstract: The description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with messy issues like heterogeneous storage formats and ad-hoc file system structures. We show how these difficulties can be overcome via a typed, compositional workflow notation within which issues of physical representation are cleanly separated from logical typing, and by the implementation of this notation within the context of a powerful runtime system that supports distributed execution. The resulting notation and system are capable both of expressing complex workflows in a simple, compact form, and of enacting those workflows in distributed environments. We apply our technique to cognitive neuroscience workflows that analyze functional MRI image data, and demonstrate significant reductions in code size relative to other approaches.

4:45PM - 5:00PM

Ian Foster

University of Chicago & Argonne National Laboratory

Closing Remarks

For more information, please contact Ioan Raicu at iraicu@cs.uchicago.edu.

 

Webmaster Ioan Raicu: iraicu@cs.uchicago.edu 
Last modified: June 05, 2006