DSL Workshop - DSLW 2006

DSLW 2006 will be held in the Research Institute (RI), Room 480, 5640 South Ellis Ave., Chicago, IL, 60637.

Time	Presenter	Description
12:00PM - 12:45PM	Lunch
12:45PM - 1:00PM	Ian Foster University of Chicago & Argonne National Laboratory	Opening Remarks
1:00PM - 1:30PM	Charlie Catlett Argonne National Laboratory	Opportunities on the TeraGrid
1:30PM - 2:00PM	Matei Ripeanu University of British Columbia	Distributed Snapshots with Virtual Machines
2:00PM - 2:30PM	Chuang Liu University of Chicago	Computing Clusters in a Large Resource Pool Abstract: Applications such as parallel computing, online games, and content distribution networks need to run on a set of resources with particular network connection characteristics to get good performance. We present an efficient algorithm for finding a set of Internet resources in which the network latency between every pair of resources is less (or more) than a given value. We evaluate this method in a large distributed Internet environment and show that it markedly improves query response time over current methods. We also show that the method scales to a large number of dynamic Internet resources.
2:30PM - 2:45PM	Coffee Break
2:45PM - 3:15PM	Ioan Raicu University of Chicago	Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets Abstract: The astronomy community has an abundance of imaging datasets at its disposal, which are essentially its “crown jewels.” However, these datasets are generally terabytes in size and contain hundreds of millions of objects separated into millions of files, factors that make many analyses impractical to perform on small computers. The key question we answer in this paper is: “How can we leverage Grid resources to make the analysis of large astronomy datasets a reality for the astronomy community?” Our answer is “AstroPortal,” a gateway to grid resources tailored for the astronomy community. AstroPortal is a Web Services-based system that uses grid computing to federate large computing and storage resources for dynamic analysis of large datasets. Building on the Globus Toolkit 4, we have built an AstroPortal prototype and implemented a first analysis, “stacking,” that sums multiple regions of the sky, a function that can help both identify variable sources and detect faint objects. We have deployed AstroPortal on the TeraGrid distributed infrastructure and applied the stacking function to the Sloan Digital Sky Survey (SDSS), DR4, which comprises about 300 million objects dispersed over 1.3 million files, a total of 3 terabytes of compressed data, with promising results. AstroPortal gives the astronomy community a new tool to advance their research and opens doors to opportunities never before possible on such a large scale.
3:15PM - 3:45PM	Suman Nadella Argonne National Laboratory	SPRUCE: A System for Supporting Urgent Computation on HPC Grids Abstract: High-performance modeling and simulation are playing a driving role in decision making and prediction. For time-critical emergency support applications such as severe weather prediction, flood modeling, and influenza modeling, late results can be useless. A specialized infrastructure is needed to provide computing resources quickly, automatically, and reliably. SPRUCE is a system to support urgent or event-driven computing on both traditional supercomputers and distributed Grids. Scientists are provided with transferable "right-of-way" tokens with varying urgency levels. During an emergency, a token has to be activated at the SPRUCE portal, and jobs can then request urgent access. Local policies dictate the response, which may include providing "next-to-run" status or immediately preempting other jobs. Additional components under development include a mechanism for periodically testing applications in "warm-standby" mode to ensure readiness, and an automated "advisor" that helps find the best resource to submit to, based on deadline, queue status, site policy, and warm-standby history.
3:45PM - 4:15PM	Borja Sotomayor-Basilio University of Chicago	Resource Management for Virtual Clusters Abstract: Resource providers and consumers on a Grid have conflicting requirements that are only partially reconciled with current Grid tooling. Virtual workspaces, VM-based execution environments that can be dynamically deployed on Grid resources, can help to eliminate these conflicts altogether. In this talk, we describe our current efforts to design and implement scheduling software capable of deploying aggregate workspaces (or "virtual clusters") in physical clusters on a Grid.
4:15PM - 4:45PM	Yong Zhao University of Chicago	Virtual Data Language - A Typed Workflow Notation for Diverse Structured Scientific Data Abstract: The description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with messy issues like heterogeneous storage formats and ad-hoc file system structures. We show how these difficulties can be overcome via a typed, compositional workflow notation within which issues of physical representation are cleanly separated from logical typing, and by the implementation of this notation within the context of a powerful runtime system that supports distributed execution. The resulting notation and system are capable both of expressing complex workflows in a simple, compact form, and of enacting those workflows in distributed environments. We apply our technique to cognitive neuroscience workflows that analyze functional MRI image data, and demonstrate significant reductions in code size relative to other approaches.
4:45PM - 5:00PM	Ian Foster University of Chicago & Argonne National Laboratory	Closing Remarks

Webmaster Ioan Raicu: iraicu@cs.uchicago.edu

Last modified: June 05, 2006