November 2008
NCDM receives SC|08 Conference
Bandwidth Challenge Award.
AUSTIN, Texas, Nov. 20 -- SC08 -- The National Center for Data Mining (NCDM) at
UIC and the Open Cloud Consortium received the SC08 Bandwidth Challenge award
today in Austin.
Their entry was titled "Towards Global Scale Cloud Computing: Using Sector and
Sphere on the Open Cloud Testbed" and was led by Dr. Yunhong Gu of the
University of Illinois at Chicago and Dr. Robert Grossman of the University of
Illinois at Chicago and Open Data Group.
Although cloud computing is common today, clouds almost always process data
within a single datacenter because of the technical challenges of processing
data across multiple datacenters. Today the team demonstrated, for the first
time, technology that lets cloud computing use high performance networks and
span datacenters to create wide area clouds. The technology that makes this
possible is the open source Sector storage cloud and Sphere compute cloud
developed by the NCDM.
For this challenge, NCDM used the Open Cloud Testbed, which is managed by the
Open Cloud Consortium. The Open Cloud Consortium develops standards for
computing within clouds and frameworks for interoperating between clouds.
"A whole new generation of cloud computing is now possible using the open
source Sector storage cloud and the Sphere computing cloud and standards
developed by the Open Cloud Consortium. For the first time, developing
applications that span multiple distributed clouds is now possible," according
to Robert Grossman.
According to Joe Mambretti, director of the International Center of Advanced
Internet Research at Northwestern University and co-director of the Open Cloud
Testbed, "These innovative technologies provide unique capabilities that will
enable new generations of applications based on extremely large scale data
streams."
During the Bandwidth Challenge at SC08, the team demonstrated three
applications that used the Sector/Sphere cloud. The first application
transported bioinformatics data using Sector from the conference floor in
Austin to Kitakyushu in Japan at over 8 Gb/s.
The second application demonstrated was Creditstone, which is a benchmark for
financial services applications. The Sector/Sphere implementation of
Creditstone processed about 53.5 billion synthetic credit card transaction
records in less than 1 hour.
The third application was TeraSort, which sorted 1 terabyte of data within 30
minutes. The average data moving rate was about 4.8 Gb/s in the Open Cloud
Testbed, with a peak speed reaching 10 Gb/s.
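To put the Creditstone and TeraSort figures on a per-second footing, the short Python sketch below converts them into average rates. It assumes a decimal terabyte (10^12 bytes) and treats the stated times ("less than 1 hour", "within 30 minutes") as upper bounds, so the results are lower bounds on the rates actually achieved. The function and variable names are illustrative only and are not part of Sector or Sphere.

```python
def gigabits_per_second(num_bytes: float, seconds: float) -> float:
    """Average rate in Gb/s for handling num_bytes in the given time."""
    return num_bytes * 8 / seconds / 1e9


def records_per_second(num_records: float, seconds: float) -> float:
    """Average number of records processed per second."""
    return num_records / seconds


TERABYTE = 1e12  # assumption: decimal terabyte, 10^12 bytes

# TeraSort: 1 terabyte sorted within 30 minutes.
terasort_rate = gigabits_per_second(TERABYTE, 30 * 60)

# Creditstone: about 53.5 billion records in less than 1 hour.
creditstone_rate = records_per_second(53.5e9, 60 * 60)

print(f"TeraSort: at least {terasort_rate:.2f} Gb/s of data sorted")
print(f"Creditstone: at least {creditstone_rate / 1e6:.1f} million records/s")
```

These are derived sort and processing rates, not the network data moving rates reported above.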
One of the key achievements of the Sector and Sphere software is that it is
very easy to use. For example, the TeraSort code only requires about 50 lines
of C++ code. This is critical, as it allows researchers to use their time to
focus on research problems, rather than spending time dealing with distributed
programming.
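For readers unfamiliar with how a large sort can be split across machines, the following Python sketch shows one common structure: partition keys into ordered ranges, sort each range independently, then concatenate. This is only an illustration under stated assumptions. It runs in a single process, it is not the team's C++ TeraSort code, and it does not use the Sector/Sphere API; every name in it is hypothetical.

```python
import random


def partition_by_range(keys, num_buckets, key_space):
    """Assign each key to a bucket covering a contiguous range of key values."""
    buckets = [[] for _ in range(num_buckets)]
    width = max(1, key_space // num_buckets)
    for key in keys:
        # Clamp so keys in the last partial range still land in the last bucket.
        index = min(key // width, num_buckets - 1)
        buckets[index].append(key)
    return buckets


def range_partitioned_sort(keys, num_buckets=4, key_space=1 << 16):
    """Sort by partitioning into ordered ranges, then sorting each range.

    In a distributed setting each bucket could live on a different node and
    be sorted there; here the buckets are sorted one after another.
    """
    buckets = partition_by_range(keys, num_buckets, key_space)
    sorted_buckets = [sorted(bucket) for bucket in buckets]
    # Buckets are already ordered by range, so concatenation is fully sorted.
    return [key for bucket in sorted_buckets for key in bucket]


rng = random.Random(0)  # fixed seed so the example is repeatable
data = [rng.randrange(1 << 16) for _ in range(1000)]
result = range_partitioned_sort(data)
print(result == sorted(data))
```

Because no bucket depends on another after partitioning, the per-bucket sorts can proceed in parallel, which is what makes this structure attractive for large data sets.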
According to Yunhong Gu, "Sphere is a new software system that supports
simplified distributed data processing application development. In contrast to
traditional distributed computing methods such as MPI, Sphere allows users to
write distributed applications with a few lines of code and without knowing the
details of the underlying hardware."
Source: HPC Wire
The Laboratory of Advanced Computing (LAC) at the University of Illinois at
Chicago (UIC) was established in 1998 to serve as a resource for research,
standards development, and education for high performance and distributed data
mining and predictive modeling.
The NCDM is supported, in part, by the National Science Foundation, the
Chicago Bioinformatics Consortium, the Department of Defense, and the
University of Illinois at Chicago, as well as by other funding agencies
and NCDM's industrial partners.
The Center's recent projects:
- Teraflow Testbed - distributed infrastructure designed to use new 10 Gb/s network protocols and data services.
- Sector - infrastructure software providing distributed data storage, access and analysis/processing functionality.
- Angle - network monitoring software to detect anomalous network events across multiple monitoring sites.
- SidGrid - social informatics data collection and collaborative analysis software utilizing web and grid services.
The Center focuses on three research areas:
- Scaling algorithms, applications and systems to massive data sets.
- Developing algorithms, applications, and systems for mining distributed data.
- Establishing standard languages, protocols, and services for data mining and predictive modeling.
The LAC is a co-founding member of the Data Mining Group (DMG), which develops
the Predictive Model Markup Language (PMML) and related standards.