Optical Race
... Researchers collaborating over large distances are more interested
than ever in the possibility of real-time decision making, which
would allow geographically remote groups to view and work simultaneously
with the same large datasets or large-scale, high-resolution visualizations.
Research communities such as environmental engineers and oceanographers
-- the latter heavily involved in Navy-supported research -- have
specific needs for technology that will make responding to environmental
hazards and monitoring water supplies and other natural resources
more efficient.
The OptIPuter is a computing
paradigm in which dynamically controllable optical networks become
the system bus that connects cluster computers as if they were giant
peripherals in a planetary-scale computer. (The IP in OptIPuter
refers to the fact that it uses Internet Protocol as the standard
for data transmission.) Supported by the National Science Foundation's
Information Technology Research (ITR) program, the OptIPuter aims
to deliver the middleware and end-user software that will allow
geoscientists and bioscientists to work with enormous datasets
in real time over thousands of miles of fiber-optic cable that are
part of an emerging Lambda Grid that connects sites like TRECC,
NCSA, UCSD, and EVL at UIC.
EVL is now partnering
with TRECC to deploy new visualization and other user interface
technologies at the TRECC facility in West Chicago. It's a collaboration
that's been going on for three years now and that began with the
installation of the Continuum at TRECC in 2001. "The Continuum
is really the prototype for OptIPuter collaboration environments,"
says Jason Leigh of EVL, who currently leads the project to make
TRECC an OptIPuter node. "In other words, they should be extremely
display-rich environments, with the ability to wallpaper a high-definition
video stream and high-resolution visualization content, and to be
able to work collaboratively with this data over distance."
The result of this experiment
was the Scalable Adaptive Graphics Environment (SAGE) -- the software
that will drive the "next generation" of the Continuum.
Imagine an entire room covered in thin displays (which Leigh predicts
will someday be cheap enough to be used as wallpaper) and driven
by an extremely high-speed network. "You're going to stop treating
information on the wall like you would on your regular desktop computer,"
says Leigh. "The traditional notion of
using a keyboard and a mouse doesn't quite work very well, because
the cursor is so small that it will disappear into the wall."
Instead, Leigh suggests, people will walk up to the wall and interact
with it as if it were simply an office wall -- but one as useful
and important as a computer desktop. "Think about how you organize
your office -- some people will put up posters, some people will tape
up bits of paper with notes on them, people will have little corkboards
where they stick bits of notes and posters and images, potentially.
This is exactly the same thing, except that we're going to make
it digital and hence provide greater access to dynamic information.
People already take advantage of wall space for putting up information.
We're just making it digital so that it's even more flexible."
Leigh further envisions that as users move from one room to another,
all the information in that room will be able to move with them,
seamlessly.
The OptIPuter Gets Real
Last week, the UCSD division
of the California Institute for Telecommunications and Information
Technology (Calit2) and the J. Craig Venter Institute announced
that they would collaborate to decipher the genetic code of the
world's marine microbiological communities. This project, the Community
Cyberinfrastructure for Advanced Marine Microbial Ecology Research
and Analysis (CAMERA), will use the OptIPuter model developed at
Calit2 as the architecture for its computational resources.
Named for its use of
Optical networking, Internet Protocol, computer storage, processing
and visualization technologies, the OptIPuter is an infrastructure
that links computational resources over optical networks using the
IP communication mechanism. The OptIPuter's central architectural
element is optical networks, not computers. The goal of this architecture
is to enable researchers who are generating large volumes of data
to interactively visualize, analyze, and correlate their data from
distributed sites.
"What is exciting
about this is that it's taking both frontier science and combining
it with frontier cyberinfrastructure," said Larry Smarr. One of the
luminaries in the field, Smarr is well known for his contributions
to the information technology community, from his early involvement
in the original Mosaic web browser at NCSA to his current work as
the founding director of Calit2. David Kingsbury, the science program
officer at the Moore Foundation, was well aware of Smarr's work.
Besides basic scientific
discovery, there are several potential applications for metagenomic
research. According to Smarr, there are a number of companies that
are already looking at marine microorganisms for new drugs, the
way they have with soil-based microorganisms. There are also exciting
biofuel applications that are being considered, for example the
production of hydrogen and ethanol as fuel sources from microbial
metabolism.
Smarr also projects how
the technology can be applied directly to other microbial ecosystems.
For example, the microorganisms inside the large intestine were
recently shotgun sequenced by Stanford researchers. Soil microorganisms,
the source of many drugs, such as penicillin, are another likely
target for metagenomics. Even airborne
dust particles can be biologically active and are currently being
studied in relation to the mold problem caused by the aftermath
of Hurricane Katrina.
The OptIPuter model is
based on the ability of optical networks to move data around at
speeds of tens of gigabits per second over dedicated lambdas. Significantly,
the increases in optical network bandwidth and storage capacity
are outstripping the increases in CPU performance. As a result,
"Moore's Law" is not driving information technology the
way it used to (ironic when you consider that Gordon Moore, the
originator of "Moore's Law," is now funding this project
through his Foundation).
The OptIPuter exploits
the enormous bandwidth of fiber optic networks to link distributed
computer and storage resources. With the recent expansion of National
LambdaRail as the optical backbone for cross-country connectivity,
Smarr believes we're entering a critical stage for technological
change.
"This is a one-in-twenty-year
transition point," said Smarr, "going back to 1985, when
the NSF built the first backbone for the shared Internet. Now National
LambdaRail has built the first backbone for the unshared Internet.
At present, there are about two dozen state and regional optical
networks that are interconnecting to National LambdaRail. The campuses
are beginning to put fiber optics into their actual laboratories,
and connecting these to the state and regional optical networks
which are then connected to National LambdaRail."
"So that was the
fundamental insight that led us to work on these optical networks.
It wasn't that optical networks were cool and we were looking for
something to do with them. It was that the scientific community
had decided on Linux clusters as their standard and their natural
need for a wide area network was clearly in the gigabits and tens
of gigabits per second range. So we looked around for a technology
that could provide this and found that the telecom industry had
evolved to the point where the natural data flow on their individual
lambdas was 10 gigabits per second."
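To put the 10 gigabits per second figure in perspective, here is a rough
back-of-the-envelope sketch in Python. It is only an illustration, not part
of the OptIPuter software: the 1-terabyte dataset size, the 100 megabits per
second comparison link, and the function name are assumptions chosen for the
example, and the calculation ignores protocol overhead and disk speed.

```python
def transfer_time_seconds(size_bytes, link_bits_per_second, efficiency=1.0):
    """Idealized time to move size_bytes over a link of the given speed.

    efficiency is the usable fraction of the link (1.0 means no overhead).
    """
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


# Illustrative values only (assumptions, not figures from the project).
DATASET_BYTES = 10**12            # a hypothetical 1-terabyte dataset
DEDICATED_LAMBDA = 10 * 10**9     # 10 gigabits per second, as quoted above
SHARED_LINK = 100 * 10**6         # hypothetical 100 megabits per second link

for label, rate in [("dedicated 10 Gb/s lambda", DEDICATED_LAMBDA),
                    ("shared 100 Mb/s link", SHARED_LINK)]:
    seconds = transfer_time_seconds(DATASET_BYTES, rate)
    print(f"{label}: {seconds:,.0f} s ({seconds / 3600:.2f} hours)")
```

Under these assumptions the terabyte moves in roughly 13 minutes over the
dedicated lambda, versus about 22 hours over the slower shared link, which is
the kind of gap that makes interactive work on remote data plausible.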
The Science Server
As part of the CAMERA project, Calit2
will partner with UCSD's SDSC to develop the science data server
complex, which couples the Calit2 and SDSC middleware, compute,
and storage capabilities with the TeraGrid computing facility in
a Service Oriented Architecture. This will enable computing resources
to be applied to a range of tools to tackle the computationally
intense questions derived from the metagenomic data collection.
"This is the first science data
server that has been architected to direct-connect to your local
cluster through the National LambdaRail. What we've done with this
server is make it the first TeraGrid appliance. In other words,
we're linking directly into the TeraGrid lambdas from our science
server. So as a user, when you connect to the science server, it
now appears to be just an extension of your local cluster. Over
the next few years the TeraGrid will expand to tens of thousands
of processors, so you'll get orders of magnitude increases in power
by plugging into the TeraGrid. It should all appear as if it's in
your laboratory. And that's the vision!"