San Diego, CA, March 5, 2009 - Exploring the big scientific "what ifs" requires equal measures of computational power and collaborative passion. And in a world where scientific research knows no geographic boundaries, universities in California and Australia are proving that distance is no longer a deterrent to the development and deployment of cyberinfrastructure and software tools, no matter how many thousands of miles separate the researchers.
The University of California, San Diego and Australia's Monash University have a history of working together in the field of grid computing, at least since the founding of the Pacific Rim Applications and Grid Middleware Assembly (PRAGMA) in 2002. At the center of that collaboration: David Abramson, a computer scientist at Monash University and director of Monash's eScience and Grid Engineering Lab.
Abramson recently visited San Diego to collaborate with colleagues at the California Institute for Telecommunications and Information Technology (Calit2) and to give the first of the San Diego Supercomputer Center's new Cyberinfrastructure Seminars. He put the collaboration into perspective, especially the universities' sharing of each other's software tools, Kepler and Nimrod, and of OptIPortal tiled display walls. "Nimrod and Kepler are tools that help researchers do scientific modeling experiments, some of them on a really large scale," Abramson said, adding that OptIPortals play a similar role in enabling collaboration itself on data-intensive science.
Kepler is an open-source scientific workflow system for distributed computing. It was developed by a collaborative effort based on the Ptolemy software that emerged from UC Berkeley. Nimrod was developed by Abramson's team in Australia. The OptIPortal display walls were a product of the National Science Foundation-funded OptIPuter project, led by Larry Smarr, director of Calit2.
"Kepler, Nimrod and OptIPortals are the latest cyberinfrastructure building blocks to be added to the OptIPlanet Collaboratory, enabling long-distance, interactive collaborative research on complex scientific datasets," says Smarr. "They are beginning to be used routinely in tandem by scientists in both countries, allowing them to pursue joint research in green information technology, bio-engineering, neuroscience, microbial ecology and numerous other fields."
The OptIPuter, a partnership of UC San Diego and the University of Illinois at Chicago, is a tightly integrated cluster of computational, storage and visualization resources, linked over dedicated optical networks that can be regional, national or international in scale.
In his talk, Abramson explained why cyberinfrastructure tools are needed. "[SDSC] astrophysicist Mike Norman, for example, does astrophysical simulations at the SDSC to try to work out what makes the universe tick," he said. "Obviously, he can't experiment by doing the real thing, so he has to rely on computational science, and he has to collaborate with all the different people who can contribute different pieces of the jigsaw puzzle.
"There's lots of computational science like that going on — people who want to do large scientific experimentation, but have to put together different bits of information from research the world over," he continued. "They need to stitch up all of those bits into workflows, and that's what Kepler does. For us to work with Kepler was a no-brainer, because there's a whole stack of things it does, and if I don't have to do those things myself, our research funding goes further."
Conversely, scientists at UC San Diego now routinely use Abramson's Nimrod, a tool for integrating heterogeneous, distributed resources to perform single computations. The Australian researcher calls Nimrod "the spreadsheet of the computational world."
"It's a family of tools that helps you do 'what if' analysis with computational models, so you can do really heavy computational experiments, like 'What if the universe evolved under these parameters?' or 'What would happen if I chopped down all of these trees?' or 'What if we switched all cars over from gasoline to ethanol?'"
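The "what if" analysis Abramson describes amounts to a parameter sweep: run the same model once for every combination of input values, then compare the outcomes. The following is a minimal sketch of that idea in Python, not Nimrod's actual interface. The model, the parameter names and the values are all hypothetical stand-ins for a real and far more expensive simulation, and a real tool would distribute the runs across many machines rather than loop over them on one.

```python
from itertools import product


def toy_model(growth_rate, initial_population, years=10):
    """Hypothetical stand-in for an expensive simulation: compound growth."""
    return initial_population * (1 + growth_rate) ** years


def sweep(model, parameter_space):
    """Run the model once for every combination of parameter values."""
    names = list(parameter_space)
    results = []
    for values in product(*(parameter_space[name] for name in names)):
        scenario = dict(zip(names, values))  # one "what if" scenario
        results.append((scenario, model(**scenario)))
    return results


# Hypothetical parameter ranges: 3 growth rates x 2 starting sizes = 6 runs.
parameter_space = {
    "growth_rate": [0.00, 0.05, 0.10],
    "initial_population": [100, 1000],
}

results = sweep(toy_model, parameter_space)

# Pick the scenario with the largest outcome.
best_scenario, best_value = max(results, key=lambda item: item[1])
```

Because each scenario is independent of the others, the runs can be farmed out to whatever computers are available, which is what makes this style of experiment a natural fit for distributed resources.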
Once scientists have gathered and processed their experimental data using Kepler and Nimrod, they need to be able to look at it, something that, in the past, proved to be a sticking point for laboratories that had only tiny computer monitors at their disposal. But the OptIPuter's visualization resources, known as OptIPortals, provide precisely the type of high-resolution display that scientists need to extrapolate from complex and, in many cases, microscopic data.
"Traditionally, we've tried to do all of this scientific visualization by displaying our data on a tiny laptop screen and then trying to reason about it," Abramson pointed out. "What's really exciting about the OptIPortal is that you've got all this screen real estate, 50 or 100 million pixels to look at."
Abramson noted that Nimrod and Kepler have been 15 years in the works, with ongoing plans to retrofit Kepler so that it is more effective at doing computational analysis on distributed machines and organized clusters. He and his team are also developing software that will allow Kepler workflows to stream directly onto OptIPortals thousands of miles away.
"At the moment there's a hole: You do your experiment, and you get these files and then you copy them to the OptIPortal so you can look at them. But imagine if we could compute a complex environmental problem as a time-evolving video. Now we could watch as the computation proceeds, and we could even go back to steer the computation and change its parameters."
One Calit2-based project that Abramson called "a natural marriage" — because it makes use of Kepler, Nimrod and OptIPortals all together — is the Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (CAMERA). Funded by the Gordon and Betty Moore Foundation, CAMERA is a state-of-the-art computational resource for deciphering the genetic code of communities of microbial life in the world's oceans — work that has implications for everything from drug discoveries to evolutionary science.
Ilkay Altintas, SDSC's deputy coordinator for Research, Cyberinfrastructure Research, Education and Development (CI-RED) and division director of Scientific Workflow Automation Technologies (SWAT) Laboratory, said that OptIPortals (and, by association, Nimrod and Kepler) will contribute to the success of the CAMERA project.
"CAMERA has countless output visualizations coming out of pipelines," she remarked. "We need to be able to dive deep into those high-resolution images. If you look deeper into an image you can see more, and these displays let us blow up high-resolution images, which is not possible in a smaller-resolution laptop interface.
"Another beneficial aspect of the OptIPortal," she continued, "is that it's a collaborative way to discuss something. Nimrod and Kepler integrate what's available to the scientist and provide a unified interface. Kepler integrates these scientific infrastructures. Before Kepler, if you were to try to unite two infrastructures from two different institutions, you would have a big problem. It provides the interface for all the middleware involved and provides a workflow interface for all the community-approved bioinformatics tools that CAMERA makes use of. Kepler is a unique workflow system in that sense - it's out there to create an open community of researchers, and that collaboration brings out aspects of different tools in a very helpful way."
Looking down the line, Abramson said he foresees major growth in the number of applications using the Nimrod/Kepler/OptIPuter platform. Monash University, for example, is working with Australian researchers to run long-range climate simulations to predict how global warming might affect agriculture in the region. The Kepler-Nimrod combination allows scientists to "look at a whole stack of scenarios," Abramson noted, from catastrophic models to those that have less dire implications.
"What you want to know is under which scenarios are various crops going to grow, and how you might change farming practices," he elaborated. "This requires quite complex ensembles of experiments to do the climate science, to do the pasture modeling. That's where the workflow technology comes in. And when it comes time to look at the results, I want to bring a farmer and a politician into a room and point to the display wall and show them the results. So, there's an educational role to all of this, too, bringing policy makers in and really showing them what will happen."
by Tiffany Fox, (858) 246-0353, firstname.lastname@example.org