Thursday, September 29, 2011

#CLOUD: "World's Largest Academic Cloud"

Not to be left behind in the dust of industry marching forward, academia recently unveiled its largest cloud computing platform, aimed at providing the petabytes of storage required for giant scientific simulations and big-data analytics.

The San Diego Supercomputer Center (SDSC) Cloud is connected to 10 other supercomputer sites nationwide on the high-speed TeraGrid network.

The world's largest academic cloud, the SDSC Cloud, serves University of California at San Diego researchers and their associates, including 10 supercomputer centers nationwide connected by the high-speed TeraGrid.
Hosted at the San Diego Supercomputer Center, the SDSC Cloud offers academic researchers the resources to run gigantic simulations and analytics that they could not afford to support on commercial cloud providers.

"The SDSC Cloud may well revolutionize how data is preserved and shared," said Michael Norman, director of SDSC. "Every data object has a unique URL [uniform resource locator] and can be accessed over the Web."
The Web-based storage array is capable of sustained transfer rates of 8 to 10 gigabytes per second over 768 Ethernet connections, each running at 10 gigabits per second. Storage capacity today is 5.5 petabytes and is expected to grow to hundreds of petabytes, scaling linearly with each added resource.
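To put those figures in perspective, the short calculation below estimates how long it would take to stream the full 5.5 petabytes at the quoted sustained rates, and what the nominal combined line rate of the 768 connections works out to. It is a back-of-the-envelope sketch that assumes decimal (SI) units and a perfectly steady transfer; the function and constant names are illustrative only.

```python
PETABYTE = 10**15  # bytes, assuming decimal (SI) units
GIGABYTE = 10**9   # bytes
SECONDS_PER_DAY = 86_400


def transfer_days(capacity_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to move capacity_bytes at a steady rate."""
    return capacity_bytes / rate_bytes_per_s / SECONDS_PER_DAY


capacity = 5.5 * PETABYTE
slow = transfer_days(capacity, 8 * GIGABYTE)   # at 8 GB/s
fast = transfer_days(capacity, 10 * GIGABYTE)  # at 10 GB/s

# Nominal combined line rate of 768 links at 10 Gbit/s each, in GB/s.
# This is an upper bound on network capacity, not a measured storage rate.
aggregate_gb_per_s = 768 * 10 / 8

print(f"Full 5.5 PB at 8-10 GB/s: {fast:.1f} to {slow:.1f} days")
print(f"Nominal aggregate line rate: {aggregate_gb_per_s:.0f} GB/s")
```

Under these assumptions, reading out the whole array takes roughly six to eight days, while the network ports offer far more headroom than the storage currently sustains.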
Conceived in UC San Diego's Research Cyberinfrastructure (RCI) project, the initiative has grown in scope and now includes UC San Diego's Libraries, School of Medicine, Rady School of Management, Jacobs School of Engineering, and SDSC research faculty, as well as federally funded research projects from the National Science Foundation, National Institutes of Health, and Centers for Medicare and Medicaid Services. All of these groups can now share data sets in the same SDSC Cloud.
"The SDSC Cloud marks a paradigm shift," said Richard Moore, SDSC’s deputy director. "One that says 'if you think your data is important, then it should be readily accessible and shared with the broader community'."
The key to the SDSC Cloud's ease of use is OpenStack Swift Object Storage, open-source software that originated at Rackspace and is part of the OpenStack project Rackspace launched with NASA. Swift organizes files into objects that are written to multiple physical storage arrays simultaneously, keeping at least two verified copies on different servers at all times. In addition, a Cloud Backup package uses SDSC's CommVault Backup service, with continuous automatic data verification and integration with the commercial cloud providers Rackspace and Amazon S3, so that a third copy can be replicated off-site for increased security.
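The two ideas in this passage, one URL per object and copies kept on different servers, can be sketched in a few lines of Python. The URL follows the account/container/object layout of the OpenStack Swift API, but the host name, account, and server names are hypothetical placeholders rather than real SDSC endpoints. The placement function is a deliberately simplified stand-in, not Swift's actual placement algorithm; it only shows how hashing an object's path can pick distinct servers deterministically.

```python
from __future__ import annotations

import hashlib
from urllib.parse import quote

# Hypothetical endpoint and server names, for illustration only.
BASE_URL = "https://cloud.example.edu/v1"
SERVERS = ["store-a", "store-b", "store-c", "store-d"]


def object_url(account: str, container: str, obj: str) -> str:
    """Build a Swift-style URL: <base>/<account>/<container>/<object>."""
    # Percent-encode each part; slashes inside the object name are kept.
    return "/".join([
        BASE_URL,
        quote(account, safe=""),
        quote(container, safe=""),
        quote(obj, safe="/"),
    ])


def replica_servers(path: str, servers: list[str], copies: int = 2) -> list[str]:
    """Pick `copies` distinct servers for an object, deterministically."""
    if copies > len(servers):
        raise ValueError("not enough servers for the requested copies")
    # Hash the path so the same object always maps to the same starting slot.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    start = int.from_bytes(digest[:4], "big") % len(servers)
    # Consecutive slots in the list are distinct servers by construction.
    return [servers[(start + i) % len(servers)] for i in range(copies)]


url = object_url("AUTH_demo", "climate runs", "2011/run 01.nc")
print(url)
print(replica_servers("/AUTH_demo/climate runs/2011/run 01.nc", SERVERS))
```

Because placement depends only on the object's path, any node can recompute where the copies live without consulting a central index. A production system also has to handle server failures and rebalancing, which this sketch ignores.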
Next month the SDSC Cloud will also start transferring huge data sets over multiple simultaneous 10-gigabit-per-second connections to CENIC (Corporation for Education Network Initiatives in California), ESnet (Energy Sciences Network), and XSEDE (Extreme Science and Engineering Discovery Environment).
The SDSC Cloud can also make use of the other advanced services at the San Diego Supercomputer Center, including Data Oasis, which can transfer three terabytes of data per minute, and Gordon, the world's first supercomputer to integrate large flash-based SSDs (solid-state drives), for transfers of six terabytes per minute.
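Since these rates are quoted per minute while the SDSC Cloud's own rate is quoted per second, a quick conversion makes them comparable. The snippet below is a simple unit conversion assuming decimal units (1 terabyte = 1,000 gigabytes); the function name is illustrative.

```python
def tb_per_min_to_gb_per_s(tb_per_min: float) -> float:
    """Convert terabytes per minute to gigabytes per second (decimal units)."""
    return tb_per_min * 1000 / 60


data_oasis = tb_per_min_to_gb_per_s(3)  # Data Oasis: 3 TB/min
gordon = tb_per_min_to_gb_per_s(6)      # Gordon: 6 TB/min

print(f"Data Oasis: {data_oasis:.0f} GB/s, Gordon: {gordon:.0f} GB/s")
```

On that basis Data Oasis moves about 50 gigabytes per second and Gordon about 100, compared with the 8 to 10 gigabytes per second quoted for the SDSC Cloud storage array.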