Friday, January 07, 2005

Grid Computing: Supercomputing with PCs

Think of supercomputers and you think of gigabuck mainframes that easily fill a room. Names like Cray, IBM's Deep Blue and Blue Gene, and NEC's Earth Simulator come to mind. Performance is measured in teraflops, or trillions of floating point operations per second. That's a lot of number crunching. But you need that kind of processing power to solve the really tough problems, especially simulations like nuclear reactions, weather forecasting, oil exploration, earthquake prediction, drug design and unraveling the genetic code.

Originally, supercomputers were based on proprietary processors. Some ran so hot they needed to be liquid cooled or they'd melt down like a runaway reactor. A newer wrinkle is to create clusters of microprocessors or microprocessor-based PCs. You can get more MIPS for the buck, that's millions of instructions per second, with standard off-the-shelf equipment manufactured in quantity than by commissioning a brand new proprietary CPU design.

Now take that concept one step further. Instead of renting a gymnasium-sized room and filling it with hundreds or thousands of Apple G5s or Pentium-based PCs, why not spread the computers around and provide modest amounts of electrical power and air conditioning at each site? In fact, why not just make your supercomputer by tapping into some of the millions and millions of PCs that are deployed already?

That kind of lateral thinking, scrounging the processing power you need without having to invest in all new equipment, actually works in practice. What makes it possible is that most of the computing power in the world is doing nothing at any given time. What are your PCs doing right now? Unless they're rendering complex graphics or recalculating spreadsheet formulas, they're probably just sitting there waiting for someone to decide what to type next. Or they might just be idling on a desk, running a screen saver while their user is off at a meeting or at lunch. Millions and billions and trillions of cycles going to waste while running up electric bills everywhere.

So, you know where the extra processing time is. Now how do you get to it? Within an organization, the answer is over the local and wide area networks that belong to the company. On a grander scale, there are millions of computers idling on their broadband Internet connections right now, picking their electronic teeth while they wait for somebody to tell them to download something. The Internet provides universal connectivity. What's needed is a program to marshal these resources.

Grid computing works with problems that can be broken into smaller packages so that many independent computers can do the processing. Solving the problem is managed by scheduling software that doles out the packages and collects the results. Other management software puts all of the pieces together to create the final result. Individual computers all running the same client software can be located down the hall or in the far corners of the world as long as they have an Internet or private network connection.
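To make that concrete, here's a minimal sketch in Python of the pattern just described: split a big job into independent work units, schedule them out to many workers, then assemble the partial results. A local process pool stands in for the remote grid nodes, and the per-unit computation is a hypothetical stand-in, not any particular project's code.

```python
# Minimal sketch of the grid pattern: split a big job into independent work
# units, hand them to many workers, then combine the results. A local process
# pool stands in for remote grid nodes; a real grid would ship each unit over
# the network to a client machine instead.
from concurrent.futures import ProcessPoolExecutor

def process_unit(unit):
    """Hypothetical per-unit computation: sum the squares of one data slice."""
    return sum(x * x for x in unit)

def split_into_units(data, unit_size):
    """Break the full problem into small, independent packages."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]

def run_grid_job(data, unit_size=1000):
    units = split_into_units(data, unit_size)
    with ProcessPoolExecutor() as pool:       # the "scheduler" doling out packages
        partial_results = list(pool.map(process_unit, units))
    return sum(partial_results)               # management step: assemble the final result

if __name__ == "__main__":
    print(run_grid_job(list(range(100_000))))
```

The shape is the same whether the workers are down the hall or across the Internet; only the transport between scheduler and workers changes.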

A well-known pioneering public grid computing, or distributed computing, project is SETI@home. SETI stands for Search for Extraterrestrial Intelligence. The @home part involves recruiting home computer users to provide the client computers in the grid. Over 600,000 users are typically donating the unused processing power on their computers to help SETI analyze the flood of data coming from the radio telescope at Arecibo in Puerto Rico. The combined processing power amounts to about 70 teraflops, the performance of a respectable supercomputer by anyone's standards. But nobody had to pay for this supercomputer. The users download a screen saver program that contains the code necessary to communicate with SETI and run the needed processing.
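In rough outline, a volunteer client like that runs a simple loop: ask the project's server for a work unit, crunch it while the machine is otherwise idle, and send the result back. The sketch below illustrates the idea only; the server URL, message format, and analysis function are hypothetical stand-ins, not SETI@home's actual protocol.

```python
# Rough sketch of a volunteer-client loop: fetch a work unit, process it,
# report the result, repeat. URL, JSON format, and analysis are hypothetical.
import json
import time
import urllib.request

WORK_SERVER = "http://example.org/grid"   # hypothetical project server

def fetch_work_unit():
    with urllib.request.urlopen(f"{WORK_SERVER}/next_unit") as resp:
        return json.load(resp)            # e.g. {"id": 42, "samples": [...]}

def analyze(samples):
    """Stand-in for the real signal analysis: report the strongest sample."""
    return max(samples)

def report_result(unit_id, result):
    payload = json.dumps({"id": unit_id, "result": result}).encode()
    req = urllib.request.Request(f"{WORK_SERVER}/result", data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

def client_loop():
    while True:
        unit = fetch_work_unit()
        result = analyze(unit["samples"])
        report_result(unit["id"], result)
        time.sleep(1)                     # brief pause between work units

if __name__ == "__main__":
    client_loop()
```

A real client also throttles itself to use only idle cycles, which is why these projects are usually packaged as screen savers or low-priority background tasks.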

Google is promoting another public grid computing program called Folding@home. This is a research project by Stanford University to study the way that proteins, such as those encoded in the human genome, fold or self-assemble. They've got about 160,000 donated processors working on this one right now, including the PC I'm using to write this post. As soon as I take a break, the processor will go back to working on a protein folding computation.

Grid computing isn't limited to combining the resources of Internet users for public research projects. Corporations could conceivably tap into the thousands of desktop computers on their networks that have already been capitalized or expensed. For time-critical applications, it might make sense to simply use dedicated processors in a cluster or grid to create a virtual supercomputer rather than buying a monolithic supercomputer. In the future, outsourced supercomputing may become more common as service providers create grid networks of inexpensive or borrowed processors that can be rented by the hour.

If you need bandwidth to connect your own computing grid or high-speed dedicated Internet access, get all the speed you can use at very attractive prices from T1 Rex.

Click to check pricing and features or get support from a Telarus product specialist.



