"So for example in our cluster, you can run Hadoop as well as clustered DB2 or Oracle databases," Sarkar said. "This allows us to have a general-purpose file system that [can be used by] a wide range of users."
IBM would not say when the GPFS-SNC file system would make it out of the labs and into the marketplace, but Sarkar said that once it's available, it will be targeted at three use cases: data warehousing, Hadoop MapReduce applications and cloud computing.
"The cloud may not [seem like] an intuitive [fit for] a parallel architecture, but we have [many] virtual machines on each hypervisor node, and we have a lot of hypervisor nodes in parallel. Each virtual machine is accessing its own storage independently of every other virtual machine. So in effect you're getting a lot of parallel access to storage," Sarkar said.
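The access pattern Sarkar describes, with many virtual machines each reading and writing their own storage independently so that the aggregate workload is parallel, can be sketched in a few lines of Python. This is a hypothetical illustration of the pattern only, not IBM or GPFS code; all names are invented.

```python
# Illustrative sketch of independent per-VM storage access (hypothetical,
# not IBM/GPFS code). Each simulated "VM" touches only its own file, so
# no request needs to coordinate with any other -- the property that lets
# a shared-nothing cluster file system serve them all in parallel.
import concurrent.futures
import os
import tempfile

def vm_workload(vm_id: int, root: str) -> str:
    # Each VM writes and then reads back its own private storage file.
    path = os.path.join(root, f"vm-{vm_id}.img")
    with open(path, "w") as f:
        f.write(f"data from vm {vm_id}")
    with open(path) as f:
        return f.read()

with tempfile.TemporaryDirectory() as root:
    # The streams are independent, so they can run concurrently.
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(lambda i: vm_workload(i, root), range(8)))
```

Because no two workloads share a file, the file system never has to serialize them, which is the "parallel access to storage" effect Sarkar is pointing at.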
IBM's current GPFS technology offering is the core technology for the company's high-performance computing systems, Information Archive, Scale-Out Network-Attached Storage (SONAS), and Smart Business Compute Cloud.
The GPFS-SNC technology's ability to run real-time Hadoop applications on a cluster won IBM a first-place award at the Supercomputing 2010 conference in New Orleans this week.