Pervasive Software is unveiling on Wednesday version 5.0 of its DataRush parallel application software, which now works with the popular Hadoop MapReduce framework for processing large volumes of data in parallel.
Functioning with the JVM (Java Virtual Machine), DataRush helps developers build parallel applications without requiring expertise in parallel development, the company said. "The idea is to take a pure programmer off the street and enable him to write multithreaded apps," said Davin Potts, Pervasive director of product management.
"DataRush is an API you would use in your normal application development. It's just another library that you access," Potts said. MapReduce backing helps developers get more performance out of their MapReduce cluster. "You can get the same [query] answer in less time. You're being more efficient in how you use your cluster," Potts said.
DataRush scales across clusters, with the ability to accelerate every node in a Hadoop cluster. At data marketplace Infochimps, a DataRush user site, the company is using the software in a pilot effort to run Hadoop programs. "DataRush will coordinate shuttling the data around and gets you the concurrency," said Infochimps CTO Flip Kromer.
"Computer scientists [have] done a terrible job of letting us use multicore programs efficiently," Kromer said. "Programming concurrency is really hard. DataRush lets you bring all that performance out," while keeping programs simple, he said, though he also noted that developers still must adhere to DataRush primitives, tracking back to the DataRush data flow language.
Also featured in DataRush 5.0 is support for newer languages on the JVM, including JRuby, Python, and Scala; users of these languages gain parallel development capabilities. DataRush can also access data in data warehouses, databases, and flat files.
Pricing for DataRush 5.0 is based on factors such as the use of perpetual or subscription licenses, contract terms, and the number of machines in a cluster, Pervasive said. Free trial downloads of DataRush 5.0 are available at the Pervasive website.