5 ways to prepare for Big Data with Scale-Out NAS

Thor Olavsrud | March 13, 2012
As enterprises seek to move into the big data world--digitizing paper documents and saving email communications, Word docs, Excel files and all sorts of other unstructured data with the hopes of mining them for actionable business intelligence--they need to address a big problem up front: storage.

Simple to scale. "This next generation architecture that they're looking to move to needs to be simple to scale," Kirsch says. "If I have a 1TB drive, that's a volume that I can manage, I can protect and I can replicate. Why can't I manage 15 petabytes with that same simplicity? It shouldn't be more complicated just because it's bigger." Scale-out NAS architectures can tackle this problem with software management and a virtualization/abstraction layer that makes the nodes behave like a single system.

Predictable. "The performance needs to be predictable," Kirsch says. "If I add 6TB this week and 6TB next week, I want that same linear scalability in terms of performance. I don't want to have to re-architect my application or re-educate my users. It should just scale in a predictable fashion. I want it to be pay as you grow. Don't make me overinvest today. I know that Moore's Law is going to give me faster computing next month and that drives are going to get denser over time. Let me take advantage of that in my storage infrastructure. And please, let this be shared symmetric architecture. Don't force me to understand differences in your architecture. Allow me to scale this system as I need it."

Efficient. "Let me leverage all the resources in my storage system, regardless of where they are," Kirsch says. "Let me get great utilization out of my physical disk drives, not 50 or 55 percent, but over 80 percent of that storage should be utilized for my data. Regardless of where the CPU is or the compute or the cache, let me take advantage of that. Whether the application over here is hot or the application over there, I want the storage system to maximize the performance of that application. And please, integrate tiering into this system." In other words, you shouldn't have to move data around by hand to optimize performance or capacity. Scale-out NAS for big data needs to be intelligent enough to automate that for you.

Available. "This has to be available all the time," Kirsch says. "Take advantage of an N-way architecture. Allow me to survive more than two failures. Allow me to survive when a rack goes down in my environment. I want this to be on all the time. And let it be flexible. Let me align the availability of the protection of the system with the needs of my business units. If they're willing to invest more, I can give them greater availability. If the data is less valuable, I can give them less availability." Boiled down, since a scale-out NAS storage infrastructure is built on commodity hardware, there's an assumption that hardware will fail and the system has to be designed to deal with a higher rate of hardware failure.
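
To make the trade-off between utilization and availability concrete, here is a rough sketch that assumes an N+M protection scheme, in which every N units of data are stored alongside M units of redundancy. The article does not name a specific protection scheme, so the function names and the sample layouts below are illustrative assumptions, not a description of any vendor's product.

```python
from fractions import Fraction


def usable_fraction(data_units: int, protection_units: int) -> Fraction:
    """Share of raw capacity left for data under an assumed N+M layout."""
    if data_units <= 0 or protection_units < 0:
        raise ValueError("need data_units > 0 and protection_units >= 0")
    return Fraction(data_units, data_units + protection_units)


def survives(protection_units: int, failures: int) -> bool:
    """An N+M layout tolerates up to M simultaneous failures."""
    return failures <= protection_units


# Hypothetical layouts, chosen only to illustrate the arithmetic.
layouts = {
    "2-way mirror (1+1)": (1, 1),
    "8+2": (8, 2),
    "16+3": (16, 3),
}

for name, (n, m) in layouts.items():
    share = float(usable_fraction(n, m))
    print(f"{name}: {share:.0%} of raw capacity usable, tolerates {m} failure(s)")
```

Under these assumptions, a simple mirror leaves half of the raw capacity for data, while wider layouts can exceed 80 percent and still tolerate more than two failures. Business units that want more protection would trade some usable capacity for a larger M.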

