Big Data file systems compared... Mirror, mirror on the wall, who's the most suitable of them all?

In this expanding blog item I want to explore the pros and cons of the different candidates.

IBM has a very interesting proposition with their GPFS (General Parallel File System):

The IT universe is seeing a massive collision taking place as the worlds of high-performance computing, big data and warehousing intermingle. IBM is pushing its General Parallel File System (GPFS) further to broaden its footprint in this space, with the 3.5 release adding big data and async replication features as well as customer metadata and more performance.

GPFS is a large-scale file system running on Network Shared Disk (NSD) server nodes with the file data spread over a variety of storage devices and users enjoying parallel access.
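To make "file data spread over a variety of storage devices" with "parallel access" a bit more concrete, here is a minimal Python sketch of the general idea of block striping. It is not GPFS code and says nothing about how GPFS actually places blocks: the round-robin layout, the tiny block size and the function names (stripe, read_disk, parallel_read) are all my own assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # bytes per block; tiny on purpose, purely illustrative


def stripe(data: bytes, num_disks: int, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and place them round-robin on disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def read_disk(disk):
    """Hypothetical stand-in for one storage server returning its blocks."""
    return list(disk)


def parallel_read(disks):
    """Fetch from every disk concurrently, then interleave blocks in order."""
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        per_disk = list(pool.map(read_disk, disks))
    blocks = []
    longest = max(len(d) for d in per_disk)
    for row in range(longest):
        for d in per_disk:
            if row < len(d):
                blocks.append(d[row])
    return b"".join(blocks)


if __name__ == "__main__":
    data = b"abcdefghijklmnopqrstuvwxyz"
    disks = stripe(data, num_disks=3)
    for i, d in enumerate(disks):
        print(f"disk {i}: {d}")
    print(parallel_read(disks) == data)
```

Because consecutive blocks land on different disks, a large sequential read can keep every disk busy at once instead of queueing on a single device. That is the intuition behind the parallel access claim; a real parallel file system adds metadata, locking and failure handling on top.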

Of course, a serious drawback of any 'closed' solution is that you're in the hands (and at the mercy) of the vendor...

With the release of GPFS V4.1, IBM has added the following extras:

  • Improved data integrity
  • Improved application performance during array rebuild after disk failure
  • Enhanced data protection
  • Improved data access when hardware failures occur
  • A GUI for health and performance monitoring and directed maintenance procedures

GPFS Native RAID for GPFS Storage Server, V4.1 is required for licensing the software needed to complete the GPFS Storage Server solution on the Intelligent Cluster.

System x GPFS Storage Server:
GPFS Storage Server is a single, integrated, supported IBM solution, built to leverage the GPFS software market. It offers performance on a scalable building-block approach; performance and capacity increase as you add GPFS Storage Servers.

IBM became a seriously interesting option in 2012, when it launched GPFS V3.5:

http://www.theregister.co.uk/2012/05/21/ibm_general_parallel_file_system_3dot5/

Author: Anonymous

Copyright © 2012 Ronver Systems