Scientific Big Data

900TB: Scientific Data Protection

Historically, massive data sets have been managed by ever-expanding disk-based solutions.
However, when the growth of that data outpaces the advances in disk-based technology, a change of strategy is required.

Business Challenge for a world-leading scientific organisation:

  • 20 years of data had been ‘locked’ away on various storage systems
  • Tens of TB were stored on DVDs
  • Hundreds of TB were locked away on proprietary backup tapes
  • Hundreds of TB were available online on spinning disk
  • 900 TB in total
  • Growing at 200 TB per annum
  • Real risk of data loss through:
    • system failure
    • media degradation
    • media loss
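
To put the growth figures above in perspective, the arithmetic can be sketched as follows. This is a rough illustration only: it assumes strictly linear growth of 200 TB per year from the 900 TB starting point and decimal units (1 PB = 1000 TB), neither of which is stated in the case study, and the function names are hypothetical.

```python
# Rough capacity projection, assuming linear growth (an assumption,
# not a figure from the case study) and 1 PB = 1000 TB.

START_TB = 900.0             # total data volume quoted above
GROWTH_TB_PER_YEAR = 200.0   # annual growth quoted above


def projected_size_tb(years, start_tb=START_TB, growth=GROWTH_TB_PER_YEAR):
    """Return the projected data volume in TB after `years` years."""
    return start_tb + growth * years


def years_until_full(capacity_tb, start_tb=START_TB, growth=GROWTH_TB_PER_YEAR):
    """Return how many years pass before the data reaches `capacity_tb`."""
    if capacity_tb <= start_tb:
        return 0.0
    return (capacity_tb - start_tb) / growth


for years in (1, 5, 10):
    print(f"After {years:>2} years: {projected_size_tb(years):.0f} TB")
```

Under these assumptions, `years_until_full(4000)` gives the time before the data would reach a 4 PB ceiling, which is useful when sizing an archive against a known expansion limit.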

Business Need:

  • A working set of under 150 TB was needed for instant (millisecond) access

CD DataHouse Implementation

  • A petabyte tape library (expandable to 4 petabytes)
  • Online active archiving technology

This gave the end users access to an infinitely scalable file system where:

  • the first few hundred TB reside on disk for instant access
  • the data on disk is the most recent or most recently accessed
  • the remainder of the data resides on tape, with a 90-second access time
  • everything is unified under a CIFS/NFS network share
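
The placement rule described above (most recently accessed data on disk, the rest on tape) can be illustrated with a minimal sketch. This is not the vendor's implementation: the `ArchivedFile` type, the `assign_tiers` function and the strict recency cut-off are all assumptions made for illustration, and a real active archive would apply its own policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivedFile:
    """Hypothetical record for one file in the archive."""
    path: str
    size_tb: float
    last_access: float  # seconds since the epoch; larger means more recent


def assign_tiers(files, disk_budget_tb):
    """Place the most recently accessed files on disk until the budget is
    used up; every older file goes to tape (strict recency cut-off)."""
    placement = {}
    used_tb = 0.0
    disk_full = False
    # Walk the files from most to least recently accessed.
    for f in sorted(files, key=lambda f: f.last_access, reverse=True):
        if not disk_full and used_tb + f.size_tb <= disk_budget_tb:
            placement[f.path] = "disk"
            used_tb += f.size_tb
        else:
            # Once one file does not fit, all older files also go to tape.
            disk_full = True
            placement[f.path] = "tape"
    return placement


# Example with made-up files and a 150 TB disk budget.
example = [
    ArchivedFile("run_2023", 100.0, last_access=3.0),
    ArchivedFile("run_2015", 100.0, last_access=2.0),
    ArchivedFile("run_2005", 10.0, last_access=1.0),
]
tiers = assign_tiers(example, disk_budget_tb=150.0)
```

Because both tiers sit behind the same network share, a user would see the same path for a file whichever tier holds it; only the access time differs.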

The system has allowed the customer to:

  • Remove their dependency on inefficient backup
  • Secure any data on disk, as data written to disk is ‘shadowed’ to tape a few hours after creation
  • Neutralise any dependency on specific disk technologies
  • Migrate and preserve the data seamlessly across any disk, tape or other future magnetic media
  • Recover from a disaster: petabytes of data can be brought back online quickly, without waiting for a ‘classical’ backup restore that might take weeks

Thanks to the expert consultation and implementation by CD DataHouse, the capabilities of this scientific organisation have dramatically improved. The project has delivered an infinitely scalable file system that offers users instant access, efficient data backup and seamless disaster recovery. The infrastructure supports the big data needs of the business today and can be expanded to 4 PB to meet future volume needs.

    Project Outcome:

    • New petabyte online active archive
    • Legacy data consolidated
    • Data presented to end users as a familiar network share
    • Priceless data stored on disparate media
    • System will adapt and scale to future needs