
Object Storage: New Options For Storing and Accessing Your Research Data

Over the next two years, new national storage infrastructure is coming online through Compute Canada’s national platform renewal. Specific storage technologies will include cloud-based services, traditional filesystems, and object storage software.

WestGrid will offer user training sessions explaining these storage features and capabilities once they are available, particularly for object storage, which will be a new service offering within Compute Canada. In the meantime, we’ve compiled a brief overview of the capabilities and benefits of Object Storage systems.

What is Object Storage?
Also known as object-based storage, this data management approach stores data as objects. An object is simply a data set with attached descriptive metadata and a unique identifier. This straightforward approach allows an object storage service to provide:

  • Universal access through the unique ID
  • Extensive search interfaces through the metadata
  • Inexpensive design options using commodity infrastructure building blocks
  • Easy-to-scale physical infrastructure
  • Reliable and flexible architectures, including extensive redundancy and geo-replication capabilities

Mainstream examples of object storage services include Amazon’s Simple Storage Service (Amazon S3) and Microsoft Azure’s Blob Storage.
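
To make the idea of an object concrete, here is a minimal Python sketch of a toy, in-memory object store. It is an illustration only, not the interface of Compute Canada’s service or of any real product: the class name `ToyObjectStore`, its methods, and the sample metadata fields are all hypothetical. It shows the three ingredients described above: the data, the attached metadata, and a unique identifier used for access.

```python
import uuid


class ToyObjectStore:
    """In-memory stand-in for an object store (hypothetical, for illustration)."""

    def __init__(self):
        # One flat mapping: object ID -> (data, metadata). No directories.
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        """Store data with its metadata and return a newly generated unique ID."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve the data directly by its unique ID."""
        return self._objects[object_id][0]

    def search(self, **criteria) -> list:
        """Return the IDs of all objects whose metadata matches every criterion."""
        return [
            object_id
            for object_id, (_, metadata) in self._objects.items()
            if all(metadata.get(key) == value for key, value in criteria.items())
        ]


store = ToyObjectStore()
# The metadata fields below are made-up examples.
image_id = store.put(b"...image bytes...", {"instrument": "camera-1", "kind": "image"})
video_id = store.put(b"...video bytes...", {"instrument": "camera-1", "kind": "video"})

videos = store.search(kind="video")  # look up objects by metadata
image_data = store.get(image_id)     # fetch one object by its ID
```

A production service adds durability, replication and access control on top of this basic model, but the way users interact with it (store, fetch by ID, search by metadata) follows the same pattern.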

How is Object Storage different from other forms of storage?
Unlike File Storage, where data is managed as a hierarchy of directories and files, Object Storage uses a simple, flat structure. This is very attractive for large unstructured data sets, and is particularly well-suited to scientific experimental and observational results that require reliable and accessible storage.
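
The flat structure can be sketched in a few lines of Python. In this illustration the store is a single mapping from key to data. A key may contain slashes, but they are just characters in a string and there are no real directories behind them. Services such as Amazon S3 follow a similar convention, grouping objects by listing keys that share a prefix. The key names and the `list_keys` helper below are hypothetical.

```python
# A flat namespace: every object lives at the same level, addressed by its key.
flat_store = {
    "survey-2015/night-01/frame-0001.fits": b"...",
    "survey-2015/night-01/frame-0002.fits": b"...",
    "survey-2015/night-02/frame-0001.fits": b"...",
}


def list_keys(store, prefix=""):
    """Return all keys starting with the given prefix, in sorted order."""
    return sorted(key for key in store if key.startswith(prefix))


# Looks like listing a folder, but it is only a string match on the keys.
night_one = list_keys(flat_store, "survey-2015/night-01/")
```

Because no directory tree has to be maintained, this kind of namespace is easier to spread across many storage nodes than a traditional file hierarchy.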

Also, while both Object Storage and Tape Storage can be used for archival purposes, Tape Storage is meant for long-term storage of data that is infrequently accessed. Object Storage is best used for data that needs to be:

  • always online and accessible
  • distributed and geo-replicated

How will Compute Canada’s Object Storage work?
Compute Canada’s Object Storage service will be built around these key characteristics:

  • Nationally distributed across four sites with geo-replication capability
  • Local storage building blocks for scalability, redundancy and performance
  • Integrated with CC’s authorization and authentication services (single sign-on)

How can Object Storage support research data?
Within our research community, groups like the Canadian Advanced Network for Astronomical Research (CANFAR) and Ocean Networks Canada (ONC) are constantly accumulating large sets of experimental and observational data. These data sets often come in as objects (e.g. videos and images) and are not naturally hierarchical. They also need to be easily accessible by other researchers and collaborators, who may, for instance, want to view only a particular time series within the entire set.

Who should use Object Storage?
Example use cases for this new service include:

  • Users who are undertaking production runs of a particular experiment, and would like to store the resulting output together with metadata describing the details of that particular run. They require storage that is:
    • Reliable (their data is valuable and difficult to reproduce)
    • Accessible (to the researcher, colleagues and/or a broader community)
    • Available (random access patterns as researchers search for and extract datasets of interest to their particular research)

  • Scientific portals where a user can search, extract and analyze particular data sets. In this case, an object storage service is well-suited for the underlying storage as it provides excellent availability and reliability through the redundant building block architecture.

Ultimately, the potential use cases span all disciplines, and we expect researchers in a range of fields, from environmental monitoring to life sciences to digital humanities, to benefit from this new Object Storage service.

When will these new Object Storage systems be available?
Object Storage is one component of multiple infrastructure investments happening over the next two years at Compute Canada’s four national sites:

  • University of Victoria
  • Simon Fraser University
  • University of Toronto
  • University of Waterloo

The storage software has been acquired, and a prototype for internal training and testing has been created. We expect an initial beta service to be available later in the winter, once the physical infrastructure is in place.

How can researchers access this new storage resource?
The process for allocating these new Compute Canada storage resources will be shared in more detail over the next few months as these new services come online. Overall, users can expect an access and allocation process similar to that currently used for existing compute and storage resources. Researchers must have a Compute Canada account and can apply for resource access either through Compute Canada’s Rapid Access Service or by participating in one of the annual Resource Allocation Competitions.

More Information:
If you have general questions about the national platform’s incoming storage infrastructure, visit Compute Canada’s FAQ page or email support@computecanada.ca. Otherwise, stay tuned for more information from WestGrid about the upcoming Object Storage service and how it may support your research.