Performance and Cost of Cloud Cold Storage for Astronomy Data Archive and Analysis

Cloud cold storage merits consideration as a way to reduce the total cost of ownership (TCO) and the storage-management labor involved in keeping large volumes of research data for long periods. Adopting it requires information on performance, manageability, and cost, as well as best practices established through proof-of-concept (PoC) activities.

Research Goal

The goal is to obtain practical information for deciding whether to store research data in cloud cold storage and for designing an overall data storage architecture. To that end, we conduct experiments using the cold storage services of multiple commercial public clouds. The PoC focuses on case studies in which actual research data are stored and then accessed through research applications:

High energy physics: Belle II experiment physical simulation data provided by KEK [1]

Astronomy: Data from ALMA and the Nobeyama Radio Observatory, provided by NAOJ [2]

For the astronomy data, we have also started an additional PoC in which the data analyses themselves run inside the cloud on server instances.

PoC Configuration

Example of PoC Results

Performance of uploading astronomy data

Upload performance
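
As an illustration of how an upload measurement of this kind can be taken, the following is a minimal sketch, not the procedure used in the PoC. The function name `measure_upload_throughput` and the `upload` callable are hypothetical; in practice `upload` would wrap whatever client transfers one file to the cold storage service.

```python
import os
import time
from typing import Callable, Iterable


def measure_upload_throughput(paths: Iterable[str],
                              upload: Callable[[str], None]) -> float:
    """Upload files one after another and return aggregate throughput in MB/s.

    `upload` is a hypothetical callable that transfers a single local file;
    it stands in for the storage client actually used.
    """
    total_bytes = 0
    start = time.perf_counter()
    for path in paths:
        upload(path)                          # blocking transfer of one file
        total_bytes += os.path.getsize(path)  # count bytes actually sent
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return total_bytes / 1e6 / elapsed
```

Passing the transfer function in as a parameter keeps the timing logic independent of any particular cloud SDK, so the same measurement can be repeated against several services.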

Estimation of cloud charge in a hybrid configuration

(1) Tiered storage including Glacier, S3 IA, and on-premises storage

NGAS configuration

(2) Estimation

Cost estimation
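
The structure of such an estimate for a tiered configuration can be sketched as follows. This is a simplified model under stated assumptions, not the calculation behind the poster's figures: each tier is charged for stored volume plus retrieved volume, and every price and volume is a hypothetical input to be replaced with the provider's current rates and the archive's actual access pattern. The names `Tier`, `monthly_charge`, and `total_charge` are illustrative.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Tier:
    """One storage tier. All prices are hypothetical inputs, not quoted rates."""
    name: str
    stored_gb: float                      # data volume kept in this tier
    price_per_gb_month: float             # storage charge per GB per month
    retrieved_gb_per_month: float = 0.0   # volume read back each month
    retrieval_price_per_gb: float = 0.0   # retrieval and transfer-out charge per GB


def monthly_charge(tier: Tier) -> float:
    """Storage charge plus retrieval charge for one tier in one month."""
    storage = tier.stored_gb * tier.price_per_gb_month
    retrieval = tier.retrieved_gb_per_month * tier.retrieval_price_per_gb
    return storage + retrieval


def total_charge(tiers: Iterable[Tier], months: int) -> float:
    """Total charge over `months`, assuming volumes and prices stay constant."""
    return months * sum(monthly_charge(t) for t in tiers)


if __name__ == "__main__":
    # Placeholder numbers only; they do not reflect any real price list.
    tiers = [
        Tier("cold archive", stored_gb=100_000, price_per_gb_month=0.004,
             retrieved_gb_per_month=500, retrieval_price_per_gb=0.1),
        Tier("infrequent access", stored_gb=10_000, price_per_gb_month=0.0125,
             retrieved_gb_per_month=2_000, retrieval_price_per_gb=0.1),
        # On-premises tier modeled as an amortized per-GB monthly cost.
        Tier("on-premises", stored_gb=5_000, price_per_gb_month=0.02),
    ]
    for t in tiers:
        print(f"{t.name}: {monthly_charge(t):.2f} per month")
    print(f"5-year total: {total_charge(tiers, 60):.2f}")
```

The model deliberately omits request fees, minimum storage durations, and early-deletion charges, which cold storage services typically apply and which a real estimate would need to include.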

Performance and Cost Evaluation of Public Cloud Cold Storage Services for Astronomy Data Archive and Analysis


We would like to thank the PoC members of KEK and NAOJ for providing data and support.

