Satellites and space probes continuously send back unique images and measurement readings. Once captured, this material must be processed, archived, secured and made usable by a wide range of applications. It is also reprocessed every year as new mathematical models or measurements become available.
That is the mission of the European Space Astronomy Centre (ESAC), the ESA division where "space" also means storage space. NetApp supplies that capacity as storage as a service.
"Here, the characteristics of the storage infrastructure aren't like anything you encounter elsewhere," said Rubén Alvarez, ESA IT director for science and operations.
The ESA site in Madrid is built on top of that storage infrastructure: space data arrives there, virtualised and containerised servers expose it to applications, and research centres around the world draw fresh insights from it.
Producing archives
That storage infrastructure currently holds 8PB. But the data produced by satellite-borne measurement instruments is growing exponentially, so capacity must be expanded continuously.
The Gaia project, which has been building a 3D map of the Milky Way since 2013, will contribute up to 3PB by 2025. Euclid, due to begin analysing dark matter around 2024, is expected to yield 20PB by 2030.
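Taken together, the figures quoted above give a rough sense of the scale involved. A back-of-the-envelope projection, assuming the contributions simply add up (an illustration, not ESA's actual capacity-planning model):

```python
# Rough capacity projection from the figures quoted in the article.
# Assumes the contributions are simply additive -- an illustration only.
current_pb = 8    # existing infrastructure (PB)
gaia_pb = 3       # Gaia contribution expected by 2025 (PB)
euclid_pb = 20    # Euclid contribution expected by 2030 (PB)

projected_2030_pb = current_pb + gaia_pb + euclid_pb
print(f"Projected capacity by 2030: about {projected_2030_pb} PB")
```

Even this crude sum shows the archive roughly quadrupling within the decade, which is why capacity expansion is continuous rather than occasional.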
Rosetta, which reached a comet in 2014 and collected data for two years, produced only 218GB. But transmitting that data 400 million kilometres back to Earth, knowing that a reading cannot be retaken if it is not stored correctly, poses a challenge of a different kind.
The ESA's "library of the universe" is unusual in that it stores archive data rather than "hot" data. Technically, the challenge is to combine the high capacity of spinning hard disks, which are slower and less robust than SSDs, with excellent reliability and the ability to serve 18,000 users a month.
New data joins ESA holdings dating back to 1999, which the worldwide space research community uses daily. European best practice also requires scientific journals to link back to their data sources.
Because so many of the ESA's files are accessed regularly, access patterns are hard to predict, and Alvarez is opposed to tiering storage.
"We don't use the public cloud, except for specific, one-off requirements, because keeping sovereignty over our own datacentre in Madrid fits with the values of a European public agency," said Alvarez.
"The ESA is not an IT company. Its mission is space exploration, so the IT department has fewer resources. We need storage equipment that simplifies administration."
NetApp simplifies the work
Since 2005, ESAC has used NetApp arrays to store both its data library and its applications.
"We have a cluster that holds the data for everything," Alvarez added. "That's the best way to streamline the IT team's job and keep complexity under control.
"We didn't choose a vendor from the outset. We were looking for the most dependable and manageable storage systems, and colleagues at NASA told us they used NetApp, so we followed suit.

"NetApp's support has never let us down, and today we need a vendor we can count on. We pay for storage by usage: we pay NetApp for a reliable storage service at the volume we need."
Alvarez said: "Maintenance is not only physical intervention to add or replace disks, or shelves of disks. We must also upgrade controller firmware and array operating systems to guarantee data integrity. You can't tell satellites to stop sending data, or ask researchers to wait."
Functional evolution
Beyond reliability, the technical capabilities of the storage also need to evolve.
"For instance, most of our data is in file format, because that is how the scientific community mostly accesses it," said Alvarez. "But we have seen demand for object protocols and have begun a gradual transition in that direction."
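The shift Alvarez describes can be illustrated with a minimal Python sketch contrasting the two access models: a file addressed by a hierarchical path versus an object addressed by a flat key in a bucket. The bucket is modelled here as a plain dict, and the key name is invented; a real client would use an S3-compatible library:

```python
import os
import tempfile

# File (POSIX) access: data lives at a hierarchical path and supports
# in-place and partial reads and writes.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as f:
    f.write("spectral readings")
    path = f.name
with open(path) as f:
    file_data = f.read()
os.remove(path)

# Object access: data is addressed by a flat key inside a bucket and is
# read or written as a whole, typically over HTTP. The dict below stands
# in for an object store; the key name is purely illustrative.
bucket = {"archive/mission-x/obs-0001": "spectral readings"}
object_data = bucket["archive/mission-x/obs-0001"]

print(file_data == object_data)  # both models return the same content
```

The practical difference is in the interface, not the bytes: file protocols suit in-place scientific tooling, while object protocols suit whole-item retrieval over the web, which is why archives often end up supporting both during a transition.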
Data is backed up using NetApp snapshots alongside four third-party solutions. The snapshots record the state of the files and can be used to restore them at the slightest sign of data corruption.
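Storage snapshots of this kind are typically copy-on-write: a snapshot freezes a point-in-time view that shares unchanged blocks with the live volume, so rolling back after corruption is cheap. The toy Python class below sketches only the idea; it is not NetApp's implementation:

```python
class Volume:
    """Toy copy-on-write volume: snapshots freeze the block map, not the data."""

    def __init__(self):
        self.blocks = {}     # block id -> contents
        self.snapshots = []  # frozen block maps (blocks are shared, not copied)

    def write(self, block_id, data):
        self.blocks[block_id] = data

    def snapshot(self):
        # Freezing the mapping is enough: until a block is overwritten,
        # the snapshot and the live volume point at the same data.
        self.snapshots.append(dict(self.blocks))

    def restore(self, index):
        # Roll the live volume back to the chosen point-in-time view.
        self.blocks = dict(self.snapshots[index])

vol = Volume()
vol.write("b0", "good reading")
vol.snapshot()                  # point-in-time view before the incident
vol.write("b0", "corrupted!")   # corruption arrives after the snapshot
vol.restore(0)                  # cheap roll-back from the snapshot
print(vol.blocks["b0"])         # good reading
```

Because a snapshot only records a mapping, taking one is near-instant regardless of volume size, which is what makes frequent snapshots practical on multi-petabyte archives.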
"Our data is designed to be shared among a large number of people, so it's better if anyone can read it," Alvarez said of access restrictions. "Storage isn't our main concern here. Cybersecurity worries us less than it does others."