SAFE (Standard Archive Format for Europe)

Activity

Overview

SAFE, the Standard Archive Format for Europe, is designed to act as a standard format for archiving and conveying Earth observation data within the European Space Agency (ESA) archiving facilities and, potentially, within the archiving facilities of cooperating agencies.

SAFE benefits from the experience gained in developing several existing archiving standards, and addresses the major constraints defined in the OAIS Reference Model (OAIS-RM, see next paragraph) for the packaging and long term preservation of Earth observation data.

SAFE has been designed to be used in an archive system compliant with the Open Archival Information System (OAIS) standard. The OAIS reference model is therefore a major input to its design. In fact, SAFE provides features that facilitate an archive's compliance with OAIS, while ensuring that the format itself does not hamper that compliance.

One of the SAFE features originating from the OAIS Reference Model is its reliance on the XFDU (XML Formatted Data Unit) packaging format. XFDU is a CCSDS standard that provides a set of predefined metadata categories and classifications that are fully compliant with the OAIS reference model. Departures from the XFDU packaging standard have been kept to a minimum, given that the goal of XFDU is itself to provide a standard format for archiving and conveying scientific data.
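To make the idea of categorised metadata more concrete, the following Python sketch parses a heavily simplified, hypothetical XFDU-style manifest and groups its metadata objects by category. The element and attribute names follow the general shape of an XFDU manifest (a metadata section and a data object section), but the sample document, its identifiers, its file path, the category and classification values shown, and the helper summarise_manifest are all assumptions made for illustration. It is not an excerpt from the SAFE or XFDU specifications, and real manifests are namespaced and considerably richer.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily simplified manifest. All IDs, category and
# classification values, and paths below are invented for illustration.
SAMPLE_MANIFEST = """\
<XFDU>
  <metadataSection>
    <metadataObject ID="platform" category="DMD" classification="DESCRIPTION"/>
    <metadataObject ID="processing" category="PDI" classification="PROVENANCE"/>
    <metadataObject ID="measurementSchema" category="REP" classification="SYNTAX"/>
  </metadataSection>
  <dataObjectSection>
    <dataObject ID="measurementData">
      <byteStream mimeType="application/octet-stream" size="1024">
        <fileLocation href="./measurement/data.bin"/>
      </byteStream>
    </dataObject>
  </dataObjectSection>
</XFDU>
"""


def summarise_manifest(xml_text):
    """Return (metadata IDs grouped by category, data object ID -> file location)."""
    root = ET.fromstring(xml_text)

    # Group metadata object IDs under their declared category.
    by_category = defaultdict(list)
    for obj in root.iter("metadataObject"):
        by_category[obj.get("category")].append(obj.get("ID"))

    # Map each data object to the location of its byte stream, if any.
    data_files = {}
    for obj in root.iter("dataObject"):
        location = obj.find("./byteStream/fileLocation")
        data_files[obj.get("ID")] = location.get("href") if location is not None else None

    return dict(by_category), data_files


if __name__ == "__main__":
    categories, files = summarise_manifest(SAMPLE_MANIFEST)
    print(categories)
    print(files)
```

The point of the sketch is only that a single manifest can tie descriptive, provenance and representation metadata to the data files they describe, which is the kind of structure an OAIS-oriented archive needs.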

The metadata model defined by the OGC Earth Observation Metadata Profile of Observations & Measurements (OGC EOP O&M) has been adopted as the reference metadata profile in SAFE 2.x to describe specific missions' EO Products. Adopting this profile makes the SAFE format easier to exploit in any standards-based interoperability framework.

Background

The European Space Agency has been managing the payload data operations of a number of Earth Observation (EO) satellites since 1975. The activity includes acquisition, archiving, processing and product distribution of data from ESA and Third Party missions, for which more than five petabytes of data are presently archived.

The normal mandate for ESA's EO missions is to maintain the archive for at least ten years after the end of the mission. Managing this large volume of heterogeneous datasets poses problems for their long term preservation, which is why the EO Ground Segment Department has long recognised the need to establish a clear strategy for the management of the long term archives.

In 2001, ESA's Programme Board on Earth Observation (PB-EO) endorsed a strategy for the "Management of Historical Archives" (ref. PB-EO(2001)4). In 2003, DOSTAG (the technical advisory board of ESA's PB-EO) endorsed document ESA/PB-EO/DOSTAG(2003)6 on the promotion of products and services across missions and the exploitation of historical archives. Also in 2003, the Oxygen implementation plan (ESA/PB-EO(2003)51) reinforced, and placed in a wider context, the overall issue of data archiving and of improved user access to, and availability of, data sources.

Finally, in 2004 the ad hoc nominated PB-EO Ground Segment Task Force concluded its report with a set of recommendations (ESA/PB-EO(2004)53) related to the facilities and archives infrastructure aiming at:

  • Maximised competitive approach
  • Enhanced infrastructure exploitation
  • Facilities and operations rationalisation
  • Technology exploitation
  • Cost reduction
  • Possible standardisation and re-utilisation at European level of Agency investment

ESA's Earth Observation Department has since recognised that the main process to be undertaken is the standardisation and harmonisation of its ground segment architectures, in order to achieve economies of scale during development, operations and maintenance. This includes the need to achieve the following goals:

  • Archive maintenance in order to ensure data integrity
  • Archive and data management rationalisation
  • Data conversion to new technologies in order to reduce cost of operations
  • Enhancement of data access
  • Standardisation of the format in which the datasets are preserved

Achieving these goals would also pave the way for simpler exchange and interoperability of data between ESA and external operators.

It is well known that data to be preserved for the long term require special attention, which translates into costly operations for their exploitation and maintenance. The challenges include the following:

  • the datasets have to be regularly converted into new media technology, to prevent the problems created by their obsolescence;
  • since the long term archive normally holds datasets only up to a very low processing level, usually called L0, higher level products have to be generated by processing systems;
  • in addition, when data holdings are distributed directly from the archive, it is normal practice to convert or reformat them into a format better suited to end-user needs;
  • it is a common requirement to extract a portion of a single data file from the long term archive (subsetting) to create a "child" product, which may optionally be pre-processed;
  • finally, more and more data are distributed to end-users and exchanged among data holders in electronic format over network infrastructure (private intranets, public internet, academic networks, etc.).
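As a rough illustration of the subsetting requirement listed above, the Python sketch below copies a byte range out of a "parent" file into a "child" file and returns a checksum of the child so that its integrity can be verified after transfer. It assumes the subset is already known as an offset and a length in bytes. In practice, EO subsetting is usually driven by time or geographic criteria and by the record structure of the product, none of which is modelled here. The function name extract_child and its parameters are hypothetical.

```python
import hashlib
from pathlib import Path


def extract_child(parent: Path, child: Path, offset: int, length: int,
                  chunk_size: int = 1 << 20) -> str:
    """Copy `length` bytes starting at `offset` from `parent` into `child`.

    Returns the SHA-256 hex digest of the bytes written to `child`.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")

    digest = hashlib.sha256()
    remaining = length
    with parent.open("rb") as src, child.open("wb") as dst:
        src.seek(offset)
        # Stream in bounded chunks so large archive files need not fit in memory.
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                raise EOFError("parent file is shorter than the requested range")
            dst.write(chunk)
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()
```

A caller could, for example, invoke extract_child(Path("parent.bin"), Path("child.bin"), offset=16, length=32) and store the returned digest alongside the child product.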

One of the reasons for the high operations and maintenance costs of the long term archive is the excessive proliferation of diverse and heterogeneous data formats, which has three main causes:

  • The lack of an agreed standard in the Earth Observation community, which meant that formats tended to be specific to the sensor(s) each mission carried on board.
  • Legacies from old ground segment architectures, which tended not to reuse previously developed elements.
  • The immaturity of the information technologies and standards used to describe and package the data, which prevented the creation of a single format able to satisfy both the requirements for long-term preservation of the data and those for their handling in the processing centres.

Taking all this into account, in early 2004 ESA set up a project called HARM (Historical Archives Rationalization and Management). The HARM project developed the initial version of a new format named Standard Archive Format for Europe (SAFE).

In recent years, with the support of GMV, SAFE has evolved in order to be aligned with other standardisation activities (e.g. in the HMA and OGC contexts) and technology developments (e.g. DFDL) and to take into account the feedback on the first version.