PV-CDRR:
Proba-V Cloud Detection Round Robin

  
Motivation

The need for an improved cloud detection method for Proba-V products was raised as early as the first Proba-V Quality Working Group Meeting, held at ESTEC in March 2015. It was further underlined during the last Proba-V Symposium, held in Ghent in January 2016, as one of the major issues to be addressed for improving data quality. Several presenters at the Symposium reported under-detection of clouds with the current algorithm, with semi-transparent clouds representing a major concern both for land cover applications and for surface property retrieval.

To address this need, ESA, in collaboration with BELSPO, decided to organize a Round Robin exercise on Proba-V cloud detection. The Round Robin will be open to any interested user, with the goal of providing recommendations for the definition of the future operational processor baseline.

Objectives
The objectives of the Round Robin exercise are:
  • To inter-compare different cloud screening methodologies for Proba-V and learn about their advantages and drawbacks under various cloud and surface conditions.
  • To provide final recommendations to ESA on the best candidate for implementation in the operational processing chain.
  • To review and consolidate user requirements within the Proba-V community on cloud clearing and decide on the trade-off between under-detection of clouds and availability of clear pixels.
  • To collect lessons learnt on cloud detection in the VNIR and SWIR domains for land and coastal water remote sensing and reuse them in the frame of Sentinel-2 and Sentinel-3 cloud detection.
  • To increase awareness of the Proba-V mission by inviting new scientific teams to the Round Robin exercise and organising a final workshop on the project results.
Definitions
Validation Dataset: The "truth" data used to assess the quality of the different cloud detection algorithms. It consists of a relatively large ensemble of pixels (several tens of thousands) manually classified by visual inspection of the reference images. The dataset needs to be global and representative of different environmental conditions (cloud and surface types, different seasons). It will be kept in a "vault" and used only at the end of the Round Robin exercise to perform the final quality assessment.

Test Dataset: A representative sample of the Validation Dataset, provided to the Round Robin participants as an indicator of our pixel classification criteria and definitions. It is a subset, randomly extracted from the Validation Dataset, in which all the relevant pixel classes are adequately represented.

Training Dataset: A statistically significant ensemble of pixels used by the algorithm providers to train and "calibrate" their methods. Ideally the training dataset should be much larger than the validation dataset, so that it includes a sufficient ensemble of cases for training the algorithm's prediction capability. Selecting a suitable training dataset is the responsibility of the algorithm providers.

 

Round Robin setup

A summary table providing the main requirements and settings for the Round Robin exercise is given below:

 

Requirements and Settings
General Round Robin Settings
  • Provision of input reference scenes and test dataset to the algorithm providers
  • Validation Dataset hidden from the participants; it will be used at the end to perform the final Quality Assessment
  • No provision of a training dataset; each participant will have the responsibility to develop, train and test their own algorithm
  • No provision of auxiliary data (land cover maps, surface reflectance climatology)
Algorithm Requirements
  • Any type of algorithm or method is accepted
  • The participant must be the owner and developer of the algorithm
  • The algorithm's computing performance must be suitable for implementation in the Ground Segment (detailed measures will be provided in the Protocols)
  • Accepted flags: cloudy, clear, semi-transparent clouds
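The exact encoding of the three accepted flags will be defined in the Protocols; as a purely hypothetical sketch, a participant's per-pixel output could be represented as an integer-valued mask:

```python
from enum import IntEnum

class CloudFlag(IntEnum):
    """Hypothetical encoding of the three accepted Round Robin flags.

    The actual numeric values are an assumption for illustration only.
    """
    CLEAR = 0
    SEMI_TRANSPARENT = 1
    CLOUDY = 2

# A delivered cloud mask would then be a per-pixel array of these values.
mask = [CloudFlag.CLEAR, CloudFlag.CLOUDY, CloudFlag.SEMI_TRANSPARENT]
print([f.name for f in mask])  # ['CLEAR', 'CLOUDY', 'SEMI_TRANSPARENT']
```

Using an integer enumeration keeps the mask compact while leaving the class labels self-describing.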
Input Reference Scenes
  • Global dataset covering 4 days (one per season)
  • Level 2A Proba-V products: TOA reflectance in the 4 bands (Blue, Red, NIR, SWIR), projected on a Plate Carrée grid
  • Spatial resolution: 333m
Validation Dataset
  • Statistically significant database (more than 20,000 pixels)
  • Visually classified and cross-checked with the PixBox tool
  • 30% thick clouds, 30% semi-transparent clouds, 40% clear
  • Of the clear pixels: 70% land, 15% snow/ice, 15% water
  • Globally spread covering the four seasons
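The stated proportions translate directly into per-class pixel counts. A minimal sketch, assuming the minimum dataset size of 20,000 pixels (the final dataset may be larger):

```python
total = 20_000  # minimum stated size of the Validation Dataset

# Top-level class proportions from the requirements (integer arithmetic
# avoids floating-point rounding surprises with fractions like 0.15)
thick_clouds = total * 30 // 100      # 30% thick clouds
semi_transparent = total * 30 // 100  # 30% semi-transparent clouds
clear = total * 40 // 100             # 40% clear

# The clear class is further split by surface type
clear_land = clear * 70 // 100        # 70% of clear pixels are land
clear_snow_ice = clear * 15 // 100    # 15% snow/ice
clear_water = clear * 15 // 100       # 15% water

assert thick_clouds + semi_transparent + clear == total
print(thick_clouds, semi_transparent, clear_land, clear_snow_ice, clear_water)
# -> 6000 6000 5600 1200 1200
```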
Test Dataset
  • Randomly extracted from Validation Dataset
  • Representative of all classes and conditions
  • Pixel classification visible to the participants
Quality Assessment Metrics
  • Use the manual pixel classification as reference truth and standard statistical metrics, i.e., confusion matrices and associated QA indices: user's accuracy (UA), producer's accuracy (PA), Scott's pi, Cohen's kappa and Krippendorff's alpha coefficients
  • Use visual comparison of selected images to investigate performance in cloud structure delineation and to assess critical cases (thin, patchy, or cirrus clouds)
  • (Optionally) Generate Level 3 daily composites to investigate the impact of undetected clouds on synthesis images (e.g., spatial noise, residual cloud contamination)
  • (Optionally) Use cloud cover information extracted from synoptic meteorological data with standard quality metrics (probability of detection, POD; false alarm ratio, FAR)
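To make the standard metrics concrete, here is a minimal plain-Python sketch (with small hypothetical label lists, not Round Robin data) of how a confusion matrix, UA/PA, and Cohen's kappa can be computed from the manual classification and a candidate cloud mask:

```python
from collections import Counter

def confusion_matrix(truth, predicted, classes):
    """Counts of (truth, predicted) pairs as a nested dict."""
    pairs = Counter(zip(truth, predicted))
    return {t: {p: pairs[(t, p)] for p in classes} for t in classes}

def kappa(cm, classes):
    """Cohen's kappa from a confusion matrix in nested-dict form."""
    n = sum(cm[t][p] for t in classes for p in classes)
    po = sum(cm[c][c] for c in classes) / n  # observed agreement
    # expected chance agreement from the row/column marginals
    pe = sum(
        sum(cm[c][p] for p in classes) * sum(cm[t][c] for t in classes)
        for c in classes
    ) / n**2
    return (po - pe) / (1 - pe)

classes = ["clear", "semi", "cloudy"]
truth     = ["clear", "clear", "cloudy", "semi", "cloudy", "clear"]
predicted = ["clear", "cloudy", "cloudy", "semi", "cloudy", "clear"]
cm = confusion_matrix(truth, predicted, classes)

# Producer's accuracy (PA): fraction of each true class correctly detected.
# User's accuracy (UA): fraction of each predicted class that is correct.
for c in classes:
    row = sum(cm[c].values())             # pixels whose truth is c
    col = sum(cm[t][c] for t in classes)  # pixels predicted as c
    pa = cm[c][c] / row if row else float("nan")
    ua = cm[c][c] / col if col else float("nan")
    print(f"{c}: PA={pa:.2f} UA={ua:.2f}")

print(f"kappa = {kappa(cm, classes):.2f}")
```

In the binary cloud/no-cloud case, POD corresponds to the PA of the cloud class, and FAR to 1 minus its UA.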
Schedule

The Proba-V Cloud Detection Round Robin will be performed according to this plan.

 

Milestone  Deadline
Registration to the Round Robin 13 Apr 2016
Delivery of Round Robin Protocols 30 May 2016
Preparation of the Validation and Test Dataset 30 Jun 2016
Delivery of Round Robin input data (reference images, Test Dataset) 15 Jun 2016
Delivery of Round Robin output data (cloud masks, ATBD) 1 Nov 2016
Quality Assessment Report 15 Jan 2017
Final Workshop at ESRIN 1 Mar 2017

 

A fixed-price reward is allocated to each algorithm provider. The reward will be granted upon provision of the following deliverables:

  • Cloud masks for the reference images (4 full days of Proba-V global data)
  • High-level Algorithm Theoretical Basis Document (ATBD) of the algorithm used
  • Short Technical Note on the computing resources required by the algorithm

The Quality Assessment will be performed by Brockmann Consult on the Validation Dataset, and a final report will be prepared and reviewed with the Round Robin participants.

Results will be further discussed during a dedicated one-day workshop at ESRIN.

A final peer-reviewed paper summarizing the results of the inter-comparison will be prepared. Co-authorship of the paper will be granted to the Round Robin participants.