Implementation

Highly efficient Earth Observation services

Edge SpAIce's ambition


Edge SpAIce aims to demonstrate an innovative technology that paves the way towards highly efficient Earth Observation (EO) services by deploying AI analytics on the satellite itself, enabling efficient data valorisation and a reduced downlink data load and latency. Furthermore, it aims to do so with a European edge-AI computing device.

Towards this end, Edge SpAIce will develop a highly accurate use-case demonstrator DNN for efficient marine plastic litter detection in hyperspectral satellite images, based on a publicly available database (see Innovation 3 below).

Edge SpAIce will then improve the capabilities of AGS's proprietary DNN distillation tool (ODiToo), allowing at least a 50x size reduction of a DNN with minimal accuracy loss for deployment on the SoC FPGAs used on satellites.

Next, Edge SpAIce will extend the HLS4ML tool to support deployment of the developed DNN to a European FPGA (NanoXplore) and to the target satellite FPGA (Xilinx).

Edge SpAIce will thus demonstrate highly competitive performance on a system deployed for in-orbit remote monitoring of plastic litter in the marine environment, showing clear technological advances beyond the state of the art, and will achieve TRL 6 for a European-based edge-AI FPGA application.

Successfully downsizing the DNN architecture while demonstrating accurate detection capabilities will be a challenge throughout the implementation of the project.

At the same time, it sets an ambitious goal for Edge SpAIce: to demonstrate a highly effective pipeline for edge-AI system deployments.

The innovation builds on the complementary competencies of all consortium partners; reaching this aspirational goal, which nobody has achieved before, requires genuine, close cooperation.

To start, the National Technical University of Athens will leverage the MARIDA database as well as other publicly accessible datasets to create and eventually deliver a highly accurate and efficient DNN that surpasses the resolution limitations of state-of-the-art marine plastic litter detection systems.

Building on this, Edge SpAIce will construct novel multispectral datasets by training the developed DNN on Sentinel-2 images transformed to represent the target satellite camera, while progressively switching to images from the consortium's actual target satellite, BALKAN-01.

The datasets developed through Edge SpAIce will thus benefit from a very high-resolution sensor with 1.5 m GSD, its bands aligned to those of the Sentinel-2 satellites in the visible plus NIR domain (400-900 nm). This band alignment allows training datasets to be interchanged between Sentinel-2 and the target satellite with little effort, so the project can build on research results already achieved with Sentinel-2.
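The band-alignment idea can be sketched as follows. This is an illustrative sketch only: the Sentinel-2 band centres below are approximate public values, and the target-camera band list is a hypothetical placeholder, not the actual BALKAN-01 band set.

```python
# Match each target-camera band to the closest Sentinel-2 band by centre
# wavelength, so training data can be interchanged between the two sensors.
# Band centres (nm) are approximate; the target bands are illustrative.

SENTINEL2_BANDS = {"B2": 490, "B3": 560, "B4": 665, "B8": 842}  # VIS + NIR

def align_bands(target_bands):
    """Return {target_band: sentinel2_band} chosen by nearest centre wavelength."""
    mapping = {}
    for name, centre in target_bands.items():
        best = min(SENTINEL2_BANDS, key=lambda b: abs(SENTINEL2_BANDS[b] - centre))
        mapping[name] = best
    return mapping

# Hypothetical target-camera bands in the 400-900 nm range
target = {"blue": 485, "green": 555, "red": 660, "nir": 850}
print(align_bands(target))  # {'blue': 'B2', 'green': 'B3', 'red': 'B4', 'nir': 'B8'}
```

In practice the alignment also involves spectral response resampling and radiometric harmonisation, not just nearest-band matching.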

To further increase DNN precision, considerable effort will go into constructing new multispectral datasets with the target multispectral camera, along with dense annotations indicating litter and other competing sea-surface features.

Last but not least, Edge SpAIce will address the complexity of geometric, radiometric and atmospheric corrections, compensating for these effects in the raw camera images to improve plastics detection capability.

Finally, EnduroSat will dedicate a small share of the capacity of its BALKAN-01 satellite, a 16U nanosatellite developed under the BALKAN constellation programme (with intended support from ESA/ESRIN) for monitoring ship activities in deep seas.

Since the satellite will be equipped with a Xilinx FPGA and edge-AI computing software, it provides the perfect opportunity to demonstrate new edge-AI applications in space that require a very high-resolution multispectral camera and edge computing.

Furthermore, EnduroSat will leverage its expertise to demonstrate the feasibility of an efficient marine litter detection service operating with a capture -> process -> downlink pipeline, instead of the state-of-the-art capture -> downlink -> process approach.

This involves developing software for selecting specific regions for plastic monitoring; systems for planning satellite payload and power usage; capturing, analysing with edge AI, and storing specific processing outputs onboard; and downlinking and unpacking the data on the ground at the corresponding geo-coordinates.
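The onboard capture -> edge-AI processing -> selective downlink flow can be sketched as below. All function names and data formats here are hypothetical placeholders, not the project's actual flight software.

```python
# Toy sketch of the onboard pipeline: only the compact detection product
# (not the full image) leaves the satellite, tagged with geo-coordinates.

def capture(region):
    # Placeholder for the camera acquisition over a selected region;
    # "pixels" stands in for per-pixel litter scores from the sensor + DNN.
    return {"region": region, "pixels": [[0.1, 0.8], [0.2, 0.05]]}

def detect_litter(image, threshold=0.5):
    # Stand-in for the onboard DNN output filtering: keep flagged pixels only.
    hits = []
    for row, line in enumerate(image["pixels"]):
        for col, score in enumerate(line):
            if score >= threshold:
                hits.append({"row": row, "col": col, "score": score})
    return hits

def downlink(image, detections):
    # Only the detection product is downlinked, at the capture coordinates.
    return {"region": image["region"], "detections": detections}

img = capture((42.0, 18.5))
packet = downlink(img, detect_litter(img))
print(packet["region"], len(packet["detections"]))  # (42.0, 18.5) 1
```

The point of the sketch is the data-volume asymmetry: the downlinked packet is a few detections plus coordinates rather than a multi-megapixel image.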

While the Edge SpAIce application will be showcased on just one satellite during this project, EnduroSat plans to deploy a constellation of approximately 100 satellites in sun-synchronous LEO orbits, each equipped to execute such applications. These new-space satellites come at a significantly lower cost than those designed for long-term institutional missions.

 

 

Endurosat 16U platform
First Step

EDGE-AI SYSTEM DEVELOPMENT AND OPTIMIZATION

National Technical University of Athens

In the first phase of the project, NTUA aims at leveraging the MARIDA database as well as other publicly accessible datasets to create and eventually deliver the most accurate and efficient Deep Neural Network (DNN) surpassing current resolution limitations on State-of-the-Art marine plastic litter detection systems. To do so, NTUA will implement all required software modules for dataset spectral alignment and harmonization. The employed dataset will have a global geographical distribution in order to tackle generalization and scalability aspects. 

 

The goal is to deliver a set of cutting-edge detection solutions that can discriminate marine plastic litter against a number of competing sea surface features and water-related classes (e.g., waves, wakes, ships), rather than producing only binary classification maps, i.e., litter or not litter. In the framework of this task, high-end DNN architectures will be employed, targeting the highest detection performance on GPU clusters rather than lightweight models for edge AI. A first version of this DNN, called here the reference-DNN, will be developed to allow Agenium Space and CERN to work on their respective software improvements for ODiToo and HLS4ML.

Agenium Space

Agenium Space is an experienced company providing tools to support development of novel Earth Observation services through Deep Learning.

Within the Edge SpAIce implementation activities, Agenium Space aims to expand the capabilities of its proprietary ODiToo software to support distillation and architecture optimization of large DNNs, enabling deployment of highly complex AI technologies on EO satellites.

Agenium Space aims to relax distillation's hard requirement for the full training database by generating matching samples directly from the input neural network. After reviewing the methods and approaches available in the scientific literature, Agenium Space intends to use state-of-the-art methods for natural pre-image estimation in the EO context, which typically invert the function embodied in the DNN.
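A minimal illustration of the pre-image idea: recover an input that reproduces a given network output by gradient descent in input space. A toy one-parameter "network" and finite-difference gradients stand in here for the real DNN and automatic differentiation.

```python
import math

def network(x):
    # Toy stand-in for the teacher DNN: one fixed weight, tanh activation.
    return math.tanh(1.7 * x)

def estimate_preimage(target, x=0.0, lr=0.5, steps=200, eps=1e-5):
    # Descend the squared output error with respect to the INPUT, not the
    # weights: this is the essence of pre-image (network inversion) methods.
    for _ in range(steps):
        loss = (network(x) - target) ** 2
        grad = ((network(x + eps) - target) ** 2 - loss) / eps  # finite diff.
        x -= lr * grad
    return x

x_hat = estimate_preimage(target=network(0.8))
print(round(x_hat, 3))  # converges towards 0.8
```

For a real DNN the same loop runs over image tensors with autograd, usually with extra "naturalness" priors so the recovered samples resemble plausible EO imagery.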

After distillation, the final DNN will be shrunk to a roughly 1M-parameter architecture with less than 6% accuracy lost over the process (3% lost due to the 60-fold parameter count reduction, and 3% lost due to quantization to 1-byte weight resolution).

An edge-DNN transformed in this way can process 172k pixels/sec/watt when deployed to, e.g., an Intel Myriad 2 VPU, and 176k pixels/sec/watt on, e.g., a Zynq UltraScale+. At a 10 W power budget, the latter FPGA would thus process a 6 Mpx image in roughly 3.4 seconds.
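As a quick sanity check, the per-image processing time follows directly from the quoted per-watt throughput and the power budget:

```python
# Scale the quoted pixels/sec/watt figure by the power budget to get total
# throughput, then divide the image size by it to get time per image.

PIX_PER_SEC_PER_WATT = 176_000   # figure quoted for the Zynq UltraScale+
POWER_W = 10                     # quoted power consumption
IMAGE_PIXELS = 6_000_000         # 6 Mpx image

throughput = PIX_PER_SEC_PER_WATT * POWER_W      # pixels per second
seconds_per_image = IMAGE_PIXELS / throughput
print(round(seconds_per_image, 2))  # 3.41
```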

To do so, AGS will add to its ODiToo the following functionalities:

  • DNN Quantization Aware Training (QAT) capabilities – this will allow moving from static to dynamic quantization as noted in the SoA (refer to SoA in I1). The current approach performs distillation first and quantization afterwards (called static quantization), which results in a noticeable loss (3% as noted in the SoA) of DNN performance due to the quantization process. Although this process uses a calibration approach to minimize the discrepancy between the quantized activations and the real ones, the measured performance loss is 1% to 3%. Therefore, while a student DNN is being trained from a teacher (source) DNN during distillation, the student DNN's computing-type restriction will be added, which is expected to result in lower quantization loss (called dynamic quantization during teaching). In Edge SpAIce this addition will reduce the loss of DNN accuracy during the distillation process.
  • Hybrid distillation without the full source training dataset – this will allow distillation to be performed without access to the full training dataset. As explained in the SoA on distillation (I1), the process uses teacher and student models and requires the full dataset the teacher was trained with, which is often unavailable. GANs can help here by creating synthetic datasets to train the student model from the few training samples provided. This enhances training quality, especially when source data is difficult to acquire, such as all possible forms of marine litter in the ocean. In Edge SpAIce this addition will help to train the DNN to a higher quality.

 

  • QONNX standard support addition – this will ensure smooth exchange of DNN model data between ODiToo and HLS4ML, and thus, within Edge SpAIce, will unlock the use of HLS4ML for deploying distilled DNNs on the target FPGAs.
  • Sub-byte resolution capability – this will allow a more optimal trade-off between architecture size and number of parameters to be selected. Sub-byte precision is rarely supported; e.g., VITIS AI lacks this feature, which is very relevant for edge-AI applications in space. Using fewer bits per computation allows more logic cells to be dedicated to the structure of the network, and potentially allows bigger networks on the same-size FPGA: a fully unrolled DNN using 8-bit integers takes about as much logic as a DNN of almost double the size using 4-bit integers. For Edge SpAIce this addition will allow larger DNNs to be selected, with weights below 8 bits.
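The quantization steps above can be sketched with a single fake-quantization helper: in QAT, the student's weights pass through such a rounding step in the forward pass so the DNN learns under the precision it will have on the FPGA, and the same helper covers the sub-byte widths down to binary. The function and values are illustrative, not ODiToo's actual implementation.

```python
# Symmetric uniform fake-quantization of a weight to n bits, as used in
# Quantization Aware Training: round to the n-bit grid, return the
# dequantized value so downstream arithmetic stays in floating point.

def fake_quantize(w, bits, w_max=1.0):
    """Quantize weight w to `bits` bits on a symmetric grid in [-w_max, w_max]."""
    levels = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits, 7 for 4 bits
    if levels == 0:                          # 1 bit: binary weights {-w_max, +w_max}
        return w_max if w >= 0 else -w_max
    step = w_max / levels
    q = round(max(-w_max, min(w_max, w)) / step)
    return q * step

w = 0.337
for bits in (8, 4, 1):
    print(bits, round(fake_quantize(w, bits), 4))  # 8 0.3386 / 4 0.2857 / 1 1.0
```

The quantization error visibly grows as the bit width shrinks, which is why the per-bit-width accuracy/size trade-off study (8 bits down to binary) is needed.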

 

These functionalities will be exploited by Agenium Space to define and develop quantized distillation at the different precisions targeted: 8, 7, 6, 5, 4 and 3 bits, and binary (WP3). These methodological improvements will come from literature review and extensive benchmarks.

CERN

CERN has developed the open-source HLS4ML tool to unroll basic DNNs onto FPGAs.

CERN will familiarize itself with the documentation of Xilinx and of the European FPGA manufacturer chosen by EDS in WP6, and will develop efficient VHDL code to enable unrolling DNNs on those architectures.

CERN will develop High Level Synthesis (HLS) implementations of dataflow NN inference using a suitable framework capable of targeting the target FPGA. This framework could be Catapult HLS (Siemens), PandA Bambu HLS (open source), or another as yet unidentified tool. Exploration of both of these toolchains for HLS4ML has started, but has not yet matured to the capability required for large, sophisticated image segmentation models operating on large images.

To support CERN, Agenium Space will elaborate future edge-AI computing needs and the corresponding requirements for SW tools (routing, synthesis, etc.), define hardware requirements to support operational needs (logic cell architecture, radiation tolerance, operating temperature, etc.), and look for the best-fitting European FPGA solutions.

Currently Europe has only a single company manufacturing FPGAs for space, NanoXplore, and it offers only a single SoC-FPGA, named NG-ULTRA.

Resource reuse is a tuning handle of HLS4ML that controls how many times one FPGA resource is used in the key matrix products of DNN inference. A lower reuse factor enables lower inference latency at the cost of more resources, while a higher reuse factor saves resources at the cost of higher latency. This parameter can be tuned for each layer individually, and has the largest impact on latency for the NN layers that perform the most computation (typically the first and last for an image segmentation model).
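A back-of-envelope model of this trade-off, with illustrative numbers rather than real synthesis results: a layer with M multiplications and reuse factor R instantiates roughly M/R parallel multipliers and needs roughly R clock cycles per input.

```python
# Toy cost model for the HLS4ML reuse factor: resources scale as 1/R,
# latency scales as R. Numbers are illustrative, not from a synthesis run.

def layer_cost(mults, reuse_factor, clock_mhz=200):
    multipliers = mults // reuse_factor           # parallel multiplier units
    latency_ns = reuse_factor * 1000 / clock_mhz  # ~R cycles at clock_mhz
    return multipliers, latency_ns

for rf in (1, 8, 64):
    m, lat = layer_cost(mults=4096, reuse_factor=rf)
    print(f"reuse={rf:3d} multipliers={m:5d} latency={lat:6.1f} ns")
```

Tuning R per layer lets the heaviest layers take the smallest R that still fits the device, while lightweight layers can share resources aggressively.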

Dataflow FIFOs are the memories inserted between NN layer compute sub-units when using the streaming image mode. The size of these memories can be optimized by simulating the FPGA gateware in a clock-cycle-accurate simulator and monitoring the maximum occupancy.

The NN can then be resynthesized, setting the dataflow FIFO memory sizes to the maximum observed in the simulation, and hence reducing the resource usage.
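The sizing procedure can be sketched as follows, with a toy producer/consumer trace standing in for the clock-cycle-accurate RTL simulation:

```python
# Record the peak occupancy of an inter-layer FIFO over a simulated run;
# the FIFO is then resynthesized with exactly that depth, saving memory.

def peak_occupancy(produced, consumed, cycles):
    """produced/consumed give per-cycle rates (repeating patterns)."""
    occupancy, peak = 0, 0
    for c in range(cycles):
        occupancy += produced[c % len(produced)]
        peak = max(peak, occupancy)                    # depth needed this cycle
        occupancy -= min(occupancy, consumed[c % len(consumed)])
    return peak

# Layer A emits a burst of 4 values every other cycle; layer B drains 2/cycle.
depth = peak_occupancy(produced=[4, 0], consumed=[2], cycles=100)
print(depth)  # 4 -> safe FIFO depth observed in simulation
```

Without this measurement, HLS tools typically allocate conservative default depths, wasting block RAM that could hold NN parameters instead.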

CERN will use an image tiling technique to split the input image into smaller sections (tiles) and perform NN inference on these tiles separately. This may allow larger models to be deployed, since the input image size consumes FPGA resources (notably memory) that could otherwise be used for NN parameters.
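A minimal sketch of the tiling step (pure Python with zero-padded borders; the real implementation operates on the FPGA datapath):

```python
# Split an H x W image (nested lists) into fixed-size tiles, padding the
# right/bottom borders with zeros so every tile has the same shape.

def tile_image(image, tile_h, tile_w, pad_value=0):
    h, w = len(image), len(image[0])
    tiles = []
    for top in range(0, h, tile_h):
        for left in range(0, w, tile_w):
            tile = [[image[r][c] if r < h and c < w else pad_value
                     for c in range(left, left + tile_w)]
                    for r in range(top, top + tile_h)]
            tiles.append(tile)
    return tiles

image = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 test image
tiles = tile_image(image, 2, 2)
print(len(tiles), tiles[0])  # 4 [[0, 1], [4, 5]]
```

In practice tiles usually overlap by the network's receptive field so that detections near tile borders are not lost when the per-tile outputs are stitched back together.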

CERN will also adopt a pixel streaming technique in HLS4ML, sending all channels of one image pixel into the data stream simultaneously. This enables a fast dataflow, since multiple data are propagated simultaneously; however, it may inhibit deployment of large models, since all these data must also be processed simultaneously, consuming large FPGA resources. Streaming each channel into the data stream serially would slow down the dataflow but require fewer parallel resources for computation, which may increase the model size that can be deployed.
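The two streaming options can be illustrated as follows; the beat counts make the latency/resource trade-off concrete (channel-parallel needs C-wide processing logic but finishes in 1/C of the beats):

```python
# Channel-parallel: one stream beat carries the full channel vector of a
# pixel. Channel-serial: one beat carries a single channel value.

def stream_parallel(pixels):
    return [tuple(px) for px in pixels]          # 1 beat per pixel, C values wide

def stream_serial(pixels):
    return [ch for px in pixels for ch in px]    # C beats per pixel, 1 value wide

pixels = [(10, 20, 30), (11, 21, 31)]            # two 3-channel pixels
print(len(stream_parallel(pixels)))  # 2 beats
print(len(stream_serial(pixels)))    # 6 beats
```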

Current HLS4ML designs deploy the NN as one monolithic computational unit (or IP), with data streaming between NN layer sub-units within. For very large models, the synthesis of these large units can yield suboptimal results, for example due to propagation of state-machine and control signals across the entire computational unit, lowering the achievable maximum clock frequency. By breaking the monolithic design into individual IPs for each sub-unit (or a small number of sub-units together), the control signals may be more localised, enabling a higher eventual maximum clock frequency and hence higher processing throughput. A second effect is that the synthesis time (the time between producing a Neural Network and producing the artifacts to deploy it in an FPGA) may be reduced, shortening the turnaround time to deploy a new NN.