Crowdsourced Seafloor Mapping: Federated Learning at the Edge

The yellow areas show the extent of interpolated data in the Danish Depth Model v2. Less than 23% of the DDM v2 coverage is from modern multibeam and single beam surveys.

Blog post by: Anita Graser (AIT), Niels Bo Nielsen (DGA), Ove Andersen (DGA)

In Danish waters, nearly three quarters of the seafloor has never been mapped to modern standards. Worldwide, the situation is much the same: the most recent “Seabed 2030 Update” reports a global seafloor coverage of only 27.3%. Modern hydrographic surveys demand dedicated vessels sailing dense track lines in controlled patterns. The resulting data is of high quality, but the process is costly and time-consuming, limiting the number of hydrographic surveys that can be completed each year. The result is a persistent shortfall, as hydrographic resources are concentrated on high-priority areas essential for navigational safety, leaving extensive regions unmapped or covered only by older surveys.

Further complicating matters, the seabed is far from static. Currents and waves continually reshape sandy or soft bottoms, while human activities, such as port expansions, channel dredging, and the construction of offshore wind farms, introduce further changes to the marine environment.

The limitations of current survey capacity cannot be addressed by navy vessels alone. Instead, broader participation and innovation are required. The Danish Geodata Agency (DGA) has therefore explored the possibility of leveraging crowdsourced bathymetry (CSB), where ordinary vessels contribute depth measurements during routine operations. This distributed approach offers the potential to increase coverage, keep pace with dynamic seabed changes, and gradually validate the oldest data. It can also serve as a decision-support tool to prioritise re-surveying efforts and, where data quality meets required standards such as IHO S-44, directly contribute to official charting.

To address these challenges, DGA joined the European research project MobiSpaces, funded by the EU’s Horizon Europe programme. Over the last three years, MobiSpaces developed technologies for data governance, analytics, and edge computing across mobility domains, from urban traffic to the maritime environment. DGA collaborated with the Austrian Institute of Technology (AIT) in the “CrowdSeaMapping” use case, which investigates how depth data collected from ordinary vessels can be integrated into modern workflows by leveraging machine learning models to automatically identify errors and anomalies in crowdsourced depth data.

Leveraging the crowd

Perhaps the most widely recognized example of successful crowdsourcing is Wikipedia. In the geospatial domain, projects such as OpenStreetMap have mapped large parts of the world through voluntary contributions, sometimes surpassing the detail available in official national or commercial datasets.

A comparable approach has emerged in the marine sector, which the International Hydrographic Organization (IHO) terms Crowdsourced Bathymetry (CSB). Most modern vessels are already equipped with reliable echo sounders and GNSS systems, yet these measurements are typically used only for real-time navigation and are not stored. The main limitation is the absence of a dedicated system for capturing and transmitting this data. High-speed links such as 4G and 5G are confined to coastal waters, while satellite bandwidth remains costly. Consequently, any practical data collector must be capable of storing raw GNSS and echo sounder data onboard, with deferred transmission when affordable broadband connections become available. To ensure regulatory compliance, the data collector is geofenced to restrict collection to the Danish Exclusive Economic Zone (EEZ).
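To make the store-and-forward idea concrete, here is a minimal Python sketch of such a collector: soundings are buffered to local files, discarded if they fall outside a simplified bounding box standing in for the Danish EEZ, and only released for upload once an affordable link is available. The class and field names are purely illustrative and not taken from the actual data collector.

```python
import json, time
from pathlib import Path

# Simplified bounding box standing in for the Danish EEZ polygon
# (a real collector would test against the official EEZ geometry).
EEZ_BBOX = {"lat_min": 54.4, "lat_max": 58.0, "lon_min": 6.5, "lon_max": 15.3}

def inside_eez(lat: float, lon: float) -> bool:
    """Geofence check: only collect inside the (approximate) Danish EEZ."""
    return (EEZ_BBOX["lat_min"] <= lat <= EEZ_BBOX["lat_max"]
            and EEZ_BBOX["lon_min"] <= lon <= EEZ_BBOX["lon_max"])

class StoreAndForwardLogger:
    """Buffers raw GNSS + echo sounder records on disk and defers upload."""

    def __init__(self, spool_dir: str = "/var/spool/csb"):
        self.spool = Path(spool_dir)
        self.spool.mkdir(parents=True, exist_ok=True)

    def log_sounding(self, lat: float, lon: float, depth_m: float) -> None:
        if not inside_eez(lat, lon):
            return  # outside the geofence: never stored
        record = {"t": time.time(), "lat": lat, "lon": lon, "depth_m": depth_m}
        day_file = self.spool / f"{time.strftime('%Y%m%d')}.jsonl"
        with day_file.open("a") as f:
            f.write(json.dumps(record) + "\n")

    def flush(self, cheap_link_available: bool) -> list[Path]:
        """Return files ready for upload only when an affordable link exists."""
        if not cheap_link_available:
            return []
        return sorted(self.spool.glob("*.jsonl"))
```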

Proposed architecture

Crowdsourcing, however, poses its own problems. First, data collection is not supervised by a qualified surveyor who could ensure optimal system operation. Second, as the crowd grows, so does the amount of data that needs to be transmitted and processed, and data cleaning is labour- and time-consuming. Therefore, DGA and AIT jointly experimented with using federated learning to handle the processing on the data collector itself. This not only reduces the amount of data that needs to be transmitted but also frees up resources, as most of the data cleaning operations are handled directly at the source, i.e. the edge nodes of the federated learning system (Fig 1).


Fig 1: Federated learning data pipeline
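Fig 1 shows the data flow; to illustrate the learning side, the sketch below implements the classic federated averaging step, in which each edge node trains on its own soundings and only the resulting model weights, not the raw depth data, are sent ashore and averaged. The toy model and function names are illustrative assumptions, not the MobiSpaces implementation.

```python
import numpy as np

def local_update(weights: np.ndarray, local_data: np.ndarray,
                 lr: float = 0.01, epochs: int = 1) -> np.ndarray:
    """Stand-in for on-board training against locally collected soundings.
    Here: a few gradient steps of a toy least-squares model."""
    w = weights.copy()
    X, y = local_data[:, :-1], local_data[:, -1]
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(client_weights: list[np.ndarray],
                      client_sizes: list[int]) -> np.ndarray:
    """FedAvg: weight each vessel's update by how much data it contributed."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# One communication round: only weights travel over the (expensive) link.
global_w = np.zeros(3)
rng = np.random.default_rng(0)
clients = [rng.normal(size=(100, 4)) for _ in range(3)]  # synthetic per-vessel data
updates = [local_update(global_w, d) for d in clients]
global_w = federated_average(updates, [len(d) for d in clients])
```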

Raw CSB data is prone to artefacts such as false bottoms from double returns, spikes from aeration or cavitation, offsets from unmodeled draught or tides, and occasional timing or sensor errors. Manual data cleaning would normally be necessary, but – at this scale – it is infeasible, so the CrowdSeaMapping approach proposes that data collectors employ an onboard AI model, MapFed, which continuously models the sea floor and detects anomalies.

MapFed learns the expected depth distribution for each location using an adaptive prototype approach. The model is initialized from an existing bathymetric grid – in this case, the Danish Depth Model – and can be continuously trained with incoming survey and/or CSB data. Measurements that deviate significantly from the learned distribution are flagged locally. Flagged points are then submitted for expert review. Hydrographers assess whether these represent true artefacts to be discarded or valid deviations that should be assimilated into updated bathymetric grids. This workflow creates a feedback loop: domain expertise continuously improves both the anomaly detection model and the underlying bathymetric reference.
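As a rough sketch of what such a prototype-based detector might look like (not the actual MapFed implementation), each grid cell can keep a running depth estimate and spread, initialised from the reference grid and updated online; an incoming sounding is flagged if it deviates by more than a few standard deviations.

```python
import math

class CellPrototype:
    """Running depth estimate for one grid cell, initialised from a
    reference grid (e.g. the Danish Depth Model) and updated online."""

    def __init__(self, ref_depth_m: float, ref_std_m: float = 0.5):
        self.mean = ref_depth_m
        self.var = ref_std_m ** 2
        self.n = 1  # pseudo-count for the reference value

    def is_anomaly(self, depth_m: float, k: float = 3.0) -> bool:
        """Flag a sounding that deviates by more than k standard deviations."""
        return abs(depth_m - self.mean) > k * math.sqrt(self.var)

    def update(self, depth_m: float) -> None:
        """Welford-style online update with an accepted sounding."""
        self.n += 1
        delta = depth_m - self.mean
        self.mean += delta / self.n
        self.var += (delta * (depth_m - self.mean) - self.var) / self.n

# Example: a cell whose reference depth is 18.2 m
cell = CellPrototype(ref_depth_m=18.2)
for d in (18.1, 18.3, 9.0):          # 9.0 m looks like a false bottom
    if cell.is_anomaly(d):
        print(f"flag {d} m for expert review")
    else:
        cell.update(d)
```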

Tests and Results

Several components of the proposed system were evaluated both at sea and in controlled environments. The data collector was installed aboard a vessel and underwent extensive field testing, demonstrating reliable performance under operational conditions. In parallel, the MapFed anomaly detection model and the federated learning setup were validated in a laboratory environment, confirming their technical feasibility and potential for reducing data storage and transfer requirements.

Although full integration of hardware and software components is still required before the system can be considered commercially deployable, the trials have clearly demonstrated the feasibility of the approach and its potential to deliver scalable crowdsourced bathymetry.

Hardware

The data collector was developed by the Danish company Sternula and consists of a hardware interface that listens to a ship’s NMEA network. The NMEA network connects the ship’s sensors, such as the echo sounder and GNSS receiver. The data collector also includes a Raspberry Pi compute module, which is used to run the MapFed model (Fig 2).

 
Fig 2: Edge device installed aboard Dana.
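For readers unfamiliar with NMEA 0183, the minimal sketch below shows the kind of sentences the collector listens for: a $GPGGA position fix from the GNSS receiver and a $SDDPT depth from the echo sounder. The parsing is deliberately simplified and hypothetical; a real collector handles many more sentence types and validates checksums.

```python
def parse_gga(sentence: str) -> tuple[float, float]:
    """Extract latitude/longitude (decimal degrees) from a $..GGA sentence."""
    f = sentence.split(",")
    lat = float(f[2][:2]) + float(f[2][2:]) / 60.0
    lon = float(f[4][:3]) + float(f[4][3:]) / 60.0
    if f[3] == "S": lat = -lat
    if f[5] == "W": lon = -lon
    return lat, lon

def parse_dpt(sentence: str) -> float:
    """Extract depth below transducer (metres) from a $..DPT sentence."""
    f = sentence.split(",")
    return float(f[1])

# Example sentences as they might appear on the ship's NMEA bus
# (checksums are illustrative and not validated here).
gga = "$GPGGA,123519,5732.000,N,01030.000,E,1,08,0.9,5.4,M,42.1,M,,*47"
dpt = "$SDDPT,23.4,0.5*68"
print(parse_gga(gga), parse_dpt(dpt))
```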

First deployment (2023)


Fig 3: The track line of the first sea trial. Background: Danish Depth Model v2.

The system was first tested in the summer of 2023 on board the research vessel Dana IV (operated by DTU Aqua). During this trial, the data collector operated continuously and without failure for 37 days (Fig 3). The collected CSB data proved immediately useful in validating two incoming but conflicting surveys of the Skagerrak. In addition, the results were later incorporated into the second version of the Danish Depth Model (DDM), released in August 2024, demonstrating that cleaned CSB data provided better input than interpolated estimates. While the coverage was limited – approximately 8,500 grid cells (50 × 50 m), corresponding to ~21 km² – this represented a significant first step for DGA.

Second deployment (2024–2025)


Fig 4: Track line from the second sea trial in 2024–2025. Background: Danish Depth Model v2.

A second, longer deployment took place between April 2024 and March 2025. Over this period, the system again proved robust, operating for nearly one year without intervention. During this test, depth data was collected from large areas of the North Sea, which could feed into future versions of the DDM.

Data Assessment

To assess MapFed’s performance, data from this second deployment was manually cleaned by a hydrographic expert and compared against the model’s output. The comparison showed strong agreement between the MapFed model and expert judgment, on over 85% of the records, even though the model had to be trained on the public DDM, which has only 50-meter resolution and depth values of varying reliability, ranging from high-quality multibeam surveys to interpolated estimates from historical lead-line measurements. To avoid propagating uncertainty, interpolated values were excluded during training, which left gaps in areas where no suitable reference data existed. Unsurprisingly, MapFed struggled most in areas with lower DDM reliability, since the absence of reliable training data led to misclassifications.
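For context, the agreement figure can be computed directly from the two sets of labels. The sketch below assumes each record carries a boolean anomaly flag from MapFed and one from the expert, which is an assumption about the working format rather than a description of DGA’s review tooling.

```python
from collections import Counter

def agreement(model_flags: list[bool], expert_flags: list[bool]) -> dict:
    """Compare MapFed anomaly flags against expert flags record by record."""
    counts = Counter(zip(model_flags, expert_flags))
    n = len(model_flags)
    agree = counts[(True, True)] + counts[(False, False)]
    return {
        "agreement": agree / n,                    # fraction of matching labels
        "model_only": counts[(True, False)] / n,   # flagged by MapFed only
        "expert_only": counts[(False, True)] / n,  # flagged by the expert only
    }

# Toy example with ten records
print(agreement([True, False, False, True, False, False, False, True, False, False],
                [True, False, False, False, False, False, False, True, False, False]))
```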

 

Comparison of DDM v1 and DDM v2, incorporating Crowdsourced Bathymetry (CSB) data into the depth model. The track line shows the vessel’s survey path and highlights depth-band differences and model alignment.

Since no comprehensive ground truth dataset exists for Danish waters, validation necessarily depends on hydrographic experts, who review CSB records in the context of additional information beyond the public DDM. Their feedback not only provides a benchmark for assessing anomaly detection but also serves as input to refine both the bathymetric grids and the MapFed model itself. In the long term, these gaps can be filled by incorporating new CSB contributions, and through federated learning the model can be continuously improved as more vessels participate.

Conclusion

The first tests of the dedicated data logger for data collection and the MapFed machine learning model for data analysis demonstrate that crowdsourced bathymetry is not only technically feasible but also operationally valuable. Even limited deployments have already shown how CSB can validate existing surveys, improve national bathymetric models, and highlight discrepancies that would otherwise remain hidden. Across the two real-world tests carried out in the CrowdSeaMapping use case, more than 5.7 million depth points were collected, enough to cover approximately 4.3% of the 50 × 50 m cells in the DDM. This clearly demonstrates that – at sufficient scale – CSB can make a measurable contribution to national bathymetric mapping.

If participation expands to include merchant vessels, fishing boats, and larger yachts, CSB could provide a continuous stream of depth data that keeps pace with the dynamic seafloor. Combined with federated learning and edge processing, this creates a sustainable model for large-scale data integration. Rather than replacing national hydrographic surveys, CSB augments them, serving as a decision-support tool for prioritizing re-surveying and, in the future if quality thresholds are met, perhaps as an input to nautical charts.

With sufficient adoption and coordination, CSB has the potential to significantly accelerate progress toward global initiatives such as Seabed 2030, while also giving hydrographic offices a practical way to keep bathymetric reference data up to date. It represents not just a method to close mapping gaps, but a shift toward a more dynamic, participatory, and data-rich approach to understanding and managing the marine environment.

This work was carried out as part of the EU Horizon Europe project MobiSpaces (Grant Agreement No. 101070279).

Contact

Danish Hydrographic Office

Email: soe_policy@gst.dk