DIPAAL: DIstributed PostgreSQL-based AIS Analytics and Loading


This thesis by Mikael Vind Mikkelsen was published in January 2023 at Aalborg University.

Abstract:

AIS data show promise for analytical purposes, but as the data are not intended for analysis, they need to be cleaned, processed, and stored before being usable. This paper presents an extension of DIPAAL, a system consisting of an efficient and modular ETL process for loading AIS data, as well as a distributed data warehouse storing the trajectories of ships. A spatially distributed data warehouse, with granularized cell and heatmap representations, is designed, developed, and evaluated. At the time of writing, DIPAAL stores 414 million kilometres of ship trajectories and more than 10 billion rows in the largest relation. It is found that the introduced granularized cell representation resolved the out-of-memory errors of previous work, while improving runtime by up to 324% compared to a trajectory-based query. It is also found that the spatially divided shards enable a consistently good scale-up for both cell and heatmap analytics in large areas, ranging from 354% to 1164% with a 5x increase in workers. Lastly, it is found that the spatial divisions become slightly skewed over time, as traffic patterns evolve.
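
To make the idea of a cell and heatmap representation concrete, here is a minimal sketch in Python. It is not DIPAAL's implementation: the abstract does not give the cell size, the coordinate system, or the aggregation used, so the 1000 m cell edge, the projected metre coordinates, and the names `cell_of` and `heatmap` are assumptions made for illustration only. The sketch bins the sampled points of each trajectory into square grid cells and counts how many distinct trajectories touch each cell.

```python
from collections import Counter
from math import floor

CELL_SIZE_M = 1000  # hypothetical cell edge length in metres (assumption)


def cell_of(x, y, cell_size=CELL_SIZE_M):
    """Map a projected coordinate (in metres) to integer grid-cell indices."""
    return (floor(x / cell_size), floor(y / cell_size))


def heatmap(trajectories, cell_size=CELL_SIZE_M):
    """Count, per grid cell, the number of distinct trajectories with a point in it."""
    counts = Counter()
    for points in trajectories:
        # A set ensures each trajectory contributes at most once per cell.
        visited = {cell_of(x, y, cell_size) for x, y in points}
        counts.update(visited)
    return counts


# Two toy trajectories given as lists of (x, y) points in metres.
trajectories = [
    [(100.0, 200.0), (900.0, 950.0), (1500.0, 400.0)],
    [(1200.0, 100.0), (1800.0, 900.0)],
]
result = heatmap(trajectories)
```

Only the sampled points are binned here; cells crossed by the segment between two consecutive points are not counted, which a real system would have to handle.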


