PhD Defence by Hao Miao
Aspects of Deep Spatio-Temporal Analytics for Time Series, Streaming, and Trajectory Data
Aalborg University
Room 02.13, Selma Lagerløfs Vej 300
6 December 2024, 14.00-17.00
Language: English
Organizer:
Dept. of Computer Science
Aalborg University
Abstract
The widespread deployment of wireless and mobile devices has led to a proliferation of spatio-temporal data (e.g., time series, streaming, and trajectory data). Such data is essential for applications such as traffic and air quality prediction, where spatio-temporal analytics plays a key role in ensuring safety, predictability, and reliability. While recent studies have demonstrated superior performance in deep spatio-temporal analytics, many approaches struggle to adapt to real-world conditions. In particular, existing methods suffer from three limitations: 1) they typically require large-scale training data, incurring high storage and computational costs; 2) when applied to streaming data, many models suffer from catastrophic forgetting, so prediction accuracy deteriorates over time; and 3) they often assume centralized data, which raises privacy concerns and fails to exploit decentralized data processing.
This Ph.D. study systematically investigates deep spatio-temporal analytics with emerging techniques. Specifically, we target four types of functionality: time series dataset condensation, streaming spatio-temporal prediction, federated trajectory recovery, and federated trajectory similarity learning.
First, we address the problem of time series dataset condensation. The goal is to reduce training costs by summarizing large datasets into smaller, synthetic datasets that can then be used for training instead. We introduce TimeDC, an efficient time series dataset condensation framework that uses two-fold modal matching. TimeDC encompasses a time series feature extraction module for effective feature learning, a decomposition-driven frequency matching module for preserving temporal correlations, and a curriculum-based trajectory matching module for ensuring that the synthetic datasets capture key patterns in the original dataset.
Second, we investigate spatio-temporal prediction on streaming data. We propose URCL, a unified replay-based streaming framework with three key modules: data integration, spatio-temporal continuous representation learning, and spatio-temporal prediction. Specifically, a spatio-temporal mixup mechanism is introduced to alleviate catastrophic forgetting, and a simple siamese network is designed to facilitate holistic feature learning.
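To make the idea of mixing replayed and newly arrived data concrete, the following is a minimal sketch of generic mixup applied to two flattened observation windows. It is not URCL's actual spatio-temporal mixup, whose exact formulation the abstract does not give; the function name `mixup`, the `alpha` parameter, and the flat-list data layout are assumptions made for illustration.

```python
import random


def mixup(current, replayed, alpha=0.5, rng=None):
    """Blend a current observation with a replayed (historical) one.

    Hypothetical helper: generic mixup, where the mixing weight `lam`
    is drawn from a Beta(alpha, alpha) distribution. Both inputs are
    assumed to be equal-length lists of floats.
    """
    if len(current) != len(replayed):
        raise ValueError("observations must have the same length")
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    # Convex combination, element by element.
    mixed = [lam * c + (1.0 - lam) * r for c, r in zip(current, replayed)]
    return mixed, lam
```

Training on such blended samples exposes the model to old and new patterns at once, which is the general intuition behind replay-based remedies for catastrophic forgetting.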
Third, we study the problem of federated trajectory recovery, focusing on privacy preservation and decentralized training. We propose LighTR+, a horizontal federated framework that consists of a lightweight local trajectory embedding module, an intra-domain knowledge distillation module, and a meta-knowledge enhanced local-global training scheme. LighTR+ alleviates intra- and inter-domain interference across clients while reducing communication costs between clients and the server, thereby facilitating privacy protection and improving efficiency.
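As background on horizontal federated training, the sketch below shows a generic FedAvg-style aggregation step in which the server averages client parameter vectors weighted by local dataset size. This is a standard baseline, not the meta-knowledge enhanced scheme of the framework described above; `federated_average` and its arguments are hypothetical names.

```python
def federated_average(client_params, client_sizes):
    """Weighted average of client parameter vectors (FedAvg-style).

    Hypothetical helper: each client sends a flat list of floats and
    the number of local training samples; raw trajectories never
    leave the client.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share one parameter shape")
        weight = size / total  # clients with more data count more
        for i, value in enumerate(params):
            averaged[i] += weight * value
    return averaged
```

Only model parameters cross the network in this setup, which is why smaller local models translate directly into lower communication costs.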
Fourth, we explore federated trajectory similarity learning for decentralized data processing. We propose the F-TSL framework, which builds on horizontal federated learning with a server-client architecture. The framework includes a local trajectory preprocessing and learning module for clients, a privacy-preserving clustering module, and a server-side aggregation module. The privacy-preserving clustering module leverages differential privacy to handle data heterogeneity across clients.
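For readers unfamiliar with differential privacy, the sketch below illustrates the classic Laplace mechanism on a single statistic: a client releases the mean of its clipped values with calibrated noise. It is a textbook illustration only and says nothing about how the clustering module above actually applies differential privacy; `private_mean`, the clipping bounds, and `epsilon` are assumptions.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with rate 1/scale
    # follows a Laplace distribution with the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Release a noisy mean using the Laplace mechanism.

    Hypothetical helper: values are clipped to [lower, upper], so
    replacing one value changes the mean by at most
    (upper - lower) / n, which is the sensitivity used below.
    """
    if not values or epsilon <= 0 or upper <= lower:
        raise ValueError("need values, epsilon > 0, and upper > lower")
    rng = rng or random.Random()
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / len(clipped)
    true_mean = sum(clipped) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)
```

A smaller `epsilon` means more noise and stronger privacy; a larger one keeps the released value closer to the true clipped mean.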