Causal Knowledge in Data Fusion Subject to Latent Confounding and Measurement Error

Jingyi Yu, Tim Pychynski, Marco F. Huber

2024 · DOI: 10.1109/MFI62651.2024.10705789
International Conference on Multisensor Fusion and Integration for Intelligent Systems

TLDR

It is shown that the machine learning-based fusion strategy achieves the best prediction quality when data are independent and identically distributed, but in the presence of latent confounding, the causality-based fusion strategy makes prediction models more robust against severe distribution shifts.

Abstract

Data fusion is the process of integrating data from multiple sources to produce more accurate and reliable information. In real-world scenarios, data are often subject to latent confounding and measurement error. In this paper, we evaluate fusion strategies that incorporate different levels of causal knowledge on a quality prediction task under varied conditions of latent confounding and measurement error. We show that the machine learning-based fusion strategy achieves the best prediction quality when data are independent and identically distributed (i.i.d.). However, in the presence of latent confounding, the causality-based fusion strategy makes prediction models more robust against severe distribution shifts. Moreover, the out-of-distribution (OOD) generalizability of prediction models is also affected by measurement error in the data. If causal knowledge needs to be inferred from data by applying causal discovery methods, we demonstrate that measurement error can impair causal discovery. We advocate caution when using standard causal discovery methods if the circumstances under which the data were generated are unknown.
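
The contrast between the two fusion strategies under latent confounding can be illustrated with a toy simulation. The sketch below is not the paper's experiment: the data-generating process, the coefficients, and all function names (simulate, fit_ols, mse, run_toy_experiment) are assumptions made for illustration, and ordinary least squares stands in for both strategies. The "machine learning-based" model uses every available sensor; the "causality-based" model uses only the sensor assumed to be a true cause of the target. A hidden variable h drives both the target and a non-causal sensor x2, and the distribution shift is modeled by flipping the sign of the link from h to x2.

```python
import numpy as np


def simulate(n, rng, sign):
    """Toy data: h is a latent confounder that is never given to the models."""
    h = rng.normal(size=n)                     # latent confounder
    x1 = rng.normal(size=n)                    # sensor that truly causes y
    x2 = sign * h + 0.1 * rng.normal(size=n)   # non-causal sensor driven by h
    y = 2.0 * x1 + 1.5 * h + 0.1 * rng.normal(size=n)
    return np.column_stack([x1, x2]), y


def fit_ols(X, y):
    """Least-squares fit with an intercept column appended."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def mse(coef, X, y):
    """Mean squared prediction error of a fitted linear model."""
    A = np.column_stack([X, np.ones(len(X))])
    return float(np.mean((A @ coef - y) ** 2))


def run_toy_experiment(seed=0, n=5000):
    rng = np.random.default_rng(seed)
    X_tr, y_tr = simulate(n, rng, sign=+1.0)    # training environment
    X_iid, y_iid = simulate(n, rng, sign=+1.0)  # same distribution as training
    X_ood, y_ood = simulate(n, rng, sign=-1.0)  # shifted: h -> x2 link flipped

    all_coef = fit_ols(X_tr, y_tr)              # uses x1 and x2
    causal_coef = fit_ols(X_tr[:, :1], y_tr)    # uses only the assumed cause x1

    return {
        "all_iid": mse(all_coef, X_iid, y_iid),
        "all_ood": mse(all_coef, X_ood, y_ood),
        "causal_iid": mse(causal_coef, X_iid[:, :1], y_iid),
        "causal_ood": mse(causal_coef, X_ood[:, :1], y_ood),
    }


if __name__ == "__main__":
    for name, value in run_toy_experiment().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the all-sensor model has the lower error on i.i.d. test data because x2 acts as a proxy for the hidden h, but its error grows sharply once that proxy relationship changes, whereas the error of the cause-only model stays at roughly the same level in both environments.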