Abstract
In autonomous driving, pedestrian appearance cues from cameras can be unreliable under glare, fog, or low illumination, while LiDAR provides complementary geometric signals. Building on uncertainty-aware, CLIP-based modality modeling, this work introduces a fusion framework that integrates vision–language embeddings with LiDAR-derived shape descriptors through uncertainty-gated feature mixing. The gating module increases reliance on LiDAR when visual uncertainty rises and preserves vision–language dominance in clear conditions. Experiments are conducted on multi-sensor datasets covering 210,000 paired camera–LiDAR observations and 26,000 identities. Baselines include camera-only ReID (OSNet, TransReID), CLIP-based ReID, and naive concatenation fusion. The proposed method improves overall mAP by 3.3%–4.8%, yields larger gains of 6.0%–7.5% on low-light subsets, and adds less than 8% inference latency compared with camera-only models.
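The uncertainty-gated mixing described above can be illustrated with a minimal PyTorch sketch. The module name, layer sizes, and the scalar visual-uncertainty input are illustrative assumptions, not the authors' implementation; the sketch only shows the general idea of shifting weight toward the LiDAR descriptor as visual uncertainty grows.

```python
import torch
import torch.nn as nn

class UncertaintyGatedFusion(nn.Module):
    """Sketch: blend a vision-language embedding with a LiDAR shape
    descriptor using a gate driven by a visual-uncertainty estimate."""

    def __init__(self, vis_dim=512, lidar_dim=256, out_dim=512):
        super().__init__()
        self.vis_proj = nn.Linear(vis_dim, out_dim)      # project CLIP-style embedding
        self.lidar_proj = nn.Linear(lidar_dim, out_dim)  # project LiDAR shape descriptor
        # Gate maps a scalar uncertainty to a mixing weight in (0, 1).
        self.gate = nn.Sequential(
            nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid()
        )

    def forward(self, vis_emb, lidar_feat, vis_uncertainty):
        # vis_emb: (B, vis_dim), lidar_feat: (B, lidar_dim), vis_uncertainty: (B, 1)
        v = self.vis_proj(vis_emb)
        l = self.lidar_proj(lidar_feat)
        w = self.gate(vis_uncertainty)        # high uncertainty -> w near 1 -> rely on LiDAR
        fused = (1.0 - w) * v + w * l         # vision-language dominates when uncertainty is low
        return nn.functional.normalize(fused, dim=-1)

# Example usage with random tensors
fusion = UncertaintyGatedFusion()
vis, lidar, unc = torch.randn(4, 512), torch.randn(4, 256), torch.rand(4, 1)
print(fusion(vis, lidar, unc).shape)  # torch.Size([4, 512])
```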
