Abstract
The accurate estimation of three-dimensional human poses in dynamic, multi-agent sports environments remains a significant challenge in computer vision. Basketball, characterized by rapid player movements, frequent inter-player contact, and severe dynamic occlusion, presents a uniquely difficult scenario for standard pose estimation frameworks. This paper introduces a novel methodology that integrates temporal kinematic constraints with deep learning-based pose estimation to mitigate the loss of accuracy during occlusion events. By modeling the biomechanical limits of human joint articulation and imposing temporal consistency across video frames, our approach reconstructs missing or noisy keypoint data with high fidelity. We employ a two-stage architecture that first lifts two-dimensional detections to three-dimensional space and then refines these estimates using a temporal optimization module grounded in Newtonian mechanics. Experimental results on several basketball datasets demonstrate that this method significantly reduces the Mean Per Joint Position Error (MPJPE) compared to state-of-the-art baseline methods, particularly in scenarios involving heavy defensive congestion. The integration of kinematic velocity and acceleration constraints ensures that the generated poses are not only visually plausible but also physically valid.
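To make the two quantities named in the abstract concrete, the following is a minimal sketch of how MPJPE and a velocity/acceleration penalty could be computed. It is not the authors' implementation. The function names, the finite-difference scheme, the squared-excess form of the penalty, and the limits `v_max` and `a_max` are all illustrative assumptions.

```python
import math


def mpjpe(pred, truth):
    """Mean Euclidean distance between predicted and reference joints.

    pred, truth: lists of frames; each frame is a list of (x, y, z) joints.
    """
    total, count = 0.0, 0
    for frame_p, frame_t in zip(pred, truth):
        for p, t in zip(frame_p, frame_t):
            total += math.dist(p, t)
            count += 1
    return total / count


def finite_differences(track, dt):
    """Velocity and acceleration of one joint track by forward differences."""
    vel = [tuple((b - a) / dt for a, b in zip(p0, p1))
           for p0, p1 in zip(track, track[1:])]
    acc = [tuple((b - a) / dt for a, b in zip(v0, v1))
           for v0, v1 in zip(vel, vel[1:])]
    return vel, acc


def kinematic_penalty(track, dt, v_max, a_max):
    """Sum of squared excesses of speed and acceleration over assumed limits.

    v_max and a_max are hypothetical bounds, not values from the paper.
    """
    vel, acc = finite_differences(track, dt)
    penalty = 0.0
    for v in vel:
        penalty += max(0.0, math.hypot(*v) - v_max) ** 2
    for a in acc:
        penalty += max(0.0, math.hypot(*a) - a_max) ** 2
    return penalty
```

A temporal refinement stage of the kind described could, under these assumptions, minimize a weighted sum of a data term and such a penalty. A track that stays within the limits contributes zero, so only physically implausible jumps are penalized.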

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2026 James Anderson, Robert Miller (Author)