Improving Synthetic 3D Model-Aided Indoor Image Localization via Domain Adaptation


Although deep learning-based indoor image localization has improved significantly in accuracy, efficiency, and storage requirements for large indoor scenes, the need to collect large amounts of labeled training data severely limits its practical application. Recently, synthetic images rendered from widely available 3D models have shown promise in easing the data collection problem. However, because synthetic and real images differ dramatically, approaches trained on synthetic images are less accurate than methods trained on real images. In this paper, we propose a domain adaptation-based approach to address this issue. Specifically, the proposed model consists of a multi-level constrained pose regression network and a feature-level discriminator network. Through adversarial learning, the discriminator network forces the pose regression network to generate domain-invariant features from real and synthetic images, thereby reducing the performance gap. In addition, the multi-level constraints further improve the accuracy of pose regression. We perform extensive experiments on open-source rendered images in different settings. The results show that the proposed method significantly improves localization performance. The code for the proposed work is available at
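
To make the adversarial setup described above concrete, the following is a minimal sketch of how a feature-level discriminator objective and a pose regression objective might be combined. It is not the authors' implementation. The function names, the L2-based pose loss, the weights `beta` and `lam`, and the choice to label real features 1 and synthetic features 0 are all assumptions made for illustration, and the multi-level constraints are not modeled here.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def discriminator_loss(p_real, p_synth):
    """Discriminator objective (assumed labels: real = 1, synthetic = 0).

    p_real and p_synth are the discriminator's predicted probabilities
    that features from real and synthetic images come from the real domain.
    """
    losses = [bce(p, 1) for p in p_real] + [bce(p, 0) for p in p_synth]
    return sum(losses) / len(losses)

def adversarial_loss(p_synth):
    """Regressor's adversarial term: push synthetic features to look real."""
    return sum(bce(p, 1) for p in p_synth) / len(p_synth)

def pose_loss(pred, target, beta=1.0):
    """Hypothetical pose loss: position L2 error plus beta-weighted
    orientation (quaternion) L2 error."""
    (t_pred, q_pred), (t_true, q_true) = pred, target
    return math.dist(t_pred, t_true) + beta * math.dist(q_pred, q_true)

def regressor_loss(pred, target, p_synth, lam=0.1):
    """Total regressor objective: pose error plus lam-weighted adversarial term."""
    return pose_loss(pred, target) + lam * adversarial_loss(p_synth)
```

In such a scheme the two objectives would typically be minimized in alternation: the discriminator learns to separate the domains, while the pose regression network learns features that the discriminator cannot separate, which is one way to obtain domain-invariant features.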

ISPRS Journal of Photogrammetry and Remote Sensing