
JBE, vol. 28, no. 2, pp.230-237, March, 2023

DOI: https://doi.org/10.5909/JBE.2023.28.2.230

Deep Image Retrieval using Attention and Semantic Segmentation Map

Minjung Yoo, Eunhye Jo, Byoungjun Kim, and Sunok Kim

Corresponding author e-mail: sunok.kim@kau.ac.kr

Abstract:
Self-driving is a key technology of the fourth industrial revolution and can be applied to various platforms such as cars, drones, and robots. Among its components, localization is one of the key technologies for implementing autonomous driving: it identifies the location of objects or users using GPS, sensors, and maps. Localization can be performed using GPS or LiDAR, but such equipment is expensive and heavy, and precise location estimation is difficult in places with radio interference, such as underground areas or tunnels. In this paper, to compensate for this, we propose an image retrieval method that uses an attention module and semantic segmentation maps, taking color images acquired with a low-cost vision camera as input.
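The abstract describes retrieval from color images fused with semantic segmentation maps and reweighted by an attention module. The following is a minimal, hypothetical sketch of that idea (not the paper's actual architecture): RGB channels are concatenated with one-hot segmentation channels, a small encoder extracts features, a CBAM-style channel attention reweights them, and global pooling produces an L2-normalized descriptor for nearest-neighbor retrieval. The class count of 19 assumes Cityscapes-style labels; all layer sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionRetrievalNet(nn.Module):
    """Hypothetical sketch: fuse an RGB image with its semantic
    segmentation map, apply channel attention, and pool into a
    global descriptor for image retrieval."""

    def __init__(self, num_classes=19, feat_dim=64, desc_dim=128):
        super().__init__()
        # Input: 3 RGB channels + one-hot segmentation channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(3 + num_classes, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # CBAM-style channel attention: squeeze spatially, then
        # predict a per-channel gate in [0, 1].
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(feat_dim, feat_dim // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim // 4, feat_dim, 1),
            nn.Sigmoid(),
        )
        self.fc = nn.Linear(feat_dim, desc_dim)

    def forward(self, rgb, seg_onehot):
        x = torch.cat([rgb, seg_onehot], dim=1)  # early fusion
        f = self.encoder(x)
        f = f * self.attn(f)                     # attention reweighting
        f = f.mean(dim=(2, 3))                   # global average pooling
        d = self.fc(f)
        return F.normalize(d, dim=1)             # unit-norm descriptor
```

At query time, retrieval would rank database images by the cosine similarity (dot product) between the query descriptor and precomputed database descriptors; the L2 normalization makes that dot product a cosine similarity directly.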


Keywords: Localization, Image Retrieval, Semantic Segmentation, Attention, Deep Learning

