
JBE, vol. 28, no. 2, pp. 178-184, March 2023

DOI: https://doi.org/10.5909/JBE.2023.28.2.178

Korean Text Image Super-Resolution for Improving Text Recognition

Junhyeong Kwon and Nam Ik Cho

Corresponding author e-mail: nicho@snu.ac.kr

Abstract:
Detecting text in natural scene images and recognizing its content is an important task that underlies applications such as
robot vision and visual assistance. In low-resolution text images, however, degradations such as noise and blur are more
pronounced, which severely reduces text recognition accuracy. In this paper, we propose a Korean text image super-resolution
method built on a Transformer-based model, a class of models that generally outperforms convolutional neural networks. Our
experiments show that applying the proposed super-resolution method improves text recognition accuracy on Korean text images.
We also introduce a new Korean text image dataset for training our model, which contains a large number of HR-LR
(high-resolution/low-resolution) Korean text image pairs.
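
To make the idea of an HR-LR training pair concrete, the following is a minimal sketch of how a low-resolution counterpart could be synthesized from a high-resolution text crop using the degradations the abstract names (blur and noise), followed by downsampling. The abstract does not describe how the dataset's pairs were actually produced, so the function names `box_blur` and `make_lr`, the 3x3 mean filter, the scale factor of 2, and the noise level are all assumptions chosen for illustration only.

```python
import random


def box_blur(img, k=3):
    """Mean-filter a grayscale image (list of rows) with a k x k window.

    k is assumed to be odd; pixels outside the image are clamped to the edge.
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / (k * k)
    return out


def make_lr(hr, scale=2, noise_std=2.0, seed=0):
    """Hypothetical degradation: blur, subsample by `scale`, add Gaussian noise."""
    rng = random.Random(seed)
    blurred = box_blur(hr)
    lr = []
    for row in blurred[::scale]:
        # Add noise, round, and clamp each kept pixel to the 8-bit range.
        lr.append([min(255, max(0, round(v + rng.gauss(0.0, noise_std))))
                   for v in row[::scale]])
    return lr


# A synthetic 32 x 128 "text crop": a bright bar on a dark background.
hr = [[255 if 8 <= y < 24 and 16 <= x < 112 else 0 for x in range(128)]
      for y in range(32)]
lr = make_lr(hr)
pair = (hr, lr)  # one HR-LR training pair
print(len(hr), len(hr[0]), len(lr), len(lr[0]))  # 32 128 16 64
```

A super-resolution model would take `lr` as input and be trained to reproduce `hr`; real scene-text degradations are more varied than this synthetic pipeline.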


Keywords: Scene Text Image Super-Resolution, Korean Text, Transformer




Editorial Office
1108, New building, 22, Teheran-ro 7-gil, Gangnam-gu, Seoul, Korea
Homepage: www.kibme.org TEL: +82-2-568-3556 FAX: +82-2-568-3557
Copyrightⓒ 2012 The Korean Institute of Broadcast and Media Engineers
All Rights Reserved