From Pixel Fidelity to Task Performance: Image Quality Challenges and Paradigm Shift in Public Security Video Surveillance
DOI: https://doi.org/10.54097/bvfv4d56

Keywords: Image Quality Evaluation, Public Security Surveillance, Image Enhancement, Multi-modal Fusion

Abstract
As public security video surveillance enters the intelligent era, image quality has become a bottleneck limiting the performance of visual perception systems. This paper reviews how image quality research in this field has evolved: from an early focus on pixel-level fidelity evaluation toward evaluation guided by the performance of high-level visual tasks. We first analyze the image quality challenges encountered in public security scenarios. Then, focusing on how to ensure the perceptual reliability of intelligent systems, we summarize three technical approaches: first, applying image preprocessing techniques such as enhancement, restoration, and super-resolution to directly improve input quality; second, designing detection and recognition models that remain robust to quality degradation; and third, exploiting multimodal information fusion, such as visible-infrared integration, to compensate for the limitations of any single modality. This review aims to show that the core of image quality research has shifted toward building task-oriented, degradation-insensitive robust visual perception systems.
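To make the paradigm shift concrete, the sketch below contrasts a classic full-reference fidelity metric (PSNR) with a simple task-oriented score, defined here as the fraction of detector confidence that survives a degradation. This is an illustrative sketch, not a method from the reviewed literature: the `detect` callable and the contrast-based `stub_detect` are hypothetical stand-ins for any real surveillance perception model, such as a pedestrian detector.

```python
# Minimal, self-contained sketch (numpy only) contrasting pixel fidelity
# with task performance. `stub_detect` is a hypothetical placeholder.
import numpy as np


def psnr(reference: np.ndarray, degraded: np.ndarray, max_val: float = 255.0) -> float:
    """Pixel-fidelity view: every per-pixel deviation counts,
    whether or not it matters to the downstream task."""
    mse = np.mean((reference.astype(np.float64) - degraded.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val**2 / mse)


def task_retention(reference: np.ndarray, degraded: np.ndarray, detect) -> float:
    """Task-performance view: quality = share of detection confidence
    retained after degradation (1.0 ~ task unaffected, 0.0 ~ task destroyed)."""
    ref_conf = [conf for _, conf in detect(reference)]
    deg_conf = [conf for _, conf in detect(degraded)]
    if not ref_conf:
        return 1.0  # nothing to detect, so the degradation is task-irrelevant
    if not deg_conf:
        return 0.0  # all detections lost
    return float(np.mean(deg_conf)) / float(np.mean(ref_conf))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

    # A global brightness shift: large pixel error, often task-irrelevant.
    brightened = np.clip(clean.astype(np.int32) + 40, 0, 255).astype(np.uint8)

    def stub_detect(img: np.ndarray):
        # Hypothetical detector whose confidence depends on local contrast
        # only, so it largely ignores global brightness changes.
        return [("person", float(np.clip(img.astype(np.float64).std() / 64.0, 0.0, 1.0)))]

    print(f"PSNR:           {psnr(clean, brightened):5.1f} dB (low -> 'bad' by fidelity)")
    print(f"Task retention: {task_retention(clean, brightened, stub_detect):5.2f} (~1 -> task unaffected)")
```

In this toy run the brightness shift drives PSNR down to around 16 dB, a "poor" image by fidelity standards, while the detector's confidence is essentially untouched. Conversely, a small local blur over a face region could leave PSNR high while collapsing recognition, which is exactly the failure mode that pixel-level metrics miss and task-oriented evaluation is meant to capture.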
License
Copyright (c) 2025 Frontiers in Computing and Intelligent Systems

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

