Image Captioning in News Report Scenario


  • Tianrui Liu
  • Qi Cai
  • Changxin Xu
  • Bo Hong
  • Jize Xiong
  • Yuxin Qiao
  • Tsungwei Yang



Image captioning; Computer vision; Natural language processing; Content generation.


Image captioning [1] aims to generate relevant captions for given images [56] and sits at the intersection of Computer Vision (CV) [2] and Natural Language Processing (NLP) [54]. The task has wide-ranging applications in recommendation systems, news outlets, social media, and beyond. In news reporting in particular, captions are expected to carry detailed information, such as the identities of celebrities shown in the images. However, most existing work focuses on understanding scenes and actions [3]. In this paper, we explore image captioning tailored to celebrity photographs and illustrate its potential for improving news industry practices. This exploration aims to support automated news content generation and thereby enable a more nuanced dissemination of information [57]. Our work points toward richer narratives in news reporting through a more intuitive image captioning framework.




Vinyals, Oriol, et al. "Show and tell: A neural image caption generator." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.

Ke, Lei, et al. "Reflective decoding network for image captioning." Proceedings of the IEEE/CVF international conference on computer vision. 2019.

Cao, Qiong, et al. "Vggface2: A dataset for recognising faces across pose and age." 2018 13th IEEE international conference on automatic face & gesture recognition (FG 2018). IEEE, 2018.

Zhang, Kaipeng, et al. "Joint face detection and alignment using multitask cascaded convolutional networks." IEEE signal processing letters 23.10 (2016): 1499-1503.

Liu, Tianrui, et al. "News recommendation with attention mechanism." arXiv preprint arXiv:2402.07422 (2024).


Anderson, Peter, et al. "Bottom-up and top-down attention for image captioning and visual question answering." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.


Liu, Tianrui, et al. "News Recommendation with Attention Mechanism." Journal of Industrial Engineering and Applied Science 2.1 (2024): 21-26.

Radford, Alec, et al. "Learning transferable visual models from natural language supervision." International conference on machine learning. PMLR, 2021.

Li, Yanjie, et al. "Transfer-learning-based network traffic automatic generation framework." 2021 6th International Conference on Intelligent Computing and Signal Processing (ICSP). IEEE, 2021.

Liu, Wei, et al. "Cptr: Full transformer network for image captioning." arXiv preprint arXiv:2101.10804 (2021).

Liu, Tianrui, et al. "Particle Filter SLAM for Vehicle Localization." Journal of Industrial Engineering and Applied Science 2.1 (2024): 27-31.

Zhao, Zhiming, et al. "Enhancing E-commerce Recommendations: Unveiling Insights from Customer Reviews with BERTFusionDNN." Journal of Theory and Practice of Engineering Science 4.02 (2024): 38-44.

Su, Jing, et al. "Large Language Models for Forecasting and Anomaly Detection: A Systematic Literature Review." arXiv preprint arXiv:2402.10350 (2024).

Xiong, Jize, et al. "Decoding sentiments: Enhancing covid-19 tweet analysis through bert-rcnn fusion." Journal of Theory and Practice of Engineering Science 4.01 (2024): 86-93.

Liu, Shun, et al. "Financial time-series forecasting: Towards synergizing performance and interpretability within a hybrid machine learning approach." arXiv preprint arXiv:2401.00534 (2023).

Popokh, Leo, et al. "IllumiCore: Optimization Modeling and Implementation for Efficient VNF Placement." 2021 International Conference on Software, Telecommunications and Computer Networks (SoftCOM). IEEE, 2021.

Su, Jing, Suku Nair, and Leo Popokh. "EdgeGYM: a reinforcement learning environment for constraint-aware NFV resource allocation." 2023 IEEE 2nd International Conference on AI in Cybersecurity (ICAIC). IEEE, 2023.

Su, Jing, Suku Nair, and Leo Popokh. "Optimal resource allocation in sdn/nfv-enabled networks via deep reinforcement learning." 2022 IEEE Ninth International Conference on Communications and Networking (ComNet). IEEE, 2022.

Fu, Zhe, Xi Niu, and Li Yu. "Wisdom of crowds and fine-grained learning for serendipity recommendations." Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2023.

Wei, Xiaojun, et al. "Narrowing Signal Distribution by Adamantane Derivatization for Amino Acid Identification Using an α-Hemolysin Nanopore." Nano Letters (2024).

Song, Ge, et al. "Energy consumption auditing based on a generative adversarial network for anomaly detection of robotic manipulators." Future Generation Computer Systems 149 (2023): 376-389.

Ou, Junlin, et al. "Hybrid path planning based on adaptive visibility graph initialization and edge computing for mobile robots." Engineering Applications of Artificial Intelligence 126 (2023): 107110.

Zhou, Yucheng, et al. "Visual In-Context Learning for Large Vision-Language Models." arXiv preprint arXiv:2402.11574 (2024).

Zhou, Yucheng, et al. "Thread of thought unraveling chaotic contexts." arXiv preprint arXiv:2311.08734 (2023).

Zhou, Yucheng, et al. "Claret: Pre-training a correlation-aware context-to-event transformer for event-centric generation and classification." arXiv preprint arXiv:2203.02225 (2022).

Tian, Jiwei, et al. "LESSON: Multi-Label Adversarial False Data Injection Attack for Deep Learning Locational Detection." IEEE Transactions on Dependable and Secure Computing (2024).

Tian, Jiwei, et al. "Adversarial attacks and defenses for deep-learning-based unmanned aerial vehicles." IEEE Internet of Things Journal 9.22 (2021): 22399-22409.

Wu, Jing, et al. "SwitchTab: Switched Autoencoders Are Effective Tabular Learners." arXiv preprint arXiv:2401.02013 (2024).

Chen, Suiyao, et al. "Recontab: Regularized contrastive representation learning for tabular data." arXiv preprint arXiv:2310.18541 (2023).

Chen, Suiyao, et al. "Claims data-driven modeling of hospital time-to-readmission risk with latent heterogeneity." Health care management science 22 (2019): 156-179.

Chen, Suiyao, et al. "A data heterogeneity modeling and quantification approach for field pre-assessment of chloride-induced corrosion in aging infrastructures." Reliability Engineering & System Safety 171 (2018): 123-135.

Hsieh, Yung-Ting, Khizar Anjum, and Dario Pompili. "Ultra-low Power Analog Recurrent Neural Network Design Approximation for Wireless Health Monitoring." 2022 IEEE 19th International Conference on Mobile Ad Hoc and Smart Systems (MASS). IEEE, 2022.

Hsieh, Yung-Ting, Zhuoran Qi, and Dario Pompili. "ML-based joint Doppler estimation and compensation in underwater acoustic communications." Proceedings of the 16th International Conference on Underwater Networks & Systems. 2022.

Sun, Chuanneng, et al. "Fed2kd: Heterogeneous federated learning for pandemic risk assessment via two-way knowledge distillation." 2022 17th Wireless On-Demand Network Systems and Services Conference (WONS). IEEE, 2022.

Sun, Chuanneng, Songjun Huang, and Dario Pompili. "HMAAC: Hierarchical Multi-Agent Actor-Critic for Aerial Search with Explicit Coordination Modeling." 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023.

Sun, Chuanneng, et al. "Contextual Biasing of Named-Entities with Large Language Models." arXiv preprint arXiv:2309.00723 (2023).

Younis, Ayman, Chuanneng Sun, and Dario Pompili. "Communication-efficient Federated Learning Design with Fronthaul Awareness in NG-RANs." 2022 IEEE 19th International Conference on Mobile Ad Hoc and Smart Systems (MASS). IEEE, 2022.

Flynn, Patrick, Xinyao Yi, and Yonghong Yan. "Exploring source-to-source compiler transformation of OpenMP SIMD constructs for Intel AVX and Arm SVE vector architectures." Proceedings of the Thirteenth International Workshop on Programming Models and Applications for Multicores and Manycores. 2022.

Yi, Xinyao, et al. "CUDAMicroBench: Microbenchmarks to Assist CUDA Performance Programming." 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2021.

Li, Xiaying, Belle Li, and Su-Je Cho. "Empowering Chinese Language Learners from Low-Income Families to Improve Their Chinese Writing with ChatGPT’s Assistance Afterschool." Languages 8.4 (2023): 238.

Xie, Ying, et al. "Advancing Legal Citation Text Classification A Conv1D-Based Approach for Multi-Class Classification." Journal of Theory and Practice of Engineering Science 4.02 (2024): 15-22.

Luo, Yang, et al. "Enhancing E-commerce Chatbots with Falcon-7B and 16-bit Full Quantization." Journal of Theory and Practice of Engineering Science 4.02 (2024): 52-57.

Pan, Zhenyu, et al. "Ising-traffic: Using ising machine learning to predict traffic congestion under uncertainty." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 37. No. 8. 2023.

Pan, Zhenyu, et al. "CoRMF: Criticality-Ordered Recurrent Mean Field Ising Solver." arXiv preprint arXiv:2403.03391 (2024).

Hu, Zhirui, et al. "On the design of quantum graph convolutional neural network in the nisq-era and beyond." 2022 IEEE 40th International Conference on Computer Design (ICCD). IEEE, 2022.

Chen, Yinda, et al. "Self-supervised neuron segmentation with multi-agent reinforcement learning." arXiv preprint arXiv:2310.04148 (2023).

Chen, Yinda, et al. "Learning multiscale consistency for self-supervised electron microscopy instance segmentation." arXiv preprint arXiv:2308.09917 (2023).

Chen, Yinda, et al. "Generative text-guided 3d vision-language pretraining for unified medical image segmentation." arXiv preprint arXiv:2306.04811 (2023).

Tong, Xin, et al. "A Deep‐Learning Approach for Low‐Spatial‐Coherence Imaging in Computer‐Generated Holography." Advanced Photonics Research 4.1 (2023): 2200264.

Xu, Renjun, et al. "$E(2)$-Equivariant Vision Transformer." Uncertainty in Artificial Intelligence. PMLR, 2023.

Gao, Shangde, et al. "Contrastive Knowledge Amalgamation for Unsupervised Image Classification." International Conference on Artificial Neural Networks. Cham: Springer Nature Switzerland, 2023.

Shangguan, Zhongkai, Zihe Zheng, and Lei Lin. "Trend and thoughts: Understanding climate change concern using machine learning and social media data." arXiv preprint arXiv:2111.14929 (2021).

Shangguan, Zhongkai, et al. "Neural process for black-box model optimization under bayesian framework." arXiv preprint arXiv:2104.02487 (2021).

Zang, Hengyi. "Precision calibration of industrial 3d scanners: An ai-enhanced approach for improved measurement accuracy." Global Academic Frontiers 2.1 (2024): 27-37.

Wang, Yishuang, et al. "Predicting Nonlinear Structural Dynamic Response of ODE Systems Using Constrained Gaussian Process Regression." Society for Experimental Mechanics Annual Conference and Exposition. Cham: Springer Nature Switzerland, 2023.

Platz, Roland, Xinyue Xu, and Sez Atamturktur. "Introducing a Round-Robin Challenge to Quantify Model Form Uncertainty in Passive and Active Vibration Isolation." Society for Experimental Mechanics Annual Conference and Exposition. Cham: Springer Nature Switzerland, 2023.

Li, Zhenglin, et al. "Comprehensive evaluation of Mal-API-2019 dataset by machine learning in malware detection." International Journal of Computer Science and Information Technology 2.1 (2024): 1-9.







How to Cite

Image Captioning in News Report Scenario. (2024). Academic Journal of Science and Technology, 10(1), 284-289.
