Empirical Study and Mitigation Methods of Bias in LLM-Based Robots

Ren Zhou

doi:10.54097/re9qp070

Keywords:

Large Language Models (LLMs); Bias Detection; Bias Mitigation; Customer Service Robots; Education Robots; Healthcare Robots; Recruitment Robots; Social Robots; Human-Robot Interaction; Fairness; Inclusivity.

Abstract

This study analyzes biased behavior exhibited by robots that use large language models (LLMs) in real-world applications, focusing on five experimental scenarios: customer service, education, healthcare, recruitment, and social interaction. The analysis reveals significant differences in user experience by race, health status, work experience, and social status. For instance, the average satisfaction score is 4.2 for white customers versus 3.5 for black customers, and response accuracy is 92% for white students versus 85% for black students. To address these biases, we propose several mitigation methods: data resampling, model regularization, post-processing techniques, diversity assessment, and user feedback mechanisms. These methods aim to make robotic systems fairer and more inclusive and to promote healthy human-robot interaction. Combining our quantitative analysis with existing research, we affirm the importance of bias detection and mitigation. Future research should further explore data balancing strategies, fairness-constrained models, real-time monitoring and adjustment mechanisms, and cross-domain studies to comprehensively evaluate and improve the performance of LLM-based robotic systems across tasks.
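As an illustration only, the group gaps quoted in the abstract and the idea of data resampling can be sketched in a few lines of Python. The function names `disparity` and `oversample_to_balance`, the record layout, and the choice of random oversampling with replacement are assumptions made for this sketch; the abstract does not specify how the study computes its gaps or performs resampling.

```python
import random
from collections import defaultdict

# Figures quoted in the abstract (group -> reported value).
satisfaction = {"white": 4.2, "black": 3.5}
accuracy = {"white": 0.92, "black": 0.85}


def disparity(values):
    """Return (absolute gap, min/max ratio) across groups.

    Hypothetical helper: one simple way to summarize a group gap.
    """
    hi, lo = max(values.values()), min(values.values())
    return hi - lo, lo / hi


def oversample_to_balance(records, group_key, seed=0):
    """Oversample with replacement until every group is as large as the largest.

    Assumed resampling scheme for illustration; `records` is a list of dicts
    and `group_key` names the field that holds the group label.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec[group_key]].append(rec)
    target = max(len(rows) for rows in by_group.values())
    balanced = []
    for rows in by_group.values():
        balanced.extend(rows)
        # Draw the shortfall at random from the same group (k may be 0).
        balanced.extend(rng.choices(rows, k=target - len(rows)))
    return balanced


if __name__ == "__main__":
    for name, values in (("satisfaction", satisfaction), ("accuracy", accuracy)):
        gap, ratio = disparity(values)
        print(f"{name}: gap={gap:.2f}, ratio={ratio:.3f}")
```

Calling `oversample_to_balance(records, "group")` on an imbalanced list of records returns a new list in which each group appears equally often. Oversampling duplicates existing records rather than adding new information, so it would normally be paired with the other methods the abstract lists.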
