Healthcare Big Data Governance and Intelligent Analytics Platforms: A Review
DOI:
https://doi.org/10.54097/mkhtkk36
Keywords:
Healthcare Big Data, Data Governance, Data Standardization, Data Quality, Privacy Protection, Knowledge Graph, Intelligent Analytics Platform
Abstract
As healthcare informatization deepens and digital health technology develops rapidly, healthcare data has become a core strategic resource for service innovation and evidence-based decision-making. Four pervasive problems limit the value that can be extracted from it: uneven data quality, inconsistent standards, siloed systems, and significant security risks. Healthcare big data governance and intelligent analytics platforms are the infrastructure intended to address these problems, and they have become an important research direction in healthcare informatics. This paper systematically reviews research progress on such platforms along four core dimensions: data standardization, data quality management, security and privacy protection, and knowledge graph construction. For data standardization, it outlines the current application of the HL7 FHIR standard, ICD coding, SNOMED CT clinical terminology, and LOINC laboratory result codes. For data quality management, it examines quality assessment dimensions, data cleaning algorithms, and governance frameworks. For security and privacy protection, it discusses data anonymization techniques, federated learning, and differential privacy. For knowledge graphs, it summarizes construction methods and application scenarios for disease, pharmaceutical, and clinical pathway knowledge graphs. The findings indicate that healthcare big data governance has evolved from single-point technology applications toward systematic governance frameworks, and that intelligent analytics platforms have achieved significant results in improving data quality, enabling interoperability, and protecting data security. The application of FHIR standards has improved healthcare data exchange efficiency by more than 40%. Federated learning has established a "data immobile, model mobile" privacy protection paradigm, in which patient data stays at each institution and only model updates are exchanged. Knowledge graph technology provides semantic reasoning capabilities for clinical decision support. Healthcare big data governance nevertheless still faces uneven technology maturity, high implementation costs, difficult cross-institutional collaboration, and a shortage of skilled professionals. This paper offers a systematic reference for theoretical research and practical application in healthcare big data governance, and guidance for hospital information directors and data managers making platform construction decisions. Future research should explore real-time data governance, automated governance processes, multimodal data fusion, and intelligent governance.
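The "data immobile, model mobile" paradigm mentioned in the abstract can be illustrated with a minimal federated averaging sketch. It is an illustration under stated assumptions rather than the method of any reviewed system: the one-parameter linear model, the toy site data, and the names `local_update`, `federated_average`, and `train` are all hypothetical. Each site fits the model on its own records and returns only the updated weight; the coordinator never sees the records themselves.

```python
from typing import List, Tuple

# (x, y) records held locally by one institution; they never leave the site.
Site = List[Tuple[float, float]]


def local_update(w: float, data: Site, lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient steps for y ~ w * x on one site's private data.

    Only the updated weight is returned, not the data.
    """
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w: float, sites: List[Site]) -> float:
    """One round: every site trains locally, then the coordinator averages
    the returned weights, weighted by each site's record count."""
    total = sum(len(s) for s in sites)
    return sum(local_update(w, s) * len(s) for s in sites) / total


def train(sites: List[Site], rounds: int = 30) -> float:
    """Repeat federated rounds starting from w = 0."""
    w = 0.0
    for _ in range(rounds):
        w = federated_average(w, sites)
    return w


# Hypothetical toy data: three sites whose records all follow y = 2x.
SITES: List[Site] = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0), (3.0, 6.0)],
    [(2.5, 5.0)],
]

global_w = train(SITES)
print(f"global weight after training: {global_w:.4f}")
```

Exchanging weights instead of records reduces data exposure but does not by itself give a formal privacy guarantee, which is one reason federated learning is often discussed together with differential privacy.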
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

