Collaboration and Future Directions in Artificial Intelligence for Neonatology
Abstract
Amidst the dual crisis of an increasing birth rate of high-risk newborns and a shortage of neonatologists in Korea, neonatal intensive care units (NICUs) have become complex medical environments that generate vast amounts of data. Artificial intelligence (AI) technology is gaining attention as an innovative solution in data-driven environments, enabling the early prediction of diseases, optimization of treatment decisions, and reduction of medical staff workload. This review examines the historical development of AI and explores areas where AI can be applied in various clinical fields of neonatology. Furthermore, it analyzes Korean medical AI research trends and related policies to shed light on their status, proposes effective measures for establishing an AI research infrastructure utilizing NICU data and interdisciplinary collaboration models, and suggests future directions for AI research in the field of neonatology. AI has evolved through two 'winters' into the current era of generative AI, with core technologies such as machine learning, deep learning, and large language models being applied in neonatology. Key application areas include the early prediction of conditions such as sepsis, necrotizing enterocolitis, and bronchopulmonary dysplasia using heart rate characteristics, medical imaging, and multimodal data, as well as treatment optimization through Clinical Decision Support Systems. In Korea, R&D is being actively promoted under government leadership, including the establishment of the 'Korean Specialized Big Data for Critical Care (K-MIMIC)' and the 'Advanced Research Projects Agency for Health (ARPA-H) Project,' while an institutional foundation is also being established with laws like the 'Digital Medical Products Act.' For the successful implementation of AI, a standardized multicenter data infrastructure such as the Korean Neonatal Network and a clinician-led industry–academia–research–hospital collaboration model are essential.
Future AI research in neonatology must move beyond static risk prediction toward a dynamic intervention recommendation system that suggests optimal treatments based on real-time data. Ultimately, the goal is to establish a Smart NICU that realizes precision, automation, and remote capabilities for neonatal care. To this end, by having medical professionals lead the research ecosystem and strengthen interdisciplinary collaboration, AI will play a decisive role in saving the lives of newborns and ensuring their healthy future.
INTRODUCTION
Although the medical field is generally slow in adopting new technologies, artificial intelligence (AI) is a notable exception and is being applied at a remarkably fast pace. According to a systematic review published in 2022, neonatology accounts for the most significant proportion (24%) of machine learning (ML)-related research in pediatrics, indicating high academic interest [1]. However, its real-life clinical implementation remains limited.
Considering the vast amount of data generated in the neonatal intensive care unit (NICU), the potential of AI applications is immense. Since 2020, data-driven research in the neonatal field has surged, and this trend has continued steadily, even during the coronavirus disease 2019 (COVID-19) pandemic [2]. Key research trends can be summarized as follows: technological innovations, such as AI-based disease prediction, image analysis, and continuous monitoring through wearable devices [3]; the rise of family-centered care and telemedicine technologies [4]; and biomarker research for personalized medicine [5]. Research methodologies are also advancing, with innovative clinical trial designs such as Bayesian methods [6], quality improvement (QI) studies, and large-scale real-world evidence studies based on electronic medical records (EMRs) [7].
However, challenges remain to be addressed, including a lack of high-quality prospective studies [8], data imbalance and equity issues in AI model development [3], and the gap between innovative technology and actual clinical applications [9]. In the future, technology-convergent research through interdisciplinary collaborations with fields such as engineering and applied sciences is expected to become more active, driving the advancement of neonatology through the clinical application of AI-based monitoring systems, wearable devices, and telemedicine technologies.
This review examines the historical development of AI and, based on this, explores areas where AI can be applied across various clinical fields in neonatology. Furthermore, by analyzing medical AI research trends and related Korean policies, it highlights the current status of medical AI in Korea. It proposes effective measures for establishing an AI research infrastructure that utilizes data from the NICU as well as interdisciplinary collaboration models. The ultimate goal is to synthesize these aspects and propose future directions for AI research in the field of neonatology.
WHY AI IN NEONATOLOGY NOW?
The number of high-risk newborns in South Korea, including those with low birth weight and premature infants, is increasing annually. Although the overall infant mortality is slightly decreasing, neonatal deaths still account for more than half of all infant deaths. Particularly in Korea, although the survival rate of extremely low birth weight infants (ELBWIs) has improved over the last decade, their long-term neurodevelopmental complications and healthcare costs continue to increase [10]. Premature infants and ELBWIs have immature organ systems, making them highly susceptible to injury. They also have a higher likelihood of congenital anomalies and genetic disorders, necessitating a precise diagnosis and intervention immediately after birth [11]. Diseases such as sepsis, necrotizing enterocolitis (NEC), metabolic disorders, and hypoxic-ischemic injury can deteriorate rapidly, making prompt diagnosis and initiation of treatment critical factors in determining prognosis [12].
The current supply of neonatologists is insufficient to meet this demand. In South Korea, this situation is particularly challenging, because it is difficult to secure staff for night and holiday shifts. Recruiting specialists is even more difficult in small-to-medium-sized and regional hospitals in rural areas, and an increasing number of rural NICUs operate with only one or two neonatologists, or even fewer. In such an on-call system, the frequency of shifts increases, leading to an overwhelming burden of responsibility and accumulation of fatigue among neonatologists. This manpower shortage results in burnout from repetitive tasks, such as patient monitoring and medical record keeping; delays in responding to critical emergencies; and a lack of emotional support for families, including explanations, education, and discharge preparation, ultimately leading to a decline in the quality of care in the NICU [13].
Advancements in AI technology are driving innovation, particularly in the medical field. Protein structure prediction technology, awarded the 2024 Nobel Prize in Chemistry, has dramatically shortened the drug development timeline [14], and deep-learning-based medical image analysis contributes to the early detection of cancer and rare diseases by interpreting computed tomography, magnetic resonance imaging, and X-ray images [15]. In clinical settings such as the NICU, AI models that analyze patient vital signs in real time to predict emergencies such as sepsis have been developed and are being directly used to improve patient outcomes [16]. According to recent studies, ML-based sepsis prediction models can enable detection up to 10 hours earlier than conventional clinical indicators, demonstrating a sensitivity of 93.3% and a specificity of 80.0% [12,17]. Thus, the severe manpower shortage and high complexity of care in the neonatal field further amplify the need to integrate AI technology.
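For readers less familiar with the performance metrics quoted above, the following minimal Python sketch shows how sensitivity and specificity are derived from binary predictions. The function name and the toy labels are hypothetical and purely illustrative; they are not data from the cited studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-events correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels (assumption for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Sensitivity is the fraction of true events that the model flags, whereas specificity is the fraction of non-events that it correctly leaves unflagged; in an alerting system, the two must be balanced against alarm fatigue.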
HISTORICAL DEVELOPMENT OF AI
AI began in the 1950s and has evolved through repeating cycles of 'booms' and 'winters.' The field of AI was born at the Dartmouth Conference in 1956; however, early research, such as the perceptron, faced computational and theoretical limitations, leading to the first winter. In the 1980s, a second boom was sparked by 'expert systems' and the backpropagation algorithm but was followed by another winter owing to issues such as the vanishing gradient problem. The third boom began in the mid-1990s, starting with IBM's Deep Blue (1997). Geoffrey Hinton's deep learning (DL) research in 2006 served as a crucial turning point. Subsequently, with improvements in graphics processing unit performance and the advent of big data, DL models such as AlexNet (2012) and AlphaGo (2016) have demonstrated the potential of AI.
The emergence of GPT-1 (OpenAI) in 2018, based on transformer architecture, heralded the era of 'generative AI,' and the release of ChatGPT (OpenAI) in 2022 led to the widespread popularization of AI [18]. Competition intensified with rivals such as Google's Gemini, and in 2024, Geoffrey Hinton and the developers of AlphaFold were awarded Nobel Prizes in physics and chemistry, respectively, officially recognizing the profound impact of AI on science. As of 2025, AI is heralding the arrival of next-generation models with more sophisticated reasoning capabilities, and AI agent technology, which can independently plan and execute multiple steps, is on the rise. Furthermore, with the advancement of small language models specialized for specific industry domains and on-device AI technology, we are entering an era in which AI provides faster, more secure, and personalized services [19].
KEY AI TECHNOLOGIES UTILIZED IN NEONATOLOGY RESEARCH
AI is used to identify complex patterns in vast medical datasets that are difficult for humans to perceive, thereby aiding in disease prediction, diagnosis, and treatment. The core technologies can be summarized as ML, DL, and large language models. ML can be divided into three main categories: supervised, unsupervised, and reinforcement learning. Supervised learning is widely used to predict the risk of disease by training on data with known outcomes ('labels'). For example, a model using XGBoost to predict neonatal nosocomial infections achieved high discriminative performance (area under the receiver operating characteristic curve, 0.911) and provided the rationale for its predictions through explainable AI techniques [20]. ML algorithms have also been applied to predict the need for the resuscitation of newborns after birth [21].
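The area under the receiver operating characteristic curve reported for such supervised models can be interpreted as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that definition; the function name, labels, and scores are hypothetical, and this is not the evaluation code of the cited study.

```python
def auroc(y_true, scores):
    """AUROC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = infection) and model scores, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = auroc(y_true, scores)
print(f"AUROC = {auc:.2f}")
```

The pairwise comparison is quadratic in the number of cases and is written this way for readability; rank-based implementations in standard libraries give the same value more efficiently.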
DL refers to deep artificial neural network models that mimic the neural networks of the human brain. Convolutional neural networks (CNNs) have been widely used in medical image analysis. They are used to analyze chest X-rays of premature infants for the early prediction of bronchopulmonary dysplasia (BPD) and are opening possibilities for noninvasive diagnosis, such as predicting patent ductus arteriosus [22,23]. Recurrent neural networks (RNNs), including long short-term memory (LSTM) networks, can analyze time-series biosignals such as electrocardiograms and electroencephalograms. Although various studies have been conducted in adults [24], their value in the neonatal population warrants further research.
Large language models (LLMs) are based on the transformer architecture and are trained on vast amounts of text data. They can be utilized to summarize unstructured notes in EMRs or to translate complex information into explanations that are easy for patients and their caregivers to understand. Recently, a study evaluated the concordance of LLMs such as ChatGPT-4 with standard treatment guidelines for NEC; it suggested their potential for clinical application, while also highlighting the need for thorough validation [25].
APPLICATION OF AI IN MAJOR NEONATAL DISEASES AND FIELDS
Neonatology AI research focuses on real-time risk screening, prognosis prediction, and early diagnosis, with key areas including survival analysis, infection prediction, respiratory management, and assessment of neurodevelopmental outcomes. AI plays a crucial role in predicting sepsis, which often presents with nonspecific symptoms. There are successful cases, such as the 'Heart Rate Observation (HeRO) score,' which analyzes heart rate characteristics to reduce mortality, and various ML models that provide risk alerts 12 to 24 hours before the onset of clinical symptoms. Recently, a deep neural network model demonstrated high performance in mortality prediction (area under the curve, 0.923) [26], and an eXtreme Gradient Boosting (XGBoost) model provided an early warning, on average, 10 hours before diagnosis [12]. For NEC and BPD, active research is underway to diagnose NEC early by integrating data such as heart rate variability or multimodal data (e.g., stool microbiome) [27], and to predict the risk of BPD by combining perinatal clinical data with genomic data [28]. In the case of retinopathy of prematurity, deep-learning-based automated analysis of retinal images supplements manual examinations by specialists, enhancing diagnostic efficiency and consistency [29]. In the form of Clinical Decision Support Systems (CDSS), AI optimizes personalized oxygen therapy and total parenteral nutrition prescriptions, and predicts the need for invasive mechanical ventilation more accurately than conventional methods [30].
Current studies show a trend of primarily using XGBoost for structured data, RNN/LSTM for time-series signals, and CNN for images. Although the model performance is encouraging, most studies have been limited to single-center retrospective validation [12], with only a few multicenter studies [26,31]. Future research must focus on demonstrating clinical utility through multicenter external validation and prospective clinical trials, while also addressing data sharing, ethical issues, and explainability for application in real-world clinical settings [32].
The applications of AI in neonatology can be summarized into four main areas. First, it provides medical insights that surpass human intuition by analyzing vast, high-dimensional data generated in the NICU [33]. Second, in the face of severe healthcare workforce shortages, it enhances operational efficiency by automating repetitive tasks such as monitoring, alerts, and document summarization, thereby allowing medical staff to focus on the care of high-risk patients. Third, it enhances the quality of care by enabling early risk detection and establishing rapid response systems through real-time data analysis [34]. Finally, it enables the implementation of personalized precision medicine through a CDSS, which suggests treatments tailored to the characteristics of individual patients [35].
KOREAN NICU-RELATED MEDICAL AI RESEARCH TRENDS AND POLICIES
Under Korean government leadership, the research and development of medical AI is being actively promoted, with specific research projects underway that directly impact the improvement of the critical care medical environment. These trends can be broadly examined in two categories: R&D projects and infrastructure establishment, and regulatory and institutional support.
Several national projects are in progress to establish a data infrastructure and develop technology that form the foundation of critical care AI research. The Korean Specialized Big Data for Critical Care (K-MIMIC) and AI CDSS Development Project (2021–2025), led by the Ministry of Health and Welfare, aims to build a high-quality big data infrastructure specialized for the Korean population by integrating critical care data from adults, children, and newborns. These results are expected to be utilized as foundational data for developing specialized AI models and CDSS for critical care in the future. The 'Korean Advanced Research Projects Agency for Health (ARPA-H) Project,' benchmarked against the U.S. ARPA-H, has selected 'Essential Healthcare Innovation' as one of its five core missions under the leadership of the Ministry of Health and Welfare and the Korea Health Industry Development Institute. This is expected to directly contribute to solving difficult challenges such as improving critical care management systems. Furthermore, the 'AI-based Intensive Critical Care Management and Transfer Optimization System' project (2025–2028) will develop an AI monitoring platform that will integrate and analyze multicenter critical care data in real time. Efforts have also been made to resolve healthcare disparities. The 'Korean electronic-intensive care unit (e-ICU) Establishment Project' utilizes information and communication technology (ICT) to enable specialists at hub hospitals to remotely provide consultations for regional hospital NICUs. This can provide substantial help in raising the standard of care in regional hospitals, where specialists are critically lacking. While various R&D projects are underway to address issues related to critical care and workforce shortages, most are targeted at adults and have a low direct relevance to neonates.
An institutional foundation to ensure that innovative AI technologies can be safely introduced into clinical practice is also being established. The 'Digital Medical Products Act,' effective from January 2025, provides the legal basis for the swift market entry and safe use of AI-based medical devices (SaMD). In January 2025, the Ministry of Food and Drug Safety established and published the world's first 'Guidelines for Approval and Review of Generative AI Medical Devices.' These guidelines clarify the safety and efficacy evaluation criteria for generative AI in areas such as medical image interpretation, diagnostic assistance, and treatment planning. They also specifically address potential risk factors such as data bias, lack of accuracy, and ethical issues, as well as considerations for the approval and review process. These guidelines are expected to provide crucial directions for the development and commercialization of innovative medical AI technologies that utilize LLMs. Furthermore, in May 2025, the Ministry of Food and Drug Safety revised the Guidelines for Designing Clinical Trial Methods for AI-Applied Digital Medical Devices, presenting methods for designing clinical trials to validate the efficacy of AI medical devices. These include considerations for designing retrospective and prospective clinical studies, establishing a reference standard, and methods for performance comparison. However, in the neonatal field, medical devices that apply AI technology remain limited and require continuous interest and effort for development and translation into clinical practice.
MEASURES FOR ESTABLISHING AI RESEARCH INFRASTRUCTURE USING NICU CLINICAL DATA
Multimodal data refer to the integration of various types of medical data, such as text, images, and numerical values, to gain a multifaceted and comprehensive understanding of an individual patient's condition. The NICU is a data-intensive environment where various forms of multimodal data are generated and collected to assess an infant's condition from multiple perspectives. First, time-series data, such as oxygen saturation, heart rate, and blood pressure, are measured in real time using patient-monitoring devices. Blood test results and treatment records constitute important structured data. In addition, medical images, such as chest X-rays, cranial ultrasound, and near-infrared spectroscopy, are representative of unstructured data, providing both structural and functional information about internal organs. Furthermore, text data such as admission/discharge summaries, progress notes, and nursing records written by medical staff contain rich narrative information about the patient's condition and treatment course. Recently, new forms of data have been actively acquired through wearable devices and sensors that detect subtle movements within an incubator or continuously monitor the body temperature.
It is essential to standardize and integrate data from multiple institutions to develop a reliable AI model. An AI model developed using data from a single hospital has a high risk of overfitting to the characteristics of that specific institution. Training a model on multicenter data from diverse environments enhances its external applicability and generalizability. A model that has undergone multicenter validation will have higher acceptance in clinical practice and can be used as evidence for future policymaking and insurance reimbursement. If hospitals use different data formats, units, and codes, the AI model cannot recognize the data correctly. Standardization, such as unifying variable definitions and event criteria using a common data model (CDM), must be performed to ensure the reproducibility and scalability of research.
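The standardization step described above can be illustrated with a minimal Python sketch that maps site-specific variable codes and units onto one shared schema. The site names, local codes, and the `StandardRecord` structure are hypothetical and are not taken from any actual CDM specification.

```python
from dataclasses import dataclass


@dataclass
class StandardRecord:
    patient_id: str
    variable: str  # shared variable name
    value: float   # value expressed in the shared unit
    unit: str


# Hypothetical mapping: local code -> (shared variable, shared unit, conversion factor).
SITE_MAPPINGS = {
    "hospital_a": {"BW_G": ("birth_weight", "g", 1.0)},
    "hospital_b": {"birthwt_kg": ("birth_weight", "g", 1000.0)},
}


def standardize(site, patient_id, local_code, value):
    """Convert one site-specific measurement into the shared schema."""
    try:
        variable, unit, factor = SITE_MAPPINGS[site][local_code]
    except KeyError:
        # Unmapped codes are rejected rather than silently passed through.
        raise ValueError(f"no mapping for {site}/{local_code}") from None
    return StandardRecord(patient_id, variable, value * factor, unit)


a = standardize("hospital_a", "A-001", "BW_G", 980.0)
b = standardize("hospital_b", "B-001", "birthwt_kg", 0.98)
print(a)
print(b)
```

After conversion, both records describe birth weight in grams under the same variable name, so a model trained on pooled data sees one consistent feature rather than two incompatible ones.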
Among the major NICU datasets, Korea has the Korean Neonatal Network (KNN), Pediatric CDM Establishment Project, and K-MIMIC Dataset. Internationally, the Medical Information Mart for Intensive Care (MIMIC)-III dataset includes NICU information. The KNN is Korea's leading multicenter neonatal registry system and serves as a core infrastructure for neonatal research and QI activities [36]. It registers approximately 2,000 ELBWIs annually, encompassing 85% to 90% of all cases in Korea. As of October 2025, the registry has a cumulative cohort of over 25,600 premature infants. It ensures high data quality through web-based real-time validation and regular onsite monitoring [37]. Recently, enrollment criteria have been expanded to include extremely premature infants born at less than 32 weeks of gestation, and plans are underway to collaborate with international networks [36]. As part of the 'Lee Kun-hee Project for Conquering Pediatric Cancer and Rare Diseases,' an infrastructure called the Pediatric CDM Establishment Project is being established to integrate data on severe pediatric diseases from multiple hospitals based on a CDM. This project also involves standardizing NICU discharge record data. The K-MIMIC Dataset is a Korean critical care dataset developed through the 'Korean Specialized Big Data for Critical Care and AI·CDSS Development' project. It is benchmarked against the U.S. MIMIC dataset while reflecting Korea's clinical environment and EMR structure. This dataset also includes data from Korean NICUs and is expected to be publicly released starting in 2025, contributing to the vitalization of medical AI research in Korea.
Among the easily accessible international open datasets for critical care, the MIMIC dataset includes NICU data. It consists of de-identified electronic health record data from critically ill patients at the Beth Israel Deaconess Medical Center in Boston, USA. It is available to researchers worldwide after completion of the required training program. Notably, MIMIC-III includes not only adult patients but also the admission records of 7,870 newborns (from 2001 to 2008), which has facilitated various studies, including those on neonatal mortality and sepsis prediction [38].
Future directions for establishing a NICU data infrastructure may involve leveraging and expanding existing platforms. One approach is to expand the KNN-based multimodal data platforms. Building on the strengths of the existing KNN cohort, it can be developed into a neonatal-specific multimodal data platform that integrates diverse data types such as biosignals, images, and EMRs, while establishing robust systems for data quality management and real-time collection. Another strategy involves utilizing data based on CDM and federated learning. Data standardization between Korean and international hospitals can be promoted through a CDM. Simultaneously, federated learning models can be actively applied to multicenter collaborative research, while protecting sensitive personal information.
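The idea behind federated learning, in which each hospital trains on its own data and only model parameters are shared, can be sketched as follows. This is a minimal illustration of weighted parameter averaging (commonly called federated averaging) on synthetic one-feature data; the site names, data, learning rate, and number of rounds are all assumptions, and a production system would add secure aggregation and privacy safeguards.

```python
def local_update(weights, data, lr=0.1, epochs=20):
    """Gradient descent on y = w*x + b using only one site's local data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(site_weights, site_sizes):
    """Average site parameters, weighting each site by its sample count."""
    total = sum(site_sizes)
    w = sum(wt[0] * n for wt, n in zip(site_weights, site_sizes)) / total
    b = sum(wt[1] * n for wt, n in zip(site_weights, site_sizes)) / total
    return (w, b)


# Synthetic data following y = 2x + 1; raw records never leave their site.
site_data = {
    "site_a": [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    "site_b": [(0.5, 2.0), (1.5, 4.0)],
}

global_weights = (0.0, 0.0)
for _ in range(50):  # communication rounds
    updates = [local_update(global_weights, d) for d in site_data.values()]
    sizes = [len(d) for d in site_data.values()]
    global_weights = federated_average(updates, sizes)

print(global_weights)
```

Only the pair of parameters travels between the sites and the coordinator in each round, which is what allows multicenter training without pooling patient-level records.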
In addition, a 'Smart NICU' equipped with a real-time monitoring system that utilizes the latest Internet of Things (IoT) and wearable sensor technologies can be implemented. The automation of noninvasive and continuous data collection should be considered by incorporating innovative technologies such as blockchain-based security frameworks, smart textile integrated systems, wearable healthcare systems, and real-time incubator monitoring. Furthermore, efforts should be made to enhance the precision of diagnosis and prediction using new neuromonitoring technologies and the analysis of environmental factors [20]. Moreover, for the development of customized QI programs tailored to NICU levels, it is necessary to establish QI projects that consider the characteristics of each level and to develop QI indicators and a monitoring system using multimodal data [39]. Accordingly, specific guidelines for disease prevention and care standardization should be established, and feedback and educational programs should be implemented.
STRATEGY FOR BUILDING AN INTERDISCIPLINARY COLLABORATION MODEL FOR AI RESEARCH USING NICU DATA
AI research in neonatology necessitates collaborative efforts among the industry, academia, research institutes, and hospitals to establish a comprehensive ecosystem encompassing technology development, clinical validation, and commercialization. Each entity has a distinct role: industry is responsible for technology commercialization, academia for basic research and talent development, research institutes for translational research, and hospitals for defining clinical problems, securing data, and conducting validations.
For successful collaboration, it is crucial to identify joint projects with clinicians at the center, involving all stakeholders from the problem-definition stage. Subsequently, a virtuous cycle must be created, in which a prototype is developed through data-sharing agreements and the operation of a joint research platform, and is then validated in hospitals and mass-produced by the industry. Such successful models can be reflected in government R&D policies and expanded nationwide.
As an example, in Korea, the 'Development of a Smart Incubator Platform for Neonatal Telemedicine' is being carried out as a national project under the 'Regional Medical Innovation R&D Project,' supervised by the Ministry of Health and Welfare and the Korea Health Industry Development Institute for 5 years from 2025 to 2029. This project involves numerous collaborating institutions, including Jeonbuk National University Hospital and Jeonbuk National University, specialized companies in customized telemedicine, and neonatal incubator manufacturers. The project aims to provide rapid remote medical consultations and AI-based decision-making technology for newborns in emergencies. The incubator developed in this project is expected to be beneficial in clinical practice. Moreover, the project can serve as a model for the nationwide expansion of other AI system development projects for predicting neonatal diseases.
A clinician-centered approach to project execution is the key to successful collaboration. R&D projects that address the unmet needs of the clinical field must be jointly developed by involving all stakeholders from the problem-definition stage. Following this, a virtuous cycle should be established through memoranda of understanding (MOUs) for data sharing and utilization, as well as by operating joint research platforms to create prototypes, which can then be validated by medical institutions and mass-produced by the industry. This successful model can serve as a foundation for effectively responding to government R&D policies and scaling up successful cases from regional hub hospitals to the national level.
FUTURE DIRECTIONS FOR NICU RESEARCH
1. Paradigm shift in NICU AI research: from static prediction to dynamic intervention recommendation
Existing AI prediction models in the NICU have been limited to presenting static risks, such as ‘85% risk of BPD development,’ and have fallen short of guiding specific intervention timings or methods. Future AI research must evolve beyond static predictions to become a dynamic intervention recommendation system that suggests optimal treatments in real time according to changes in the patient's condition. This signifies a paradigm shift from passive prediction to active clinical decision support.
The core strategies to achieve this are as follows. First, the patient's risk trajectory must be dynamically tracked by integrating multimodal NICU data in real time. Second, a dynamic intervention recommendation system should be developed to present immediate response scenarios when signs of deterioration are detected. Third, an intervention simulation function (What-if Analysis) should be introduced to predict the outcomes by applying virtual intervention scenarios. Fourth, a patient-customized recommendation algorithm should be implemented to suggest optimal treatments based on individual patient characteristics.
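As a conceptual illustration of the intervention simulation (What-if Analysis) and recommendation steps, the following Python sketch scores a patient state under several candidate scenarios and selects the one with the lowest predicted risk. Every name, feature, and coefficient here is hypothetical and has no clinical validity; in particular, overriding inputs of a purely predictive model does not establish a causal treatment effect, so a real system would require a causally valid, prospectively validated model.

```python
import math

# Hypothetical coefficients for illustration only; not derived from any study or dataset.
COEFFS = {"severity_score": 1.2, "support_level": -0.9}
INTERCEPT = -1.0


def risk(state):
    """Toy logistic risk model: maps a patient state to a probability in (0, 1)."""
    z = INTERCEPT + sum(COEFFS[k] * v for k, v in state.items())
    return 1.0 / (1.0 + math.exp(-z))


def what_if(state, scenarios):
    """Predict risk under each virtual scenario (a dict of feature overrides)."""
    return {name: risk({**state, **changes}) for name, changes in scenarios.items()}


def recommend(state, scenarios):
    """Return the scenario with the lowest simulated risk, plus all outcomes."""
    outcomes = what_if(state, scenarios)
    best = min(outcomes, key=outcomes.get)
    return best, outcomes


# Hypothetical current state and candidate scenarios.
state = {"severity_score": 1.5, "support_level": 0.0}
scenarios = {
    "no_change": {},
    "increase_support": {"support_level": 1.0},
}
best, outcomes = recommend(state, scenarios)
print(best, outcomes)
```

Re-running `recommend` each time new monitoring data update `state` would turn this static comparison into the kind of dynamic, trajectory-aware recommendation loop described above.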
2. Long-term development direction: building a smart NICU and a clinician-led ecosystem
In the long term, the goal is to build a Smart NICU that realizes precision, automation, and remote capabilities in neonatal care through advanced AI technology, IoT-based monitoring, and real-time data integration. To achieve this, neonatal medical staff must take the lead in creating a research ecosystem. A specialized neonatal dataset that is distinct from other fields must be established, and a NICU-centered industry–academia–research–hospital consortium should be formed to lead government projects. Furthermore, collaboration with global neonatal networks should be strengthened to expand data standardization and joint research.
CONCLUSION
In the field of neonatology, AI is a powerful tool for addressing critical challenges faced by the Korean neonatal medical community, including an increase in high-risk newborns and a shortage of medical personnel. Its potential, from predicting major diseases such as sepsis and BPD to optimizing treatment, has been demonstrated in numerous studies. However, to turn this potential into a reality, neonatal medical professionals must take the lead in establishing research directions. Korean data infrastructures, such as the KNN and K-MIMIC, must be actively utilized, and technologies tailored to the needs of the clinical field must be developed through a collaborative system involving industry, academia, research, and hospitals. We must now move beyond static risk prediction to usher in the era of the 'Smart NICU,' which dynamically suggests the most appropriate treatments based on real-time data. Through these efforts, AI will play a crucial role in saving the lives of newborns, who are the world's most vulnerable patients, and will ultimately provide them with a healthy future.
Notes
Ethical statement
None
Conflicts of interest
No potential conflict of interest relevant to this article was reported.
Author contributions
Conception or design: H.H.K.
Acquisition, analysis, or interpretation of data: H.H.K.
Drafting the work or revising: H.H.K.
Final approval of the manuscript: H.H.K.
Funding
None
Acknowledgments
This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (RS-2023-00236157, RS-2025-02313278).
