Development and clinical application of an electronic health record quality control system for pulmonary aspergillosis based on guidelines and natural language processing technology
Original Article

Zhengtu Li1#, Xidong Wang1#, Mengke Xu2#, Yongming Li1, Yinguang Wang2, Yijun Chen1, Shaoqiang Li1, Zhun Li1, Jinglu Yang1, Chun Tang1, Fangshu Xiong2, Wenhua Jian1, Peimei He2, Yangqing Zhan1, Jinping Zheng1, Feng Ye1

1State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China; 2Guangzhou Tianpeng Technology Co., Ltd., Guangzhou, China

Contributions: (I) Conception and design: Zhengtu Li, X Wang, Y Li; (II) Administrative support: F Ye, J Zheng, Y Wang; (III) Provision of study materials or patients: S Li, Y Zhan; (IV) Collection and assembly of data: Zhun Li, J Yang, C Tang, W Jian; (V) Data analysis and interpretation: M Xu, Y Chen, F Xiong, P He; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Prof. Feng Ye, MS. The First Affiliated Hospital of Guangzhou Medical University, 151 Yanjiang Xi Road, Guangzhou 510120, China. Email: tu276025@gird.cn; yefeng@gird.cn.

Background: There are considerable differences in the diagnosis and treatment of pulmonary aspergillosis (PA) between specialized hospitals and primary hospitals or developed areas and underdeveloped areas in China. There is a lack of electronic systems that assist respiratory physicians in standardizing the diagnosis and treatment of PA.

Methods: We extracted 26 quality control points from the latest guidelines related to PA and developed an electronic health record (EHR) quality control system for PA based on natural language processing (NLP) techniques. We obtained PA patient records from the Department of Respiratory Medicine of the First Affiliated Hospital of Guangzhou Medical University to verify the effectiveness of the system by comparing its results with manual evaluation by respiratory experts.

Results: We successfully developed a quality control system for PA. A total of 699 PA medical records from the EHR of the First Affiliated Hospital of Guangzhou Medical University between January 2015 and March 2020 were obtained and assessed by the system; 162 defects were found, which included 19 medical records with diagnostic defects, 76 medical records with examination defects, and 80 medical records with treatment defects. A total of 200 medical records were sampled for validation; compared with experts’ evaluation, the sensitivity and accuracy of the quality control system for pulmonary aspergillosis (QCSA) were 0.99 and 0.96, the F1 value was 0.85, and the recall rate was 0.77.

Conclusions: Our system successfully uses medical guidelines and NLP technology to detect defects in the diagnosis and treatment of PA, which helps to improve the management quality of PA patients.

Keywords: Electronic health records (EHR); natural language processing (NLP); pulmonary aspergillosis (PA); guideline-based quality control system


Submitted Apr 18, 2022. Accepted for publication Aug 19, 2022.

doi: 10.21037/jtd-22-532


Introduction

Aspergillosis is the general term for a class of diseases caused by nearly 50 pathogenic or allergenic species of Aspergillus. Aspergillus conidia are the main pathogenic fungi, and they can induce a pulmonary fungal infection called pulmonary aspergillosis (PA). PA has been recognized as a major public health problem in recent years, with a global disease burden of up to 15 million, and it has become the most common fungal infection (1,2). Colonization of the lower respiratory tract is also common. Studies have shown that 37% of lung biopsy specimens from healthy adults contain Aspergillus DNA (3). In addition, the lack of specific clinical symptoms and imaging evidence results in many missed diagnoses and misdiagnoses in patients with invasive pulmonary aspergillosis (IPA). One study showed that the mortality rate in underdiagnosed patients could reach 100%, and the rate of definite diagnosis at autopsy in patients with malignant tumours was 31% (4).

The Infectious Diseases Society of America (IDSA), European Society of Clinical Microbiology and Infectious Diseases (ESCMID)/European Confederation of Medical Mycology (ECMM)/European Respiratory Society (ERS), and European Organisation for Research and Treatment of Cancer (EORTC)/Mycoses Study Group (MSG) updated their guidelines in 2016, 2017, and 2019, respectively (5-7). Guidelines can guide clinical practice better. If the standards established by the IDSA and ESCMID/ECMM/ERS guidelines are followed consistently, most missed diagnoses and misdiagnoses of PA infection can be avoided in clinical practice. One study showed that the positive rate of sputum examination, which is the easiest test to carry out, was only 33.3% (8). The galactomannan (GM) antigen test is a quick and useful diagnostic tool for invasive PA, but in many countries/regions, including many rural areas in China, it is not available for patients who may develop PA [e.g., cancer or intensive care unit (ICU) patients] (9). Therefore, standardizing the diagnosis and treatment process and performing quality control at the medical record level may be the simplest, most effective and most economical solution to this problem.

Since 2000, improved natural language processing (NLP) techniques have enabled researchers to automatically identify these problems in clinical documents (10). Recently, Liang et al. designed an AI-based system using machine learning to extract clinically relevant features from electronic health record (EHR) notes to mimic the clinical reasoning of human physicians (11). An EHR-based predictive model was designed by Wang et al. to estimate the distant recurrence probability of breast cancer patients (12). In 2018, Shickel et al. published a survey and found a variety of deep learning techniques and frameworks being applied to several types of clinical applications, including information extraction, representation learning, outcome prediction, phenotyping, and de-identification (13). However, from the perspective of application scenarios, studies using medical guidelines for EHR quality control are still rare.

In this study, we developed an EHR-based quality control system for pulmonary aspergillosis (QCSA) that automatically extracts inpatients’ clinical manifestations, laboratory test results, imaging results, diagnoses and medications using NLP and then uses a knowledge graph based on guidelines for PA to identify any diagnostic or treatment problems. We present the following article in accordance with the STARD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-22-532/rc).


Methods

Overview

In 2018, a project was launched to develop the QCSA system, which checks EHR data to evaluate the diagnosis and treatment of PA patients. The project included four procedures: (I) selection of quality control points based on PA guidelines; (II) data structuration and normalization; (III) construction of functional modules of the QCSA system; and (IV) testing and validation of QCSA. Respiratory physicians from the First Affiliated Hospital of Guangzhou Medical University, and informatics technicians and software engineers from Guangzhou Tianpeng Technology Co., Ltd. were involved in this project.

Selection of quality control points based on PA guidelines

The QCSA was developed based on the guidelines to assist clinicians in judging the rationality of diagnosis and treatment, to remind clinicians of defects when the diagnosis and treatment principles of the guidelines are violated, and thereby to standardize the diagnosis and treatment of PA. Two experienced respiratory experts thoroughly studied and analyzed guidelines related to PA, such as the 2016 IDSA (5), 2017 ESCMID/ECMM/ERS (6), and 2019 EORTC/MSG (7) guidelines, to obtain key information on diagnosis, treatment and examination. Guideline recommendations for diagnosis and treatment that were supported by high-level evidence were extracted to form quality control points. We extracted the entities of diagnosis, treatment and examination that appear in the quality control points, mapped them to the ontology library for standardization, and then formed a standardized quality control logic. An example of how quality control points are used to find defects in a PA patient’s electronic record is shown in Figure 1.

Figure 1 An example of how to use quality control points to find defects of PA patients’ electronic record. The example shows when a patient is diagnosed with IA and started treatment with voriconazole, the QCSA system is triggered, uses the quality control point of “Judge whether the physician has measured the trough serum concentration of voriconazole within 5 days after starting treatment”, and extracts relevant medical record information within 5 days after treatment to determine whether the defect exists and output reminder information (see Table S1, Table S2, https://cdn.amegroups.cn/static/public/jtd-22-532-1.xlsx, https://cdn.amegroups.cn/static/public/jtd-22-532-2.xlsx for all quality control points). IA, invasive aspergillosis; QC, quality control; PA, pulmonary aspergillosis; QCSA, quality control system for pulmonary aspergillosis.
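The quality control point in Figure 1 can be read as a small rule: a trigger (an IA diagnosis plus the start of voriconazole) followed by a check over a time window. The following is a minimal sketch of that logic, not the QCSA implementation; the class, the function name, and the normalized test name "voriconazole trough concentration" are hypothetical, and the sketch does not consider whether the 5-day window has already elapsed.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class LabTest:
    name: str        # normalized test name (hypothetical vocabulary)
    performed: date  # date the test was carried out


def missing_trough_check(diagnoses, drug_start, lab_tests, window_days=5):
    """Return True when this quality control point flags a defect."""
    # Trigger: an IA diagnosis together with a voriconazole start date.
    start = drug_start.get("voriconazole")
    if "IA" not in diagnoses or start is None:
        return False  # rule not triggered, so no defect from this point
    deadline = start + timedelta(days=window_days)
    # Look for a trough concentration measured inside the window.
    measured = any(
        test.name == "voriconazole trough concentration"
        and start <= test.performed <= deadline
        for test in lab_tests
    )
    return not measured
```

A record with no trough measurement, or with one taken only after the window, would be flagged; a record without the triggering diagnosis would not be evaluated by this point at all.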

Data structuration and normalization

For each patient, data were extracted from 6 EHR systems: the hospital information system (HIS), electronic medical record (EMR), picture archiving and communication system (PACS), laboratory information system (LIS), pathology system, and ultrasound system. QCSA standardizes the structured data from HIS and LIS using terminology standards such as ICD10, ICD9-CM3, Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) and the common clinical medical terms (2019 edition) issued by the National Health Commission of the PRC (details in Table S3). Free-text data from EMR, PACS and pathology/ultrasound reports were transformed into structured data by named entity recognition (NER) (11,12). To improve the effectiveness of free-text information extraction, a sequence annotation model for NER was used, namely bidirectional long short-term memory (BiLSTM) combined with a conditional random field (CRF), an approach that has been confirmed by many studies to achieve good results (14-17).
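A sequence annotation model such as BiLSTM-CRF assigns one tag to each token, and the tags then have to be grouped into entities before they can be normalized. The sketch below shows only that grouping step, assuming the common BIO tagging scheme (the source does not state which scheme QCSA uses); the tag names DRUG and DIAG and the function name are hypothetical, and English tokens are used for readability.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Close the entity being built, if any, and reset the buffer.
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            flush()  # an "O" tag or an inconsistent I- tag ends the entity
    flush()
    return entities
```

For example, the tokens "started voriconazole for invasive aspergillosis" tagged O, B-DRUG, O, B-DIAG, I-DIAG would yield one drug entity and one two-token diagnosis entity, which could then be mapped to standard terminology.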

Construction of functional modules of the QCSA system

QCSA includes three functional modules: the system home page and cockpit (Module 1), quality control details of every electronic medical record (Module 2), and rule setting (Module 3). Module 1 mainly shows the list of hospitalized and discharged PA patients and the statistical results of quality control. Module 2 presents the detailed quality control results of each record and provides reminders and adjustment suggestions. Module 3 displays the detailed quality control rules and allows them to be switched on or off.

Testing and validation of QCSA

QCSA was installed in the EHR system of the First Affiliated Hospital of Guangzhou Medical University to verify its effects. According to the ICD10 and SNOMED CT codes associated with the main diagnosis on the first page of the medical record, we retrospectively obtained 699 PA patient records from 71,793 inpatient records in the Department of Respiratory Medicine of the First Affiliated Hospital of Guangzhou Medical University between January 2015 and March 2020. A team of 3 experienced respiratory clinicians reviewed, sampled, and cross-checked the data. The three experts divided the sampled medical records so that each medical record was verified by two individuals. The verification results of the two reviewers were compared for consistency; any inconsistency was discussed by the three experts until a final conclusion was reached. The expert verification results were used as the standard answers to test the effectiveness of QCSA.

Statistical analysis

We used descriptive analysis to understand the characteristics of the overall PA medical records. Since most medical data were non-normally distributed, statistics were performed in terms of gender distribution, median age, median number of hospital days, and PA classification of the population in the medical records. Accuracy, recall and F1 score were used to evaluate the model. The F1 score is a statistical measure of the performance of dichotomous models. It takes into account both the precision and the recall of the classification model, and because the positive and negative samples from the system quality control are not balanced, the effect of the system can be evaluated more objectively by the F1 value. We calculated the F1 score according to the following formulas:

F1 = 2 × (precision × recall) / (precision + recall)

Precision = true positives / (true positives + false positives)

Recall = true positives / (true positives + false negatives)

In order to ensure that the sampling results were representative, we drew a certain number of samples from each diagnostic classification as the validation sample. We excluded 284 diagnostically untyped samples because QCSA set no control points for them. In addition, since none of the 699 medical records were typed as Aspergillus nodule (AN), subacute invasive aspergillosis (SAIA) or chronic necrotizing pulmonary aspergillosis (CNPA), the 6 quality control points related to these types could not be included in the verification. Considering the verification workload of the experts, validation samples were selected from each classification at a proportion of about 0.48, and a total of 200 medical records were verified. As described above, a team of 3 experienced respiratory clinicians reviewed, abstracted, and cross-checked the data, with each medical record verified by two individuals (details in Table S4). Inconsistencies between the two reviewers were discussed by the three experts until a final conclusion was reached, and the expert verification results were used as the standard answers to test the effectiveness of QCSA.
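The per-classification draw described above can be sketched as a simple stratified random sample. This is an illustration under stated assumptions rather than the procedure the authors ran: the function name, the fixed seed, and the rounding rule are hypothetical, and the source gives only the approximate proportion of 0.48.

```python
import random


def stratified_sample(records_by_type, proportion=0.48, seed=0):
    """Draw about `proportion` of the records from every diagnostic type."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = {}
    for dx_type, record_ids in records_by_type.items():
        # Assumed rounding rule: nearest integer, at least one record
        # per non-empty type, never more than the type contains.
        k = max(1, round(len(record_ids) * proportion))
        k = min(k, len(record_ids))
        sample[dx_type] = rng.sample(record_ids, k)
    return sample
```

Sampling within each type, rather than from the pooled records, keeps small classifications represented in the validation set.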

We used the expert verification results as the gold standard to count the false positives, false negatives, true positives, and true negatives of the QCSA results, constructed a confusion matrix, and calculated the overall precision, recall, F1 and other evaluation indicators to evaluate the system objectively. In addition, we calculated the evaluation indicators for each PA type in order to analyze the quality control effect of the system across types.
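Counting the four cells of the confusion matrix from paired judgments can be sketched as follows, with the expert judgment treated as the gold standard. The function name and the boolean encoding (True meaning "defect present" for one record and one quality control point) are assumptions made for illustration.

```python
from collections import Counter


def confusion_counts(expert, system):
    """Count TP, FP, FN and TN with the expert judgment as gold standard.

    `expert` and `system` are equal-length sequences of booleans,
    where True means "defect present".
    """
    if len(expert) != len(system):
        raise ValueError("expert and system judgments must be paired")
    counts = Counter()
    for gold, predicted in zip(expert, system):
        if gold and predicted:
            counts["TP"] += 1  # both found a defect
        elif predicted:
            counts["FP"] += 1  # system flagged, expert did not
        elif gold:
            counts["FN"] += 1  # expert flagged, system missed it
        else:
            counts["TN"] += 1  # both found no defect
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}
```

Running the same function on the judgments of a single PA type gives the per-type indicators mentioned above.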

Ethics approval

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by Ethics Committee of the First Affiliated Hospital of Guangzhou Medical University (ethical approval number: 2018-119) and informed consent was taken from all the patients.


Results

Presentation of QCSA functions

QCSA has chief physician (CP) and physician (P) roles. The system alerts the P role when it detects a deficiency in the patient’s medical record. The P role can view the alert details and adopt the quality control recommendations. If the P role considers the quality control information inaccurate, the specific situation can be reported as feedback. The CP role reviews these deficiencies to determine whether the P role is addressing them properly, decides which QC points are enabled or closed, and sets the weight score of each QC point so that the final score of each medical record can be calculated (details in Figure S1).

The QCSA system interface is shown in Figure 2. The main page of the system (Figure 2A) shows the list of patients under the user’s jurisdiction in order of admission time, including hospitalized and discharged patients. The system home page provides a patient search function and displays the number of defects per medical record. We set up a cockpit interface that summarizes the quality control results by time interval, which provides an intuitive understanding of the overall quality of the PA medical records (Figure 2B). In addition, users can click any bar, circle or number on the chart, and the system will jump to the list of the original PA records corresponding to that dataset. The CP role can also see the trend of defects in PA records over a past period, so as to improve the follow-up diagnosis and treatment of new patients.

Figure 2 Display of QCSA system interface. (A) System home page. This page is the first page after the user logs in, showing the PA patient information. The user can search the medical record by name, ID number, hospitalization number, etc. There is a colour mark on the right side of the medical record. Red indicates that the medical record is defective, and green indicates no defect. (B) Quality control cockpit, holistic analysis of medical records. This page shows the quality control statistics results of all medical records over time. (C) Quality control details. This page shows the specific situation of a single medical record, and the relevant quality control prompts and medical record scores are also shown on the right. (D) Rule setting. On this page, chief physician role can independently set which control points to enable or close. QCSA, quality control system for pulmonary aspergillosis; PA, pulmonary aspergillosis.

The quality control details page (Figure 2C) presents the detailed quality control results of each record and provides reminders and adjustment suggestions. The right side of the page shows the deficiency points found by the system, which are arranged according to the three modules of the diagnosis, treatment and examination. In addition, the positioning and feedback can be performed for the defect results to assist the physician in handling the defect.

The rule setting page (Figure 2D) shows the detailed quality control rules. CP role can decide which rules can be opened and which closed. For each control point rule, a weight value between 0–5 can be set. After the setting is completed, the system can automatically calculate the final score of each medical record, which is displayed on the right side of the page of Figure 2C.
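The source does not give the formula used to turn rule weights into a medical record score. Purely as an illustration, the sketch below assumes a deduction scheme in which each enabled rule that fires subtracts its weight from a full score of 100; the function name, the full score, and the deduction scheme itself are all assumptions, and only the 0 to 5 weight range and the enable/close switch come from the text.

```python
def record_score(rules, triggered, full_score=100):
    """Hypothetical scoring: subtract the weight of each enabled rule that fired.

    `rules` maps a rule id to an (enabled, weight) pair;
    `triggered` is the set of rule ids whose defect was detected.
    """
    deduction = 0
    for rule_id, (enabled, weight) in rules.items():
        if not 0 <= weight <= 5:
            raise ValueError(f"weight for {rule_id} must be between 0 and 5")
        # Closed rules never affect the score, even if their defect exists.
        if enabled and rule_id in triggered:
            deduction += weight
    return max(0, full_score - deduction)
```

Under this assumed scheme, closing a rule or setting its weight to 0 has the same effect on the score, but only closing it would also suppress the reminder.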

When hospitalized patients are diagnosed with PA, the system will automatically turn on quality control. The system captures all current and past records of patients from HIS, EMR, PACS, LIS, pathology system, and ultrasound system, matches the system quality control points after data processing, and reminds the physician of any deviant behaviour.

Results of testing and validation of QCSA

After the PA guidelines were studied and analyzed, a total of 26 key quality control points were obtained. The extracted data from EHR were normalized and compared with 26 key quality control points to identify the diagnosis or treatment points that did not meet the guidelines (details in Figure S2).

We performed a descriptive analysis of 699 medical records, which showed 460 men (65.8%) and 239 women (34.2%). The median age was 56 years, and the median length of hospital stay was 8 days. According to the diagnostic classification, the 699 PA medical records were divided into the following categories: 284 unspecified PA, 132 IPA, 121 chronic pulmonary aspergillosis (CPA), 78 allergic bronchopulmonary aspergillosis (ABPA), 44 chronic cavitary pulmonary aspergillosis (CCPA), 36 simple pulmonary aspergillosis (SA), and 2 chronic fibrosing pulmonary aspergillosis (CFPA). There were no SAIA, AN, or CNPA records (details in Table S5).

QCSA assessed 699 records of patients with pulmonary aspergillosis, of which 162 were deficient; 19 medical records had diagnostic defects, 76 had examination defects, and 80 had treatment defects. The distribution of the defects by disease classification was CPA (n=84), ABPA (n=34), IPA (n=29), SA (n=10), CCPA (n=9) and CFPA (n=2). It should be noted that a single medical record may contain multiple deficiencies. Of these records, 538 had 0 defects, 121 had 1 defect, 37 had 2 defects, and 3 had 3 defects.

By comparing the judgments of the quality control points made by the experts and by QCSA, we constructed a confusion matrix to evaluate whether the system quality control was consistent with the expert quality control. Because the QCSA system did not contain quality control points for unclassified PA, we excluded the 284 unclassified PA cases from the 699 records. As there were no medical records of AN, SAIA or CNPA among the 699 medical records, the quality control points for these classifications were not involved in this verification. We stratified the remaining 415 medical records by diagnostic classification, randomly selected 200 of them, and sent them to the expert team for evaluation (Figure 3). The 200 medical records included ABPA (n=37), SA (n=17), IPA (n=64), CCPA (n=19), CFPA (n=4) and CPA (n=59). For the diagnostic types participating in sampling verification, QCSA sets the number of quality control points as follows: ABPA (n=3), SA (n=1), IPA (n=7), CCPA (n=3), CFPA (n=2), CPA (n=4) (details in Table S1, Table S2, https://cdn.amegroups.cn/static/public/jtd-22-532-1.xlsx, https://cdn.amegroups.cn/static/public/jtd-22-532-2.xlsx). The expert team manually judged whether such defects existed in the corresponding medical records according to the quality control points set for each type. Compared with the experts’ evaluation, the sensitivity and accuracy of QCSA were 0.99 and 0.96, the F1 value was 0.85, and the recall rate was 0.77 (confusion matrix shown in Table 1; QCSA evaluation results for each classification and overall evaluation results of QCSA in Table S6, Table S7).

Figure 3 Data validation process. PA, pulmonary aspergillosis; QCSA, quality control system for pulmonary aspergillosis; ABPA, allergic bronchopulmonary aspergillosis; SA, simple pulmonary aspergillosis; IPA, invasive pulmonary aspergillosis; CCPA, chronic cavitary pulmonary aspergillosis; CFPA, chronic fibrosing pulmonary aspergillosis; CPA, chronic pulmonary aspergillosis.

Table 1 Confusion matrix of verification results between expert team and QCSA

| QCSA classification | Defective according to expert | No defect according to the expert |
|---|---|---|
| Defective according to QCSA | 8.665% | 0.342% |
| No defect according to QCSA | 2.622% | 88.369% |

Of the 99 items rated “Defective according to expert”, 76 were also rated “Defective according to QCSA” (8.665% of the total) and 23 were rated “No defect according to QCSA” (2.622% of the total). Of the 778 items rated “No defect according to the expert”, 3 were rated “Defective according to QCSA” (0.342% of the total) and 775 were rated “No defect according to QCSA” (88.369% of the total). QCSA, quality control system for pulmonary aspergillosis.
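As a worked check of the formulas in the statistical analysis section, the counts given in the note under Table 1 (76 true positives, 3 false positives, 23 false negatives, 775 true negatives) can be substituted directly. The total of 877 items appears to correspond to one judgment per sampled record and applicable quality control point (37×3 + 17×1 + 64×7 + 19×3 + 4×2 + 59×4 = 877), although the source does not state this explicitly. The function name below is illustrative only.

```python
def precision_recall_f1(tp, fp, fn):
    """Apply the precision, recall and F1 formulas to confusion matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f1


# Counts taken from the note under Table 1.
precision, recall, f1 = precision_recall_f1(tp=76, fp=3, fn=23)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

With these counts, precision is 76/79, recall is 76/99, and F1 is 152/178, which round to 0.96, 0.77 and 0.85, in line with the reported F1 value and recall rate.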


Discussion

There are considerable differences in the diagnosis and treatment of PA between specialized hospitals and primary hospitals and between developed and underdeveloped areas (18). A lack of up-to-date diagnostic and treatment skills results in wrong diagnoses and wrong approaches to the treatment of PA. Finding this difference is the basis for improving the diagnosis and treatment of PA (19,20). To date, there has been no effective or rapid way to determine possible problems in PA medical records other than having an experienced respiratory specialist manually cross-check the choice of treatment plans and the precise control of the treatment window period in the medical records. In this study, we developed the QCSA and used it to identify defects in PA records. QCSA can quickly and correctly identify problems of diagnosis and treatment, so it can be used to help doctors in lower-level hospitals or underdeveloped areas improve their clinical diagnosis and treatment of PA.

The biggest difference between QCSA and traditional quality control systems is that it is able to understand the semantic content of free-text data in the EHR rather than only performing formal quality control (21). We use NLP and standardization techniques, which greatly reduces the burden of manual inspection (22,23). To the best of our knowledge, there have been a few rule-based quality control studies of medical records, but AI research on the use of medical guidelines for disease quality control is still relatively rare (24). However, some systems or frameworks based on artificial intelligence and knowledge bases have been reported to help doctors make certain clinical decisions (11,25). Liang et al. proposed a data mining framework for EHR data that integrates prior medical knowledge and data-driven modelling (11). They developed a deep learning-based NLP system to extract clinically relevant information from 1,362,559 outpatient visits from 567,498 patients of the Guangzhou Women and Children’s Medical Center and subsequently established a diagnostic system based on the extracted clinical features. Across all levels of the diagnostic hierarchy, the diagnostic system achieved a high level of accuracy for the primary diagnoses predicted from the clinical features extracted by the NLP information model. Smith et al. developed the adverse drug effect recognizer (ADER), which could assist clinicians in detecting and addressing inpatients’ ongoing preadmission adverse drug reactions (ADRs) (26). Compared with controls, the ADER group more often withheld or discontinued suspected ADR-causing medications during the inpatient stay. All of the above studies show that a system based on AI and a knowledge base can provide decision support for clinicians. The QCSA we developed can also help doctors judge the rationality of the diagnosis, determine the diagnosis and treatment deficiencies based on the guideline suggestions and give reasonable treatment suggestions.

Compared with expert evaluation results, QCSA shows high sensitivity, accuracy and F1 score, which means that the system can effectively help doctors carry out quality control of the diagnosis and treatment of PA, standardize diagnosis and treatment behaviour, and achieve the purpose of homogenizing diagnosis and treatment. The recall rate was poor; when we analyzed the different PA subtypes, we found that the problems were concentrated in the CCPA subtype, which accounted for 73% of all false negatives. Detailed analysis showed that there are differences between the quality control rules of QCSA and the judgment of medical experts. QCSA can judge whether a patient has used voriconazole, but it cannot evaluate the route, dose, and timing of administration. Medical experts can judge not only whether it is appropriate for patients to use voriconazole, but also whether the dose, route and timing of medication are appropriate. This is a reminder that we should not rely entirely on guidelines when developing the system, but should also communicate with front-line clinical experts to understand their experience of diagnosis and treatment and thereby improve the comprehensiveness of the quality control rules. We will continue to enrich our quality control rules so that we can evaluate the details of patient medication in the future. In addition, due to the diversity of Chinese expression and great differences in data quality between hospitals in China, the information generated in real clinical scenarios also requires in-depth data governance and processing (27).

This study also has some limitations. First, QCSA was tested in only one hospital and was not tested for generalizability on a large scale; it needs to be applied in many hospitals in order to improve its universality. Second, the quality of source data can greatly affect the effectiveness of quality control of PA records, but improving the quality of source data in hospitals in China will take a long time. Third, we selected only some key quality control points, so the system may not fully reflect the quality of the diagnosis and treatment of PA patients. In addition, the real-time reminders of this system depend on the timeliness of data acquisition and processing in the hospital. However, the HIS, LIS, PACS and other systems of each hospital in China were developed and are operated by different manufacturers, which makes simultaneous access to real-time data from these systems very difficult. The most important reason why different hospital medical record systems in China cannot be effectively interconnected is that there is no authoritative and unified terminology standard, which makes clinical descriptions too diverse and non-standardized. The differences between foreign languages and Chinese mean that some terminology standards, such as SNOMED CT, are not well applied. This greatly affects data interaction and processing, increasing the difficulty of NLP. The good news is that the Chinese Health and Medical Commission is also constantly trying to launch a standardized medical terminology. Some domestic institutions, such as the Zhejiang Digital Medical and Health Technology Research Institute, which established the Omaha medical terminology system, are also committed to building a medical terminology system and its standardization. It is believed that in the future, medical data will become more standardized and semantically interoperable, and NLP algorithms will be more versatile, thereby enabling more intelligent application scenarios.

As a new technology, QCSA is a challenge for doctors. It is undeniable that the promotion of a new technology often encounters many problems, such as the time needed to learn the new technology, the required Internet equipment, and the maintenance and management of the system. Of course, the most important issue is that it also makes doctors spend longer seeing each patient. These problems are difficult but not insurmountable. Learning and technical difficulties can be resolved quickly through on-site and remote teaching. As for the consultation time, the literature shows that the average consultation time of American doctors is more than 20 minutes, ranking second, while the average consultation time of Chinese doctors is less than 5 minutes, ranking third from the bottom among the 67 countries. For fungal lung infections, which are rare and difficult to diagnose, prolonged communication with the patient is required. How to improve the system in terms of working efficiency and benefit to patients is also our next direction.


Conclusions

In brief, we developed the QCSA based on AI and PA guidelines. It can rapidly identify PA cases with defects in diagnosis and treatment and help improve the quality of management of PA patients. In the future, more quality control points of PA will be added. We will consider applying QCSA system to multiple hospitals to improve its universality. In addition, because the guidelines are continuously updated and the diagnosis and treatment of PA will also progress, the function of personalized configuration of quality control points will be added, so that the hospital can independently update the quality control points and return the quality control right to the doctors to the maximum extent.


Acknowledgments

We would like to thank the patients, along with the nurses and clinical staff who work in the hospital respiratory medicine departments. We thank the other staff from Guangzhou Tianpeng Technology Co., Ltd. Furthermore, we would also like to thank the AJE team for polishing the English language of this manuscript.

Funding: This research was sponsored by the Independent/Open Project of State Key Laboratory of Respiratory Disease (No. SKLRD-Z-202019, No. SKLRD-OP-201913) and the National Key R&D plan (No. 2018YFC1311900).


Footnote

Reporting Checklist: The authors have completed the STARD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-22-532/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-22-532/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-22-532/prf

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-22-532/coif). MX, YW, FX, PH report that they are employees of Guangzhou Tianpeng Technology Co., Ltd., Guangzhou, China. The other authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethics Committee of the First Affiliated Hospital of Guangzhou Medical University (ethical approval number: 2018-119), and informed consent was obtained from all the patients.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Amato-Gauci A, Ammon A. The First European Communicable Disease Epidemiological Report. European Centre for Disease Prevention and Control. 2007.
  2. Li Z, Li Y, Chen Y, et al. Trends of pulmonary fungal infections from 2013 to 2019: an AI-based real-world observational study in Guangzhou, China. Emerg Microbes Infect 2021;10:450-60.
  3. Denning DW, Park S, Lass-Florl C, et al. High-frequency triazole resistance found in nonculturable Aspergillus fumigatus from lungs of patients with chronic fungal disease. Clin Infect Dis 2011;52:1123-9.
  4. Chamilos G, Luna M, Lewis RE, et al. Invasive fungal infections in patients with hematologic malignancies in a tertiary care cancer center: an autopsy study over a 15-year period (1989-2003). Haematologica 2006;91:986-9.
  5. Patterson TF, Thompson GR 3rd, Denning DW, et al. Practice Guidelines for the Diagnosis and Management of Aspergillosis: 2016 Update by the Infectious Diseases Society of America. Clin Infect Dis 2016;63:e1-60.
  6. Ullmann AJ, Aguado JM, Arikan-Akdagli S, et al. Diagnosis and management of Aspergillus diseases: executive summary of the 2017 ESCMID-ECMM-ERS guideline. Clin Microbiol Infect 2018;24:e1-38.
  7. Donnelly JP, Chen SC, Kauffman CA, et al. Revision and Update of the Consensus Definitions of Invasive Fungal Disease From the European Organization for Research and Treatment of Cancer and the Mycoses Study Group Education and Research Consortium. Clin Infect Dis 2020;71:1367-76.
  8. Niu Y, Li J, Shui W, et al. Clinical features and outcome of patients with chronic pulmonary aspergillosis in China: A retrospective, observational study. J Mycol Med 2020;30:101041.
  9. Danion F, Rouzaud C, Duréault A, et al. Why are so many cases of invasive aspergillosis missed? Med Mycol 2019;57:S94-103.
  10. Dreisbach C, Koleck TA, Bourne PE, et al. A systematic review of natural language processing and text mining of symptoms from electronic patient-authored text data. Int J Med Inform 2019;125:37-46.
  11. Liang H, Tsui BY, Ni H, et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat Med 2019;25:433-8.
  12. Wang H, Li Y, Khan SA, et al. Prediction of breast cancer distant recurrence using natural language processing and knowledge-guided convolutional neural network. Artif Intell Med 2020;110:101977.
  13. Shickel B, Tighe PJ, Bihorac A, et al. Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis. IEEE J Biomed Health Inform 2018;22:1589-604.
  14. Lyu C, Chen B, Ren Y, et al. Long short-term memory RNN for biomedical named entity recognition. BMC Bioinformatics 2017;18:462.
  15. Chowdhury S, Dong X, Qian L, et al. A multitask bi-directional RNN model for named entity recognition on Chinese electronic medical records. BMC Bioinformatics 2018;19:499.
  16. Jauregi Unanue I, Zare Borzeshi E, Piccardi M. Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition. J Biomed Inform 2017;76:102-9.
  17. Greenberg N, Bansal T, Verga P, et al. Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels: Association for Computational Linguistics, 2018.
  18. Brown GD, Denning DW, Gow NA, et al. Hidden killers: human fungal infections. Sci Transl Med 2012;4:165rv13.
  19. Denning DW, Pleuvry A, Cole DC. Global burden of chronic pulmonary aspergillosis as a sequel to pulmonary tuberculosis. Bull World Health Organ 2011;89:864-72.
  20. Denning DW, Pleuvry A, Cole DC. Global burden of allergic bronchopulmonary aspergillosis with asthma and its complication chronic pulmonary aspergillosis in adults. Med Mycol 2013;51:361-70.
  21. Weiskopf NG, Weng C. Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. J Am Med Inform Assoc 2013;20:144-51.
  22. Datta S, Bernstam EV, Roberts K. A frame semantic overview of NLP-based information extraction for cancer-related EHR notes. J Biomed Inform 2019;100:103301.
  23. Kreimeyer K, Foster M, Pandey A, et al. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review. J Biomed Inform 2017;73:14-29.
  24. Wang Z, Talburt JR, Wu N, et al. A Rule-Based Data Quality Assessment System for Electronic Health Record Data. Appl Clin Inform 2020;11:622-34.
  25. Wu S, Roberts K, Datta S, et al. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc 2020;27:457-70.
  26. Smith JC, Chen Q, Denny JC, et al. Evaluation of a Novel System to Enhance Clinicians’ Recognition of Preadmission Adverse Drug Reactions. Appl Clin Inform 2018;9:313-25.
  27. Li X, Krumholz HM, Yip W, et al. Quality of primary health care in China: challenges and recommendations. Lancet 2020;395:1802-12.
Cite this article as: Li Z, Wang X, Xu M, Li Y, Wang Y, Chen Y, Li S, Li Z, Yang J, Tang C, Xiong F, Jian W, He P, Zhan Y, Zheng J, Ye F. Development and clinical application of an electronic health record quality control system for pulmonary aspergillosis based on guidelines and natural language processing technology. J Thorac Dis 2022;14(9):3398-3407. doi: 10.21037/jtd-22-532
