To improve the prognosis of lung cancer, timely and accurate diagnosis is crucial. Currently, the gold standard for lung cancer diagnosis is biopsy guided by thoracoscopy, bronchoscopy or CT. The major disadvantages of these tools are invasiveness and high cost. In addition, the accuracy of these diagnostic tools is greatly affected by the experience of operators and observers (1). Therefore, it is of great value to develop non-invasive and low-cost tools to detect lung cancer, such as blood tumor markers (2).
During the past decades, several blood tumor markers have been identified for lung cancer diagnosis, such as progastrin-releasing peptide (ProGRP) (3), cytokeratin 19 fragments (CYFRA 21.1) (4) and carcinoembryonic antigen (CEA) (5). However, the sensitivity and specificity of these tumor markers are far from satisfactory. A strategy that combines multiple tumor markers appears to be an effective tool for lung cancer diagnosis (6-8). Therefore, novel tumor markers urgently need to be developed and evaluated.
Human epididymis secretory protein 4 (HE4) has been regarded as a tumor marker for ovarian cancer for a long time (9,10). Interestingly, several studies have revealed that it is also a useful diagnostic marker for lung cancer (11-13), but the results of these studies are heterogeneous. Therefore, we performed a systematic review and meta-analysis to assess the diagnostic accuracy of HE4 for lung cancer.
Databases used for literature searching
This systematic review and meta-analysis was conducted following the PRISMA-DTA (Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies) guidelines (14) (Tables S1,S2). Three databases, PubMed, EMBASE and Web of Science, were searched up to January 1, 2019 to identify eligible studies. The search algorithm in PubMed was: (HE4 OR "Human Epididymis Protein 4" OR "WFDC2 protein, human"[nm]) and ("Lung Neoplasms"[mesh] OR "lung cancer" OR "lung carcinoma*" OR "lung tumor" OR "lung neoplasm*" OR "malignant lung disease*"). A similar search strategy was used for EMBASE and Web of Science. In addition, all references listed in eligible studies were manually searched.
All retrieved studies were imported into Endnote, a widely used literature management software, to remove duplicate publications. Two investigators independently reviewed the titles and abstracts of the retrieved studies to verify their eligibility. The inclusion criteria were: (I) studies investigating the diagnostic accuracy of blood HE4 for lung cancer; (II) studies in which both sensitivity and specificity were available to construct a two-by-two table. The exclusion criteria were: (I) animal studies; (II) studies not published in English; (III) studies with sample sizes of less than 10; (IV) case reports, conference abstracts and letters to the editor. For duplicate studies, only the study with sufficient information or the larger sample size was included. Any discrepancies between the two reviewers were resolved by consensus and full-text review.
Quality assessment and data extraction
We extracted the following data from the included studies: name of the first author, publication year, sources of the subjects, HE4 assays, reference standard for lung cancer diagnosis, sample sizes of lung cancer patients and controls, threshold and its corresponding sensitivity and specificity, area under the receiver operating characteristic (ROC) curve (AUC) and characteristics of the controls. Two-by-two tables were constructed from the sensitivity, specificity and sample sizes of lung cancer patients and controls in each eligible study. The formulas used to construct the two-by-two table were: true positive (TP) = number of lung cancer patients × sensitivity; true negative (TN) = number of controls × specificity; false negative (FN) = number of lung cancer patients × (1 − sensitivity); false positive (FP) = number of controls × (1 − specificity). In studies with both healthy individuals and patients with benign lung diseases (BLDs) as controls, if the healthy individuals could be removed from the final analysis, we constructed the two-by-two tables with BLDs only.
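The formulas above can be illustrated with a minimal Python sketch. The function name `two_by_two`, the rounding to whole subjects and the study figures in the usage line are assumptions made for illustration; the source does not state how fractional counts were handled.

```python
def two_by_two(n_cases, n_controls, sensitivity, specificity):
    """Reconstruct a two-by-two table from reported accuracy figures.

    Rounding to whole subjects is an assumption; the complementary cells
    are derived by subtraction so that the row totals are preserved.
    """
    tp = round(n_cases * sensitivity)      # TP = cases x sensitivity
    fn = n_cases - tp                      # FN = cases x (1 - sensitivity)
    tn = round(n_controls * specificity)   # TN = controls x specificity
    fp = n_controls - tn                   # FP = controls x (1 - specificity)
    return {"TP": tp, "FN": fn, "TN": tn, "FP": fp}


# Hypothetical study: 100 lung cancer patients and 80 controls.
print(two_by_two(100, 80, sensitivity=0.65, specificity=0.88))
```

Deriving FN and FP by subtraction rather than by a second multiplication keeps the reconstructed cells consistent with the reported sample sizes.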
The quality of eligible studies was assessed by the revised Quality Assessment for Studies of Diagnostic Accuracy tool (QUADAS-2) (15). Any discrepancies in quality assessment and data extraction were resolved by consensus.
The pooled sensitivity and specificity of HE4 were calculated using a bivariate model (16). A summary ROC (sROC) curve was used to estimate the overall diagnostic accuracy of HE4 (17). A funnel plot and Deeks's test were applied to assess potential publication bias (18). Subgroup analysis was performed to explore the sources of variability. We used Stata 13.0 (Stata Corp LP, College Station, TX, USA) with the midas command to perform all statistical analyses. Review Manager 5.3 was used to generate forest plots.
Summary of eligible studies
Figure S1 is a flowchart depicting the study selection process. Finally, 16 studies with 3,202 subjects (1,756 lung cancers and 1,446 controls) were identified (8,12,13,19-31). The studies performed by Yoon et al. (29) and Hertlein et al. (23) each enrolled two cohorts; therefore, a total of 18 cohorts were included in this systematic review. The characteristics of these studies are summarized in Table 1. Five of the included studies were performed in China (20,21,25,27,30), four in Turkey (8,12,19,26), two in Korea (28,29) and two in Japan (22,24). The remaining studies were performed in Hungary (13), Poland (31) and Germany (23). Chemiluminescent immunoassay (CMIA) developed by Architect was used in eight studies (8,12,13,23,26-28,31), and enzyme immunoassay (EIA) developed by Fujirebio was used in six studies (19-22,24,29). Two studies used electrochemiluminescence immunoassay (ECLIA) developed by Roche (25,30). The controls in the included studies varied, including healthy individuals (13,20,24,29-31), BLDs (12,23,28), healthy individuals and BLDs (8,19,22,25-27) and tuberculosis (21). Only one study was industry funded (28).
Figure S2 depicts the quality of the included studies. Generally, the quality of the included studies was poor. The patient selection and flow and timing domains of some included studies were rated as high risk of bias because healthy individuals were used as controls. The flow and timing domain of some studies was rated as unclear because partial verification bias was not reported. The reference standard domain of some studies was rated as unclear because the criteria used for lung cancer diagnosis were not reported.
Main findings of included studies and meta-analysis
Table 2 summarizes the main findings of the eligible studies. The AUCs of HE4 in the eligible studies ranged from 0.61 to 0.99. The thresholds used in the majority of the eligible studies were around 60 to 100 pmol/L. The sensitivities ranged from 0.12 to 0.90, and the specificities ranged from 0.57 to 1.00.
Figure 1 is a forest plot depicting the diagnostic accuracy of HE4 for lung cancer. The pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR) and diagnostic odds ratio (DOR) of HE4 were 0.65 (95% CI: 0.54–0.75), 0.88 (95% CI: 0.82–0.92), 5.3 (95% CI: 3.7–7.6), 0.40 (95% CI: 0.30–0.52) and 13 (95% CI: 8–21), respectively. Great variability (0.99, 95% CI: 0.98–0.99) was observed among eligible studies.
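As a rough consistency check on these summary measures, the likelihood ratios and the diagnostic odds ratio can be computed directly from a sensitivity and specificity pair. The sketch below is illustrative only: the function name is an assumption, and plugging in the pooled point estimates will not exactly reproduce the reported PLR and DOR, presumably because the bivariate model estimates them jointly rather than from the rounded point estimates.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR, DOR) for a single sensitivity/specificity pair."""
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                         # diagnostic odds ratio
    return plr, nlr, dor


# Pooled point estimates reported in the text: sensitivity 0.65, specificity 0.88.
plr, nlr, dor = likelihood_ratios(0.65, 0.88)
print(f"PLR={plr:.2f}, NLR={nlr:.2f}, DOR={dor:.1f}")
```

The values obtained this way (about 5.4, 0.40 and 13.6) are close to, but not identical with, the model-based estimates of 5.3, 0.40 and 13.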
Figure 2 is an sROC plot for HE4, with an AUC of 0.86 (95% CI: 0.82–0.88).
Because great variability was identified among the eligible studies and only 37% of it was likely due to the threshold effect, we performed a subgroup analysis. The results of the subgroup analysis are listed in Table 3. Sensitivity and specificity were not greatly affected by the HE4 assay or the participant source; however, they were greatly affected by the characteristics of the controls. Studies with healthy controls had clearly higher AUCs than those with BLD controls. In the subgroup using the EIA assay (Fujirebio), all of the variability could be explained by the threshold effect. In addition, in the subgroup with BLDs as controls, a large portion (83%) of the variability could be explained by the threshold effect. Taken together, these results indicate that the HE4 assay and the characteristics of the controls are potential sources of variability.
The funnel plot indicated no statistically significant publication bias (P=0.97, Figure 3).
The major findings of the present systematic review and meta-analysis are: (I) HE4 had moderate diagnostic accuracy for lung cancer, with a sensitivity of 0.65 (95% CI: 0.54–0.75), a specificity of 0.88 (95% CI: 0.82–0.92) and an AUC of 0.86 (95% CI: 0.82–0.88) at thresholds between 60 and 100 pmol/L; (II) the quality of the available studies was poor because of patient selection bias and partial verification bias; (III) there was no significant publication bias among the available studies.
To date, only one study has investigated the diagnostic accuracy of HE4 for lung cancer using meta-analysis (11). Compared with that study, our study has several strengths. First, the number of included studies and the overall sample size in our meta-analysis are larger; therefore, the statistical power of our study is higher. Second, we used a bivariate model to pool the diagnostic accuracy of HE4, whereas the previous study used a random-effects model with the Meta-Disc software (version 1.4). In the random-effects model, sensitivity and specificity are pooled separately and the trade-off between them is ignored (32). By contrast, the bivariate model uses the combination of sensitivity and specificity as the starting point of the analysis (16,33). Therefore, it represents a more reliable method to estimate the diagnostic accuracy of HE4. Third, we explored the sources of variability and found that the test assay and the characteristics of the controls were potential sources. Fourth, we performed a subgroup analysis and found that using healthy individuals as controls can bias the diagnostic accuracy of HE4.
Sensitivity and specificity are two important characteristics of an index test; however, they have two limitations. The first limitation is that they are greatly affected by the threshold used to define positive and negative results (34,35). By contrast, the AUC of the sROC curve is not affected by the threshold and thus represents a global measure of diagnostic accuracy (17,36). In this meta-analysis, the AUC of HE4 was 0.86 (95% CI: 0.82–0.88), indicating that HE4 has moderate diagnostic accuracy for lung cancer. The second limitation of sensitivity and specificity is that they are not easy to interpret. By contrast, PLR and NLR are considered more clinically meaningful because both pre-test and post-test probabilities are considered (34,37-39). A PLR >10 or an NLR <0.1 is considered to provide strong evidence to rule in or rule out a diagnosis, respectively (38). In this meta-analysis, we found that the PLR and NLR were 5.3 (95% CI: 3.7–7.6) and 0.40 (95% CI: 0.30–0.52), respectively. These results indicate that HE4, when used alone, is insufficient to rule in or rule out lung cancer, and that the serum HE4 concentration should be interpreted in parallel with other clinical findings.
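To make the link between likelihood ratios and pre-test and post-test probabilities concrete, the following Python sketch converts a pre-test probability into a post-test probability via odds. The function name and the 30% pre-test probability are hypothetical choices for illustration; only the PLR of 5.3 and the NLR of 0.40 come from this meta-analysis.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability (odds form of Bayes' rule)."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


pre = 0.30  # hypothetical pre-test probability of lung cancer
after_positive = post_test_probability(pre, 5.3)   # pooled PLR
after_negative = post_test_probability(pre, 0.40)  # pooled NLR
print(f"positive HE4: {after_positive:.2f}, negative HE4: {after_negative:.2f}")
```

Under this assumed pre-test probability, a positive result raises the probability to roughly 0.69 and a negative result lowers it to roughly 0.15, neither of which is decisive on its own.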
Currently, the diagnosis and classification of lung cancer are based on biopsy guided by thoracoscopy, bronchoscopy or CT. The major limitation of biopsy is that it can cause complications such as infection and bleeding. Therefore, the potential benefits and harms of biopsy should be fully considered before it is performed. Previous studies have indicated that HE4 has moderate diagnostic accuracy for lung cancer. However, it should be noted that previous studies only reported the diagnostic characteristics (e.g., sensitivity, specificity, PLR and NLR) at a specific threshold. These characteristics, although widely used to measure the diagnostic accuracy of an index test, do not incorporate information on consequences. In recent years, decision curve analysis (DCA) (40,41) has been widely used to estimate the net benefit of a test for a target disease. To date, no study has used DCA to estimate the net benefit of HE4 testing for lung cancer. Therefore, further studies with DCA are needed to assess the net benefit of HE4 testing.
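For readers unfamiliar with DCA, the sketch below shows the standard net benefit calculation at a single threshold probability, in which false positives are weighted by the odds of the threshold probability. It is a minimal illustration under stated assumptions: the function names and the cohort (200 subjects, 60 with lung cancer, 39 true positives and 17 false positives) are hypothetical and do not come from any included study.

```python
def net_benefit(tp, fp, n, threshold_probability):
    """Net benefit = TP/n - FP/n * pt / (1 - pt), with pt the threshold probability."""
    weight = threshold_probability / (1 - threshold_probability)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(n_cases, n, threshold_probability):
    """Reference strategy: every subject is treated as test positive."""
    return net_benefit(n_cases, n - n_cases, n, threshold_probability)


# Hypothetical cohort: 200 subjects, 60 with lung cancer; the marker yields 39 TP and 17 FP.
pt = 0.20  # hypothetical threshold probability at which biopsy would be offered
print(net_benefit(39, 17, 200, pt))          # marker-guided strategy
print(net_benefit_treat_all(60, 200, pt))    # biopsy-all strategy
```

A full decision curve repeats this calculation over a range of threshold probabilities and compares the marker-guided strategy with the treat-all and treat-none (net benefit of zero) strategies.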
The major limitation of this work is that a large portion of the included studies have design weaknesses, which might negatively affect the reliability of this meta-analysis. The major design weakness of the eligible studies was patient selection bias. None of the included studies reported pre-designed inclusion and exclusion criteria or whether the subjects were enrolled consecutively or randomly. In other words, all of the included studies were “two-gate” design studies (42). This type of study design may overestimate the diagnostic accuracy of the index test because the studied subjects only represent those who are easy to diagnose (43-45). Therefore, the conclusions of these studies should be generalized to other clinical settings with caution. Some diagnostic metrics, such as positive predictive value (PPV) and negative predictive value (NPV), are greatly affected by the prevalence of the target disease in the studied cohort (46). These metrics may not be generalizable to clinical practice unless the inclusion and exclusion criteria are clearly defined.
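The dependence of PPV and NPV on prevalence can be illustrated with a short Python sketch that holds sensitivity and specificity fixed at the pooled estimates and varies only the prevalence. The function name and the 5% and 30% prevalence values are assumptions chosen for illustration; 55% approximates the share of lung cancer cases among all included subjects (1,756 of 3,202).

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence of the target disease."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Pooled sensitivity 0.65 and specificity 0.88; prevalence values are illustrative.
for prevalence in (0.05, 0.30, 0.55):
    ppv, npv = predictive_values(0.65, 0.88, prevalence)
    print(f"prevalence={prevalence:.2f}: PPV={ppv:.2f}, NPV={npv:.2f}")
```

With identical sensitivity and specificity, the PPV rises and the NPV falls as prevalence increases, which is why predictive values from case-enriched cohorts cannot be carried over directly to settings with a lower prevalence.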
In conclusion, our meta-analysis reveals that HE4 seems to be a useful diagnostic marker for lung cancer. Because the currently available studies have design weaknesses, especially patient selection bias, further studies with rigorous design are needed to evaluate the diagnostic accuracy of HE4 for lung cancer.
Funding: This work was supported by a grant from the National Natural Science Foundation of China (Grant Number 81860501). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Conflicts of Interest: The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
1. Farago AF, Keane FK. Current standards for clinical management of small cell lung cancer. Transl Lung Cancer Res 2018;7:69-79.
2. Velcheti V, Pennell NA. Non-invasive diagnostic platforms in management of non-small cell lung cancer: opportunities and challenges. Ann Transl Med 2017;5:378.
3. Yang H, Gu Y, Chen C, et al. Diagnostic value of pro-gastrin-releasing peptide for small cell lung cancer: a meta-analysis. Clin Chem Lab Med 2011;49:1039-46.
4. Cui C, Sun X, Zhang J, et al. The value of serum Cyfra21-1 as a biomarker in the diagnosis of patients with non-small cell lung cancer: a meta-analysis. J Cancer Res Ther 2014;10 Suppl:C131-4.
5. Okamura K, Takayama K, Izumi M, et al. Diagnostic value of CEA and CYFRA 21-1 tumor markers in primary lung cancer. Lung Cancer 2013;80:45-9.
6. Qi W, Li X, Kang J. Advances in the study of serum tumor markers of lung cancer. J Cancer Res Ther 2014;10 Suppl:C95-101.
7. Du Q, Yan C, Wu SG, et al. Development and validation of a novel diagnostic nomogram model based on tumor markers for assessing cancer risk of pulmonary lesions: A multicenter study in Chinese population. Cancer Lett 2018;420:236-41.
8. Korkmaz ET, Koksal D, Aksu F, et al. Triple test with tumor markers CYFRA 21.1, HE4, and ProGRP might contribute to diagnosis and subtyping of lung cancer. Clin Biochem 2018;58:15-9.
9. Li F, Tie R, Chang K, et al. Does risk for ovarian malignancy algorithm excel human epididymis protein 4 and CA125 in predicting epithelial ovarian cancer: A meta-analysis. BMC Cancer 2012;12:258.
10. Ferraro S, Panteghini M. Making new biomarkers a reality: The case of serum human epididymis protein 4. Clin Chem Lab Med 2018. [Epub ahead of print].
11. Cheng D, Sun Y, He H. The diagnostic accuracy of HE4 in lung cancer: a meta-analysis. Dis Markers 2015;2015:352670.
12. Dikmen E, Gungor A, Dikmen ZG, et al. Diagnostic efficiency of HE4 and CYFRA 21-1 in patients with lung cancer. Int J Hematol Oncol 2015;25:44-50.
13. Nagy B, Bhattoa HP, Steiber Z, et al. Serum human epididymis protein 4 (HE4) as a tumor marker in men with lung cancer. Clin Chem Lab Med 2014;52:1639-48.
14. McInnes MDF, Moher D, Thombs BD, et al. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement. JAMA 2018;319:388-96.
15. Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529-36.
16. Reitsma JB, Glas AS, Rutjes AW, et al. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol 2005;58:982-90.
17. Walter SD. Properties of the summary receiver operating characteristic (SROC) curve for diagnostic test data. Stat Med 2002;21:1237-56.
18. Deeks JJ, Macaskill P, Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol 2005;58:882-93.
19. Ucar EY, Ozkaya AL, Araz O, et al. Serum and bronchial aspiration fluid HE-4 levels in lung cancer. Tumour Biol 2014;35:8795-9.
20. Wang X, Fan Y, Wang J, et al. Evaluating the expression and diagnostic value of human epididymis protein 4 (HE4) in small cell lung cancer. Tumour Biol 2014;35:6847-53.
21. Liu W, Yang J, Chi PD, et al. Evaluating the clinical significance of serum HE4 levels in lung cancer and pulmonary tuberculosis. Int J Tuberc Lung Dis 2013;17:1346-53.
22. Yamashita S, Tokuishi K, Moroga T, et al. Serum level of HE4 is closely associated with pulmonary adenocarcinoma progression. Tumour Biol 2012;33:2365-70.
23. Hertlein L, Stieber P, Kirschenhofer A, et al. Human epididymis protein 4 (HE4) in benign and malignant diseases. Clin Chem Lab Med 2012;50:2181-8.
24. Iwahori K, Suzuki H, Kishi Y, et al. Serum HE4 as a diagnostic and prognostic marker for lung cancer. Tumour Biol 2012;33:1141-9.
25. Mo D, He F. Serum Human Epididymis Secretory Protein 4 (HE4) is a Potential Prognostic Biomarker in Non-Small Cell Lung Cancer. Clin Lab 2018;64:1421-8.
26. Kumbasar U, Dikmen ZG, Yilmaz Y, et al. Serum Human Epididymis Protein 4 (HE4) As A Diagnostic and Follow-Up Biomarker in Patients With Non-Small Cell Lung Cancer. Int J Hematol Oncol 2017;27:137-42.
27. Huang W, Wu S, Lin Z, et al. Evaluation of HE4 in the Diagnosis and Follow Up of Non-Small Cell Lung Cancers. Clin Lab 2017;63:461-7.
28. Choi SI, Jang MA, Jeon BR, et al. Clinical Usefulness of Human Epididymis Protein 4 in Lung Cancer. Ann Lab Med 2017;37:526-30.
29. Yoon HI, Kwon OR, Kang KN, et al. Diagnostic Value of Combining Tumor and Inflammatory Markers in Lung Cancer. J Cancer Prev 2016;21:187-93. Erratum in: J Cancer Prev 2016.
30. Zeng Q, Liu M, Zhou N, et al. Serum human epididymis protein 4 (HE4) may be a better tumor marker in early lung cancer. Clin Chim Acta 2016;455:102-6.
31. Wojcik E, Tarapacz J, Rychlik U, et al. Human Epididymis Protein 4 (HE4) in Patients with Small-Cell Lung Cancer. Clin Lab 2016;62:1625-32.
32. Zamora J, Abraira V, Muriel A, et al. Meta-DiSc: a software for meta-analysis of test accuracy data. BMC Med Res Methodol 2006;6:31.
33. Hu ZD, Wei TT, Yang M, et al. Diagnostic value of osteopontin in ovarian cancer: Meta-analysis and systematic review. PLoS One 2015;10:e0126444.
34. Linnet K, Bossuyt PM, Moons KG, et al. Quantifying the Accuracy of a Diagnostic Test or Marker. Clin Chem 2012;58:1292-301.
35. Dickie GL. Statistical notes. Defining sensitivity and specificity. BMJ 1994;309:539.
36. Reitsma JB, Moons KG, Bossuyt PM, et al. Systematic Reviews of Studies Quantifying the Accuracy of Diagnostic Tests and Markers. Clin Chem 2012;58:1534-45.
37. Zhou Q, Ye ZJ, Su Y, et al. Diagnostic value of N-terminal pro-brain natriuretic peptide for pleural effusion due to heart failure: a meta-analysis. Heart 2010;96:1207-11.
38. Deeks JJ, Altman DG. Diagnostic tests 4: likelihood ratios. BMJ 2004;329:168-9.
39. Altman DG, Bland JM. Diagnostic tests. 1: Sensitivity and specificity. BMJ 1994;308:1552.
40. Vickers AJ, Cronin AM, Elkin EB, et al. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med Inform Decis Mak 2008;8:53.
41. Zhang Z, Rousson V, Lee WC, et al. Decision curve analysis: a technical note. Ann Transl Med 2018;6:308.
42. Rutjes AW, Reitsma JB, Vandenbroucke JP, et al. Case-control and two-gate designs in diagnostic accuracy studies. Clin Chem 2005;51:1335-41.
43. Lijmer JG, Mol BW, Heisterkamp S, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282:1061-6.
44. Schmidt RL, Factor RE. Understanding sources of bias in diagnostic accuracy studies. Arch Pathol Lab Med 2013;137:558-65.
45. Whiting P, Rutjes AW, Reitsma JB, et al. Sources of variation and bias in studies of diagnostic accuracy: a systematic review. Ann Intern Med 2004;140:189-202.
46. Hu ZD. Circulating biomarker for malignant pleural mesothelioma diagnosis: pay attention to study design. J Thorac Dis 2016;8:2674-6.