Risk prediction models for lung cancer: Perspectives and dissemination

2019-05-25 07:49:16

Chinese Journal of Cancer Research, 2019, Issue 2

Wei Tang, Qin Peng, Yanzhang Lyu, Xiaoshuang Feng, Xin Li, Luopei Wei, Ni Li, Hongda Chen, Wanqing Chen, Min Dai, Ning Wu,3, Jiang Li, Yao Huang

1Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China;

2Office of Cancer Screening, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China;

3PET-CT Center, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China

Abstract

Objective: The objective was to systematically assess lung cancer risk prediction models by critical evaluation of methodology, transparency and validation in order to provide a direction for future model development.

Methods: English and Chinese electronic databases (PubMed, EMbase, the Cochrane Library, Web of Science, the China National Knowledge Infrastructure, Wanfang, the Chinese BioMedical Literature Database, and other official cancer websites) were searched through April 30, 2018. Main reported sources were input data, assumptions and sensitivity analysis. Model validation was based on statements in the publications regarding internal validation, external validation and/or cross-validation.

Results: Twenty-two studies (containing 11 multiple-use and 11 single-use models) were included. Original models were developed between 2003 and 2016. Most of these were from the United States. Multivariate logistic regression was widely used to identify a model. Across models, the area under the curve (AUC) ranged from 0.57 to 0.87 and the C statistic from 0.59 to 0.85. Six studies were validated by external validation and three were cross-validated. In total, two models had a high risk of bias. The most used variables were age and smoking duration (six models), and five models included family history of lung cancer.

Conclusions: The prediction accuracy of the models was high overall, indicating that it is feasible to use models for high-risk population prediction. However, the process of model development and reporting is not optimal, and the risk of bias is high. This risk affects prediction accuracy, influencing the promotion and further development of the model. In view of this, model developers need to be more attentive to bias risk control and validity verification in the development of models.

Keywords: Lung neoplasms; carcinoma, bronchogenic; risk assessment; models, theoretical

Introduction

Lung cancer is the most common cause of cancer death worldwide. In 2012, there were 1.82 million new cases, accounting for 12.9% of the total number of new cancers, and 1.56 million lung cancer deaths, with lung cancer responsible for nearly 1 in 5 cancer deaths (1). In Europe, lung cancer is the most common cause of cancer death in males (267,000 deaths, 24.8%) and the second most common cause of cancer death in females (121,000 deaths, 14.2%) (2). The National Lung Screening Trial (NLST) in the United States found a 20% relative reduction in lung cancer mortality among long-term, high-risk smokers who were screened with low-dose computed tomography (LDCT) (3). That trial suggests that screening, combined with sensitive risk models, may reduce lung cancer mortality. Hence, population screening for the early detection of lung cancer is an important part of current clinical research.

However, LDCT screening has disadvantages including radiation exposure, false positives and overdiagnosis. It is therefore essential to identify the most appropriate target population to maximize screening benefits and minimize adverse effects. By preliminary assessment, screening programs for high-risk groups will improve screening efficiency as well as reduce screening costs and resource waste. In fact, the success of any screening program is directly related to high-risk group assessment (4,5), which can be accomplished with lung cancer prediction models (6,7). To help define the target population for lung cancer screening, some models allow calculation of individual risk for lung cancer based on previous results (8). Model prediction can improve clinical intervention and post-care development, as well as guide the selection of screening populations to promote optimal use of resources. Since Bach's study (9), research has focused on predictive models of lung cancer. Current models have good sensitivity and specificity and are based on traditional variables, biomarkers, LDCT and data mining techniques. The objective of this study was to evaluate prediction models for lung cancer high-risk groups in order to provide a direction for further model development.

Materials and methods

Search strategies and eligibility criteria

A systematic literature search was performed with both English and Chinese databases including EMbase, PubMed, Web of Science, the Cochrane Library, Chinese BioMedical Literature Database (CBM), WanFang Data, and the China National Knowledge Infrastructure (CNKI). The search used a combination of MeSH subject terms and free words. Search terms included lung neoplasms, lung cancer, mass screening, early detection of cancer, risk factors, high-risk population, high-risk group, high-risky population, decision support techniques, prediction model and forecast model. A search strategy in PubMed is listed below as an example:

#1 “Lung Neoplasms”[MeSH] OR “Lung Neoplasms”[Title/Abstract] OR “lung cancer”[Title/Abstract]

#2 “Mass Screening”[MeSH] OR “Early Detection of Cancer”[MeSH] OR “Screening”[Title/Abstract]

#3 “high risk”[Title/Abstract]

#4 “decision support techniques”[MeSH] OR “prediction model”[Title/Abstract] OR “forecast model”[Title/Abstract]

#5 #1 AND #2 AND #3 AND #4
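The way step #5 combines the four sets can be sketched in a few lines of Python. This is only an illustration of the boolean structure: the helper names `or_block` and `combined_query` are hypothetical and not part of the original search protocol, and the terms are copied from sets #1 to #4 above.

```python
# Sketch: assemble the PubMed example strategy (#1 AND #2 AND #3 AND #4).
# Helper names are hypothetical; the terms come from the sets listed above.

SETS = {
    "#1": ['"Lung Neoplasms"[MeSH]',
           '"Lung Neoplasms"[Title/Abstract]',
           '"lung cancer"[Title/Abstract]'],
    "#2": ['"Mass Screening"[MeSH]',
           '"Early Detection of Cancer"[MeSH]',
           '"Screening"[Title/Abstract]'],
    "#3": ['"high risk"[Title/Abstract]'],
    "#4": ['"decision support techniques"[MeSH]',
           '"prediction model"[Title/Abstract]',
           '"forecast model"[Title/Abstract]'],
}


def or_block(terms):
    """Join the field-tagged terms of one set with OR, wrapped in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def combined_query(sets):
    """Reproduce step #5: AND together sets #1 to #4."""
    return " AND ".join(or_block(sets[key]) for key in ("#1", "#2", "#3", "#4"))


if __name__ == "__main__":
    print(combined_query(SETS))
```

Keeping each set as a separate list mirrors the numbered steps, so a single term can be added or removed without touching the rest of the query.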

The inclusion criteria were: 1) lung cancer screening; 2) high-risk population prediction model; and 3) reporting of validity, the model’s statistical method, etc. Literature exclusion criteria were: 1) non-Chinese, non-English, and documents without full text; 2) not related to lung cancer screening or early diagnosis of lung cancer; 3) repeated publications; 4) reviews and other secondary research literature; 5) conference summaries; or 6) patented technology.

Selection of eligible studies and data extraction

Two researchers independently conducted literature screening, data extraction and cross-checking. If disagreements occurred, the two researchers would discuss a solution or submit the disagreement to a third researcher for discussion. If information could not be extracted from an article, the researchers contacted the original author for clarification. When reading the literature, the researchers read the title and abstract first to exclude apparently unrelated literature, and then read the complete text to determine inclusion. Data extraction content mainly included: 1) basic information such as publication year, country or region, research design type, model’s statistical method, population information, modeling sample, area under the receiver-operating characteristic curve (AUC) and concordance index (C-index); 2) model transparency information, inclusion variables, expressions, limitations, financial support, conflicts of interest and validity evaluation methods; 3) model risk of bias, including blind method, data bias risk, sensitivity analysis of uncertain variables, whether the model was calibrated, and external validity; 4) variables included in each model: sociodemographic, exposure history, smoking history, medical history, family history and genetic risk factors; 5) model validity evaluation content including internal validity, cross-validity and external validity; and 6) basic information of single-use models.

Framework for qualitative assessment of multiple-use models

In this study, models were divided into multiple-use and single-use. The model description, transparency and risk of bias assessment were used for multiple-use models. Model descriptions included model publication date, country or region, study type, model’s statistical method, population information, modeling samples, model samples and model accuracy (AUC or C-index).

Transparency mainly evaluates the degree of disclosure of specific information by the model. Improving the transparency of the model promotes the use of the model by exposing the model development process, statistical methods, inclusion parameters, model structure and other pertinent information for the user (7). Herein, this study conducted a transparency evaluation of the inclusion variables, expressions, limitations, financial support and conflicts of interest for each model.

Validity directly reflects the accuracy of the model in realistic prediction and is also an important criterion for actual application of the model. This study evaluated the internal validity, cross-validity and external validity of the included models. Internal validation checks the soundness of the mathematical methods and the model during construction; through repeated training on the data, it guards against unintentional calculation errors and improves the internal accuracy of the model. Cross-validation identifies how different models solve the same problem. External validation compares the model with actual data and investigates its predictive accuracy. All of these forms of validity evaluation should be completed and compared (10,11).

Risk of bias assessment was based on the McGinn checklist (12) and the results of Jamie’s study (13). A checklist for model risk of bias assessment was developed with five dimensions: blinding between predictive factors and outcome evaluation, data bias risk, sensitivity analysis of uncertain variables, whether the model had been calibrated, and external validity. These five dimensions were used to evaluate the risk of bias for clinical prediction tools, and each study was rated as high, moderate, or low risk of bias. Studies with a high risk of bias had a fatal flaw that made their results very uncertain. Studies with a low risk of bias met all criteria, making their results more certain. Studies that did not meet all criteria but had no fatal flaw (thus making their results somewhat uncertain) were rated as having a moderate risk of bias (Table 1).
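The three-level rating rule can be written down as a short decision function. The sketch below is a reading of the rule as stated, not the authors' instrument: the criterion keys are hypothetical labels for the five checklist dimensions, and whether a study has a "fatal flaw" is passed in as a reviewer judgement because the text does not define it operationally.

```python
# Sketch of the high / moderate / low rating rule described above.
# Criterion names are hypothetical labels for the five checklist dimensions.

CRITERIA = (
    "blinding",
    "data_bias_risk",
    "sensitivity_analysis",
    "calibration",
    "external_validity",
)


def rate_risk_of_bias(criteria_met, has_fatal_flaw):
    """Return 'high', 'moderate' or 'low'.

    criteria_met: dict mapping each criterion name to True/False.
    has_fatal_flaw: reviewer judgement (an assumption of this sketch).
    """
    if has_fatal_flaw:
        return "high"        # a fatal flaw makes results very uncertain
    if all(criteria_met.get(name, False) for name in CRITERIA):
        return "low"         # every criterion met
    return "moderate"        # some criteria unmet, but no fatal flaw


if __name__ == "__main__":
    example = {name: True for name in CRITERIA}
    example["sensitivity_analysis"] = False
    print(rate_risk_of_bias(example, has_fatal_flaw=False))
```

A missing criterion is treated as unmet, which errs on the side of a less favorable rating.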

Results

Basic information

A total of 11 models that were used multiple times were included in this study (Figure 1). Three of those were derived versions. The earliest model was the Bach model published in 2003. The largest number of published models was from the United States, with the remaining from the United Kingdom and Canada. These models were based on case-control studies (six studies) and cohort studies (five studies). Statistically, most of the studies used logistic regression, while three of the models used Cox regression.

Two of the models included racial factors. The other models were mainly restricted by age and smoking history. The youngest individual was 20 years old and the oldest 80 years old. Most individuals were 50-75 years old. Smoking history was defined as never smoker, former smoker and current smoker (Table 2).

Modeling samples ranged from 594 to 70,962. The accuracy of the model was measured by AUC or C statistic. According to the summary results, the minimum AUC of each model was 0.57 and the largest was 0.87. The smallest C statistic was 0.59 and the largest was 0.85.
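For a binary outcome, the AUC and the C statistic measure the same thing: the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the toy data are illustrative assumptions, and for time-to-event models such as Cox regression the C-index additionally has to account for censoring, which this sketch does not do.

```python
# Sketch: AUC computed as the concordance probability over case-control pairs.
# Function name and example data are illustrative, not taken from the studies.

def auc_concordance(scores, labels):
    """scores: predicted risks; labels: 1 for lung cancer case, 0 for control."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0   # case ranked above control
            elif case_score == control_score:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))


if __name__ == "__main__":
    risks = [0.9, 0.8, 0.3, 0.6, 0.2]
    outcome = [1, 0, 1, 0, 0]
    print(round(auc_concordance(risks, outcome), 3))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which gives context for the reported range of 0.57 to 0.87.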

Table 1 Framework for quality assessment of multiple-use models

Figure 1 Flowchart of screening result.

Transparency

The models included in this study listed inclusion variables, but only two models listed the model’s expressions. The limitations of each model were primarily the uncommon population assessed by the model, the lack of good external validity verification and the inability of the model to assess an individual’s lung cancer risk. The model research was supported by national and regional projects, or by public welfare funds such as the Lung Cancer Foundation. Only four studies declared that they had no conflicts of interest; the other studies did not report on conflicts of interest. Six studies were validated through external validation and three were cross-validated (Table 3).

Risk of bias

Two of the included models had a high risk of bias and the remaining nine were of moderate risk. Sensitivity analysis of uncertain variables was not performed for all models, and only one model was blinded between predictive factors and outcome evaluation during development. It is worth noting that six models were calibrated after development, making the risk of bias moderate (Table 4).

Table 2 Characteristics of multiple-use models

Table 3 Transparency assessment of multiple-use models

Validity

Internal validation reuses the development data, performing repeated operations to verify the consistency of results. Three models were re-run with the bootstrap method, one study was re-verified using a partial sample, and one study used five similar research data sets to perform internal validation of the model. Regarding cross-validation, two articles used 10-fold and one article 3-fold cross-validation. Only six studies were externally validated. Sample size varied with a maximum of 44,233 cases and a minimum of 325 cases (Table 5).
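The two resampling schemes mentioned here, k-fold cross-validation and the bootstrap, differ only in how records are assigned to training and evaluation sets. The following sketch shows that index bookkeeping under simple assumptions (plain shuffling, no stratification by case status); the function names are hypothetical and no particular model from the reviewed studies is implied.

```python
# Sketch: index bookkeeping for k-fold cross-validation and bootstrap resampling.
# Names are hypothetical; fitting and scoring a model are left out on purpose.
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k near-equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n, k, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    folds = k_fold_indices(n, k, seed)
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def bootstrap_sample(n, seed=0):
    """Draw n indices with replacement, as in one bootstrap replicate."""
    rng = random.Random(seed)
    return [rng.randrange(n) for _ in range(n)]


if __name__ == "__main__":
    for train, test in train_test_splits(n=20, k=10):
        print(len(train), len(test))
    print(sorted(set(bootstrap_sample(20))))
```

In 10-fold cross-validation each record is used for evaluation once and for fitting nine times, whereas a bootstrap replicate typically leaves some records out and repeats others.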

Table 4 Risk of bias assessment of multiple-use models

Inclusion of variables

According to the statistical results, the variables included in the models fell into six categories: sociodemographic factors, exposure history, smoking history, medical history, family history and genetic risk factors. The most used variables were age and smoking duration (six models), and five models included family history of lung cancer (Table 6).

Single-use models

The single-use models were mostly from China, with two from the United States and one from Germany. The types of studies were either cohort or case-control, with most of the studies from China being case-control. Statistical methods were diverse. In addition to logistic and Cox regression analysis, data mining techniques such as artificial neural network, support vector machine, decision tree and Fisher discriminant analysis were employed. In addition to the above variables, tumor markers, gene loci and psychological factors emerged, providing a valuable reference for model prediction. A large amount of data was extracted from established samples, with the smallest sample comprising 114 cases. Prediction accuracy and validity evaluation were not disclosed by some studies (Table 7).

Table 5 Validation and samples of multiple-use models

Table 6 Variables of multiple-use models

Discussion

This study included 11 multiple-use models and 17 single-use models. Models used multiple times were developed in European and North American countries. In essence, a large number of models were based on large-scale national projects, such as the NLST (multicenter randomized controlled trial, 53,456 samples) (41), the Liverpool Lung Project (LLP, case-control study: 800 cases and 400 controls, cohort study: 7,500 samples) (42), and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO, multicenter randomized controlled trial, 74,000 samples) (43). These projects supported model development with a large quantity of detailed data. Most studies were case-control and cohort, which are convenient for model construction.

A model that can be used multiple times is also a model that can be updated. Four studies incorporated a model that was used to derive subsequent models, which were supplements and adjustments to the previous model. These updated models differ from the previous models. The difference between the previous and the updated version was the scope of the population even though the analysis was the same.

Since the development of the Bach model, many studies have focused on the form of predictive models. Predictive models have been highly valued by the academic community in recent years, and gradually, based on the Bach model, risk factor enrichment has increased. Some predictive models included parameters like tumor markers and genes, which have accelerated model development. Variables now include more basic information and family history, which eliminate the need for traditional factors when combined with single-use models. Through the use of new medical information technologies, the accuracy of models has improved.

Transparency is important to the promotion and application of models. Through dual disclosure of technical documents and non-technical articles, the user can understand the model’s developmental process, which provides application instruction and guidance (10). The multiple-use models included in this study have relatively good transparency, although most of the cited literature does not report the expressions of the model. The expression of the model has significance for model popularization. If the variables included in a model were reported, it would be possible for others to consider and weigh the importance of the variables in model prediction. In addition, some studies did not report relevant conflicts of interest, so the independence of the model cannot be ensured.

The existence of bias makes the accuracy of model prediction difficult to assess and can distort the importance of influence on prediction results. Bias can arise at many stages of model development, including research design, field survey, data entry and data analysis, which in turn affect the predictive accuracy of the model. There are many tools for bias evaluation, such as the Cochrane tool for randomized controlled trials (RCTs) (44), QUADAS for diagnostic test studies (45), the Newcastle-Ottawa Scale (NOS) for cohort studies and case-control studies (46), and AMSTAR for systematic reviews (47). The bias evaluation tool for model development is still immature. This study has developed a bias evaluation checklist based on related research, and found that the risk of bias in lung cancer prediction models is high. The main problems are the lack of a blinding method and of sensitivity analysis for uncertain variables. The absence of blinding may allow the subjective thinking of the researcher to interfere. Sensitivity analysis of uncertain variables is an important step in the refinement of the variables and the main method to improve the validity of the model. Lack of calibration increases the risk of bias in the model’s predictions.

Some models lack verification of external validity. Validation should be ongoing for a model (48). Conducting validation throughout the modeling process is essential in that mistakes can be found and corrected at an early stage of model development. Late validation leaves little time to remedy any issues. The likelihood of finding mistakes increases with the number of validation rounds, minimizing the chance that the model will contain serious errors. For all models, the validation process and its results should be reported. External verification works by comparing the model’s results with data derived from actual events. External validity is critical to model development in that the ultimate goal of the model is application to practice to ensure that the best choices are made (7). However, only six of the included studies were externally validated. Although the other studies performed validation (internal validation or cross-validation), these are not adequate for predictive models. A new evaluation model of 2 million high-risk individuals from the Cancer Screening in Urban China Program is being built based on this study. It will integrate analytics including validity, bias and other involved factors that will be applied to this future research project.

Table 7 Single-use models for lung cancer prediction

Conclusions

This study considers risk prediction models for high-risk lung cancer populations. It rigorously evaluated multiple-use models for transparency, risk of bias and variables. Various models have been developed for different types of populations and were used to predict lung cancer risk based on various conditions (e.g. age and smoking status). The prediction accuracy of the models was high overall, indicating that it is feasible to use models for high-risk population prediction. However, the process of model development and reporting is not optimal in that the models have a high risk of bias, affecting credibility and predictive accuracy, which influences the promotion and further development of the model. In view of this, model developers need to be more attentive to bias risk control and validity verification in the development of models.

Acknowledgements

This study is supported by National Key R&D Program of China (No. 2017YFC1308700), National Natural Science Foundation of China (No. 81602930), and Chinese Academy of Medical Sciences Initiative for Innovative Medicine (No. 2017-I2M-1-005).

Footnote

Conflicts of Interest: The authors have no conflicts of interest to declare.
