Dai Jun-cheng, Hu Zhi-bin, Chen Yi-jiang, Xu Ling, Ma Hong-xia, Jin Guang-fu, Shen Hong-bing
Owing to the continuing increase in the prevalence of smoking, the incidence of lung cancer in China has risen significantly in both urban and rural areas over the last two decades[1-2]. Approximately 90% of lung cancer cases have been attributed to tobacco smoking[3]; however, only a small fraction of smokers (usually <20%) develop lung cancer, which suggests that individual susceptibility plays a role.
DNA repair, a central defense system by which cells cope with different types of DNA damage, maintains the integrity of the human genome, and faulty DNA repair contributes to individual cancer susceptibility[4]. Among the DNA repair pathways, nucleotide excision repair (NER) is the major pathway for removing DNA damage caused by tobacco smoke, and it deals with a wide class of helix-distorting lesions that interfere with base pairing and obstruct replication and transcription[5-6]. It has been suggested that genetic polymorphisms in NER genes may be the molecular mechanism underlying the inter-individual variation in DNA repair capacity (DRC) that is associated with the risk of smoking-related lung cancer[7-8].
There is accumulating evidence that polymorphisms in the NER genes may contribute to genetic susceptibility to lung cancer[9-10]. However, most previous studies analyzed a single locus or a single gene and lacked statistical power because of the limited sample size of the study populations. To comprehensively investigate the roles of single nucleotide polymorphisms (SNPs) in the NER pathway in the development of smoking-related lung cancer, we conducted a case-control study of 1 010 incident lung cancer cases and 1 011 cancer-free controls in a Chinese population.
2.1 Study Population
Detailed descriptions of the study design and subject recruitment have been provided elsewhere[11]. Briefly, a total of 1 299 patients with histopathologically confirmed lung cancer were identified in the hospitals during the study period, of whom 1 046 consented to participate in the study and to provide blood samples, giving a response rate of 80.5% (1 046/1 299). The control subjects were outpatients with diseases other than cancer who attended other departments of the same hospital during the period in which the cases were recruited. All control subjects were frequency-matched to the cases on age (±5 years), sex, and residential area (urban or rural), and the response rate was 85.6% (1 065/1 244).
The study was approved by the institutional review boards of Nanjing Medical University, Fudan University, and Tongji Medical College of Huazhong University of Science and Technology.
2.2 Polymorphisms Selection
A greedy algorithm was used to choose the tagSNPs, given a minimal linkage disequilibrium (LD) r2 threshold of 0.1, according to the SNP density in the different genes[12]. Based on the Environmental Genome Project (EGP) database [http://egp.gs.washington.edu finished_genes.html], 38 tagSNPs in the 8 candidate NER genes were selected (ERCC1, ERCC2/XPD, ERCC3/XPB, ERCC4/XPF, ERCC5/XPG, DDB2, XPA, XPC; Table 2). In addition, we incorporated into the current association study two newly identified variants from our previous SNP screening project in the Chinese population (ERCC1-17172 and XPA-1796). Therefore, a total of 40 polymorphisms in 8 candidate NER genes were genotyped in 1 046 lung cancer patients and 1 065 cancer-free controls.
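The greedy selection mentioned above can be sketched as follows. This is a minimal illustration in the spirit of the LD-binning approach of reference [12], not the authors' implementation: the function name, the SNP labels, the pairwise r2 values and the threshold used in the demonstration are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_tag_snps(snps: List[str],
                    r2: Dict[FrozenSet[str], float],
                    threshold: float) -> List[str]:
    """Greedily pick tag SNPs until every SNP is tagged at r2 >= threshold."""

    def linked(a: str, b: str) -> bool:
        # A SNP always tags itself; missing pairs are treated as r2 = 0.
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    remaining: Set[str] = set(snps)
    tags: List[str] = []
    while remaining:
        # Candidate order follows the input list, so ties are deterministic.
        candidates = [s for s in snps if s in remaining]
        best = max(candidates,
                   key=lambda s: sum(linked(s, t) for t in remaining))
        tags.append(best)
        # Everything tagged by the chosen SNP leaves the pool.
        remaining -= {t for t in remaining if linked(best, t)}
    return tags


# Hypothetical pairwise r2 values for four made-up SNPs.
demo_snps = ["snpA", "snpB", "snpC", "snpD"]
demo_r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpA", "snpC")): 0.40,
}
demo_tags = greedy_tag_snps(demo_snps, demo_r2, threshold=0.8)
print(demo_tags)
```

In this toy input, snpB covers snpA and snpC, and snpD has to tag itself because it is not in LD with any other SNP.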
2.3 Laboratory Assays
Genotyping was performed with the 5'-nuclease (TaqMan) assay on the ABI PRISM 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA) in 384-well format at the Chinese National Human Genome Center at Shanghai, China. The TaqMan primers and probes were designed with the Primer Express Oligo Design software v2.0 (ABI PRISM) and are available upon request. PCR was carried out in a reaction volume of 5 μL containing 5 ng DNA, 2.5 μL 2× TaqMan Universal PCR Master Mix, No AmpErase UNG (Applied Biosystems), and 0.083 μL 40× Assay Mix. The PCR conditions were 95℃ for 10 minutes, followed by 20 cycles of 15 seconds at 92℃ and 1 minute at 60℃, and then 30 cycles of 15 seconds at 89℃ and 1.5 minutes at 60℃. Two blank controls (water) and two duplicated samples in each 384-well plate were used for quality control. For each SNP, the signal intensities had to meet the criterion of three clear clusters on the two scales generated by the SDS software (ABI). Samples from 36 cases and 54 controls failed genotyping at most of the loci because of poor DNA quality and were excluded from further analyses. Therefore, 1 010 lung cancer cases and 1 011 controls were included in the final analyses.
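The sample flow reported in Sections 2.1 and 2.3 can be checked with a few lines of arithmetic. The variable names below are illustrative; the counts are those given in the text.

```python
# Counts as stated in the text.
eligible_cases, consented_cases = 1299, 1046
eligible_controls, consented_controls = 1244, 1065
failed_cases, failed_controls = 36, 54  # excluded for poor DNA quality

case_response = consented_cases / eligible_cases
control_response = consented_controls / eligible_controls
final_cases = consented_cases - failed_cases
final_controls = consented_controls - failed_controls

print(f"{case_response:.1%} {control_response:.1%} {final_cases} {final_controls}")
# prints "80.5% 85.6% 1010 1011"
```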
2.4 Statistical Analyses
Differences in selected demographic variables, smoking status, pack-years smoked and family history of cancer between the cases and controls were evaluated with the χ2 test. The associations between variants in the NER genes and lung cancer risk were estimated by computing odds ratios (ORs) and 95% confidence intervals (CIs) from multivariate logistic regression analyses with adjustment for age, sex, pack-years of smoking and family history of cancer. Stepwise regression procedures were used to construct the final logistic regression model from all significant predictors identified in the single locus analysis. For the combined analyses, we grouped all risk alleles from each gene into a new variable according to the number of risk alleles (where the minor allele was protective, we took the common allele as the risk allele). We used the risk allele categories of 0, 1-5, 6-7 and ≥8 for the genes with 5 or more SNPs (i.e., ERCC1, ERCC2, ERCC5, and XPC) and 0, 1-2, and ≥3 for the genes with 2-3 SNPs (i.e., ERCC3, ERCC4, DDB2, and XPA). All these categories were coded as dummy variables and evaluated by multivariate logistic regression analyses. A new variable combining all NER genes, representing the risk allele level of the pathway, was created by summing the 20 dummy variables derived from the 8 NER genes described above and was categorized as 0-10 (ref.), 11, 12 and ≥13; it was used to evaluate the dose-response relationship and the joint effect with cumulative smoking. All statistical analyses were performed with Statistical Analysis System software (v.8.0e; SAS Institute, Cary, NC).
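For reference, the adjusted ORs described above follow from the fitted logistic regression coefficients in the usual way. The notation below is ours, and the use of Wald-type intervals is an assumption, since the text does not state how the CIs were computed. Here G denotes the genotype indicator for a given locus and model.

```latex
\[
\operatorname{logit} P(\text{case}) = \beta_0 + \beta_G\,G
  + \beta_1\,\text{age} + \beta_2\,\text{sex}
  + \beta_3\,\text{pack-years} + \beta_4\,\text{family history}
\]
\[
\mathrm{OR}_G = e^{\hat\beta_G}, \qquad
95\%\ \mathrm{CI} = \exp\!\bigl(\hat\beta_G \pm 1.96\,\mathrm{SE}(\hat\beta_G)\bigr)
\]
```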
The distributions of selected characteristics of the lung cancer patients and controls are summarized in Table 1. There was no significant difference in the distributions of age and sex between the cases and the controls, suggesting that our frequency matching was adequate. However, the controls were more likely to be non-smokers (47.7%) than were the cases (30.2%), and more cases (45.1%) than controls (25.5%) had smoked 30 or more pack-years. These differences were statistically significant (P<0.000 1). Furthermore, 17.1% of the lung cancer cases reported a family history of cancer in their first-degree relatives, which was significantly higher than the proportion among the controls (12.8%), and this difference corresponded to a significant 41% increase in lung cancer risk (OR=1.41, 95% CI=1.10~1.81). Among the 1 010 cancer patients, 430 (42.6%) had adenocarcinoma, 335 (33.2%) squamous cell carcinoma, 65 (6.4%) small cell carcinoma and 180 (17.8%) large cell, mixed cell or undifferentiated carcinoma.
Table 1 Distribution of selected variables in lung cancer cases and cancer-free controls
Variable                     Cases (n=1010)      Controls (n=1011)    P value*
                             No.       %         No.       %
Age (years)                                                           0.9824
  ≤60                        500       49.5      500       49.5
  >60                        510       50.5      511       50.5
Sex                                                                   0.3037
  Male                       777       76.9      758       75.0
  Female                     233       23.1      253       25.0
Smoking status(a)                                                     <0.0001
  Non-smokers                304       30.2      482       47.7
  Former smokers             306       30.4      161       15.9
  Current smokers            398       39.5      368       36.4
Pack-years of smoking(b)                                              <0.0001
  0                          304       30.4      482       47.7
  1-29                       245       24.5      271       26.8
  30+                        450       45.1      257       25.5
Family history of cancer                                              0.0059
  No                         837       82.9      882       87.2
  Yes                        173       17.1      129       12.8
Histological types
  Adenocarcinomas            430       42.6
  Squamous cell              335       33.2
  Small cell                 65        6.4
  Other carcinomas※          180       17.8
(a) 2 missing values in cases; (b) 11 missing values in cases, 1 missing value in controls.
* Two-sided χ2 test. ※ Other carcinomas included large cell, mixed cell or undifferentiated carcinoma.
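As a worked check on Table 1, the sketch below recomputes the χ2 P values for sex and family history and the unadjusted OR for family history from the tabulated counts. The helper names are ours, and the Pearson χ2 without continuity correction and the Wald interval are assumptions. Under those assumptions the unadjusted OR and CI for family history agree with the reported values (OR=1.41, 95% CI=1.10~1.81).

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) and P for a 2x2 table.

    Rows are cases (a, b) and controls (c, d).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def crude_or(a: int, b: int, c: int, d: int):
    """Unadjusted OR with a Wald 95% CI.

    a, b = exposed and unexposed cases; c, d = exposed and unexposed controls.
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - 1.96 * se)
    upper = math.exp(math.log(or_) + 1.96 * se)
    return or_, lower, upper


# Sex: male/female counts in cases and controls (Table 1).
chi2_sex, p_sex = chi2_2x2(777, 233, 758, 253)

# Family history of cancer: yes/no counts in cases and controls (Table 1).
chi2_fh, p_fh = chi2_2x2(173, 837, 129, 882)
or_fh, lo_fh, hi_fh = crude_or(173, 837, 129, 882)

print(f"sex: P={p_sex:.4f}")
print(f"family history: P={p_fh:.4f}, OR={or_fh:.2f} ({lo_fh:.2f}~{hi_fh:.2f})")
```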
Of the 40 selected SNPs genotyped in all patients and controls, 4 were nonsynonymous SNPs (nsSNPs) and 36 were non-coding variants (tagSNPs) (Table 2). All genotype distributions in the control subjects were consistent with those expected under Hardy-Weinberg equilibrium, except for four SNPs (i.e., rs4150416 in ERCC3, rs3136099 in ERCC4, rs3176639 in XPA, and rs1126547 in XPC), which were excluded from subsequent analyses (data not shown). As shown in Table 2, about half of the SNPs in this study population had a minor allele frequency (MAF) that was 10% greater or lesser than that reported in the Environmental Genome Project SNP database [http://egp.gs.washington.edu/directory.html], which may reflect either ethnic differences or frequency bias due to the small sample sizes from which the database was derived.
Table 2 Primary information of selected SNPs of 8 core genes in the NER pathway
* AA: amino acid. ※ MAF: minor allele frequency. EGP: US Environmental Genome Project SNPs database (see http://egp.gs.washington.edu/directory.html). NA: not available in NCBI or in the EGP database, because these 2 SNPs were newly identified by the authors in the Chinese population.
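The Hardy-Weinberg check applied to the control genotypes can be illustrated with a χ2 goodness-of-fit test. This is a sketch only: the text does not state which test was used, the function name is ours, and the genotype counts in the example are hypothetical, since the per-SNP counts are not shown.

```python
import math


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square goodness-of-fit test (1 df) of genotype counts against HWE.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts.
chi2_ok, p_ok = hwe_chi2(360, 480, 160)    # matches HWE proportions for p = 0.6
chi2_bad, p_bad = hwe_chi2(500, 200, 300)  # marked deficit of heterozygotes
print(f"{chi2_ok:.3f} {p_ok:.3f}")
print(f"{chi2_bad:.1f} {p_bad:.2e}")
```

A SNP whose control genotypes behave like the second example would be flagged and excluded, as was done for the four SNPs named above.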
As shown in Fig. 1, a total of 6 SNPs in four genes (i.e., ERCC1, DDB2, ERCC4 and XPC) were significantly associated with lung cancer risk in the single locus analysis. For example, in the dominant-effect model, significantly protective effects were associated with the variant allele-containing genotypes of rs1007616 in ERCC1 (adjusted OR=0.76; 95% CI=0.62~0.92), rs3212948 in ERCC1 (adjusted OR=0.76; 95% CI=0.63~0.91), rs3136038 in ERCC4 (adjusted OR=0.83; 95% CI=0.69~1.00) and rs3731055 in XPC (adjusted OR=0.83; 95% CI=0.69~1.00), whereas elevated risks were associated with those of rs830083 in DDB2 (adjusted OR=1.32; 95% CI=1.09~1.59) and rs3781620 in DDB2 (adjusted OR=1.27; 95% CI=1.05~1.53) (Fig. 1).
Fig.1 Logistic regression analysis of the associations between the genotypes of selected SNPs of the NER genes and risk of lung cancer in both the dominant model (first line for each locus) and the recessive model (second line). The ORs and 95% CIs were adjusted for age, sex, pack-years of smoking and family history of cancer.
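The two genetic models in Fig. 1 differ only in how a genotype is turned into a binary exposure before it enters the logistic regression. The sketch below shows the conventional coding, which is consistent with the "variant allele-containing genotypes" wording above. The function name and the 0/1/2 genotype encoding are our own assumptions.

```python
def code_genotype(n_variant_alleles: int, model: str) -> int:
    """Map a genotype (0, 1 or 2 variant alleles) to a 0/1 exposure indicator."""
    if n_variant_alleles not in (0, 1, 2):
        raise ValueError("genotype must carry 0, 1 or 2 variant alleles")
    if model == "dominant":
        # Heterozygotes and variant homozygotes vs. wild-type homozygotes.
        return int(n_variant_alleles >= 1)
    if model == "recessive":
        # Variant homozygotes vs. all other genotypes.
        return int(n_variant_alleles == 2)
    raise ValueError(f"unknown model: {model}")


genotypes = [0, 1, 2]
dominant = [code_genotype(g, "dominant") for g in genotypes]
recessive = [code_genotype(g, "recessive") for g in genotypes]
print(dominant, recessive)  # prints "[0, 1, 1] [0, 0, 1]"
```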
In the multivariate analyses, the 6 SNPs that were statistically significant in the single locus analysis were subjected to stepwise logistic regression procedures with adjustment for age, sex, pack-years of smoking and family history of cancer. Both the forward and the backward stepwise procedures showed that 3 SNPs (ERCC1 rs3212948, DDB2 rs830083 and ERCC4 rs3136038) remained significant predictors of lung cancer risk. As covariates in the same model, pack-years of smoking and family history of cancer were also main risk factors for lung cancer (Table 3).
Table 3 ORs and 95%CIs for the SNPs included in the final logistic regression model by the forward stepwise procedure
Table 4 The joint effect on lung cancer risk (OR, 95%CI) by cumulative smoking and the NER combined variables
*The NER combined variables were the sum of 20 dummy variables from the 8 NER genes and were categorized as 0-10 (ref.) and 10-20.
#Adjusted for age, sex and family history of cancer in a logistic regression model.
The effects of the combined risk alleles of each gene and of all 8 core NER genes are shown in Fig. 2. Compared with the reference group without risk alleles, a significant allele dose-response effect on lung cancer risk was observed for ERCC1 (Ptrend<0.0001), ERCC2 (Ptrend=0.0012), ERCC3 (Ptrend=0.0088), ERCC5 (Ptrend=0.0178), XPA (Ptrend=0.0010), and XPC (Ptrend=0.0233), but not for ERCC4 (Ptrend=0.0995) or DDB2 (Ptrend=0.2789) (Fig. 2). For the four levels (i.e., 0-10, 11, 12 and ≥13) of the combined NER categorical variable derived from the sum of all 20 dummy variables of the 8 core NER genes, the risk of lung cancer increased significantly as the variable rose from 0-10 to ≥13 (Ptrend<0.000 1) (OR=1.54, 95% CI=1.25~1.90; OR=1.66, 95% CI=1.25~2.21; and OR=2.79, 95% CI=1.99~3.91 for the risk allele levels of 11, 12 and ≥13, respectively).
Fig.2 Logistic regression analyses of the associations between the numbers of combined risk alleles of each gene and all genes in the NER pathway and risk of lung cancer. The number of risk alleles was categorized as 0, 1-5, 6-7 and ≥8 for the genes with 5 or more SNPs (i.e., ERCC1, ERCC2, ERCC5, and XPC) and 0, 1-2 and ≥3 for the genes with 2-3 SNPs (i.e., ERCC3, ERCC4, DDB2, and XPA). The combined NER variable was the sum of 20 dummy variables from 8 NER genes and was categorized as 0-10 (ref.), 11, 12 and ≥13. The ORs and 95% CIs were adjusted for age, sex, pack-years of smoking and family history of cancer. P<0.05, P<0.01, P<0.001
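The per-gene categorization used in these combined analyses can be sketched as follows. The function and constant names are ours. The sketch covers only the stated cut-points and the count of dummy variables under reference-level coding (an assumption): four genes with three non-reference levels plus four genes with two give 4×3 + 4×2 = 20, which matches the 20 dummy variables mentioned in the text. It does not reproduce how those dummies were summed into the 0-10, 11, 12 and ≥13 pathway score, because the text does not spell that out.

```python
GENES_5_PLUS = ("ERCC1", "ERCC2", "ERCC5", "XPC")  # categories 0, 1-5, 6-7, >=8
GENES_2_TO_3 = ("ERCC3", "ERCC4", "DDB2", "XPA")   # categories 0, 1-2, >=3


def category_labels(gene: str):
    """Return (upper bounds, labels) for the risk-allele categories of a gene."""
    if gene in GENES_5_PLUS:
        return (0, 5, 7), ("0", "1-5", "6-7", ">=8")
    if gene in GENES_2_TO_3:
        return (0, 2), ("0", "1-2", ">=3")
    raise ValueError(f"unknown gene: {gene}")


def categorize(gene: str, n_risk_alleles: int) -> str:
    """Assign a per-gene risk-allele count to its category label."""
    if n_risk_alleles < 0:
        raise ValueError("risk-allele count cannot be negative")
    bounds, labels = category_labels(gene)
    for upper, label in zip(bounds, labels):
        if n_risk_alleles <= upper:
            return label
    return labels[-1]


def n_dummies(gene: str) -> int:
    """Number of dummy variables for a gene: one per non-reference category."""
    _, labels = category_labels(gene)
    return len(labels) - 1


total_dummies = sum(n_dummies(g) for g in GENES_5_PLUS + GENES_2_TO_3)
print(categorize("ERCC1", 6), categorize("XPA", 3), total_dummies)
# prints "6-7 >=3 20"
```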
In this case-control study, we found that ERCC1 rs3212948, DDB2 rs830083 and ERCC4 rs3136038 were significant predictors of lung cancer risk. In the analyses of combined alleles for each gene, the risk of lung cancer increased significantly as the number of risk alleles increased for ERCC1, ERCC2, ERCC3, ERCC5, XPA, and XPC, and for the combined variable of the 8 core NER genes. These findings indicate that some representative tagging SNPs of the NER genes and their surrounding regions may be susceptibility biomarkers for lung cancer risk, which warrants further validation. The findings also suggest that a pathway-based analysis of multiple polymorphisms in the same gene or pathway is likely to provide a better picture of the underlying mechanisms of certain variants and to reduce the possibility of false negative results.
Biologically, the 15.3-kb ERCC1 gene codes for a 5' incision subunit of the NER complex. Damage-specific DNA binding (DDB) activity is one of the functions of a heterodimer of p127 (DDB1) and p48 (DDB2) purified from HeLa cells[13], and mutations in the DDB2 gene disrupt DDB activity in a subset of XPE cells[14]. In addition, there was a dose-response relationship between the number of putative risk alleles and lung cancer risk for XPA, XPC and ERCC5. Both XPA and XPC have been suggested to be initial DNA damage recognition factors, and the XPA-RPA complex also serves to verify various DNA lesions[15]. Mutations in the XPG (ERCC5) gene result in the XP or the combined XP-Cockayne syndrome (CS) phenotype, and patients with the XP-CS complex exhibit developmental retardation, dwarfism, severe neurologic abnormalities, and sun sensitivity[16-17].
Associations between genetic variants in the NER genes and susceptibility to cancer have been studied extensively at many cancer sites, including the lung, but the results are conflicting and give an incomplete view of the effects of these variants[10]. The nsSNPs of the ERCC2/XPD gene are among the most commonly investigated SNPs in lung cancer association studies[18-19]. In meta-analyses, elevated risks associated with the variant alleles of ERCC2 312Asn and 751Gln were confined to the fixed-effects model and were not seen in the random-effects model[20-21]. In our case-control study, we also observed an elevated risk associated with the variant alleles of ERCC2 312Asn and 751Gln and with their combined risk alleles; however, the associations for these two single loci were not statistically significant, indicating the low penetrance of this gene in this Chinese population.
The His1104Asp polymorphism in ERCC5, the poly (AT) (PAT) and Lys939Gln polymorphisms in XPC, and the A8092C polymorphism in ERCC1 have also been investigated for their associations with the risk of several cancers[22-29]. Overall, these association studies were based on a single SNP or a few available SNPs in one gene or a limited number of genes, which led to inconsistent results, particularly in studies with relatively small sample sizes.
In conclusion, our study used a pathway-based candidate gene approach to evaluate the associations of SNPs in 8 core NER genes with lung cancer risk in a case-control study. The findings provide further evidence to support the important role of genetic susceptibility conferred by variants of the NER pathway in the etiology of lung cancer. Validation of these findings in larger studies of other populations is needed, and further studies with functional evaluation of the SNPs are warranted to extend the significance of these findings.
[1] Zhang H, Cai B. The impact of tobacco on lung health in China[J]. Respirology,2003,8(1): 17-21.
[2] Yang L, Parkin DM, Li L, et al. Time trends in cancer mortality in China: 1987-1999[J]. Int J Cancer,2003,106(5): 771-783.
[3] Parkin DM, Pisani P, Lopez AD, et al. At least one in seven cases of cancer is caused by smoking. Global estimates for 1985[J]. Int J Cancer,1994,59(4): 494-504.
[4] Friedberg EC. DNA damage and repair[J]. Nature,2003,421(6921): 436-440.
[5] Hoeijmakers JH. Genome maintenance mechanisms for preventing cancer[J]. Nature,2001,411(6835): 366-374.
[6] Friedberg EC. How nucleotide excision repair protects against cancer[J]. Nat Rev Cancer,2001,1(1): 22-33.
[7] Spitz MR, Wu X, Wang Y, et al. Modulation of nucleotide excision repair capacity by XPD polymorphisms in lung cancer patients[J]. Cancer Res,2001,61(4): 1354-1357.
[8] Spitz MR, Wei Q, Dong Q, et al. Genetic susceptibility to lung cancer: the role of DNA damage and repair[J]. Cancer Epidemiol Biomarkers Prev,2003,12(8): 689-698.
[9] Kiyohara C, Otsu A, Shirakawa T, et al. Genetic polymorphisms and lung cancer susceptibility: a review[J]. Lung Cancer,2002,37(3): 241-256.
[10] Goode EL, Ulrich CM, Potter JD. Polymorphisms in DNA repair genes and associations with cancer risk[J]. Cancer Epidemiol Biomarkers Prev,2002,11(12): 1513-1530.
[11] Hu Z, Shao M, Yuan J, et al. Polymorphisms in DNA damage binding protein 2 (DDB2) and susceptibility of primary lung cancer in the Chinese: a case-control study[J]. Carcinogenesis,2006,27(7): 1475-1480.
[12] Carlson CS, Eberle MA, Rieder MJ, et al. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium[J]. Am J Hum Genet,2004,74(1): 106-120.
[13] Nichols AF, Itoh T, Graham JA, et al. Human damage-specific DNA-binding protein p48. Characterization of XPE mutations and regulation following UV irradiation[J]. J Biol Chem,2000,275(28): 21422-21428.
[14] Hwang BJ, Ford JM, Hanawalt PC, et al. Expression of the p48 xeroderma pigmentosum gene is p53-dependent and is involved in global genomic repair[J]. Proc Natl Acad Sci USA,1999,96(2): 424-428.
[15] Thoma BS, Vasquez KM. Critical DNA damage recognition functions of XPC-hHR23B and XPA-RPA in nucleotide excision repair[J]. Mol Carcinog,2003,38(1): 1-13.
[16] Moriwaki S, Stefanini M, Lehmann AR, et al. DNA repair and ultraviolet mutagenesis in cells from a new patient with xeroderma pigmentosum group G and cockayne syndrome resemble xeroderma pigmentosum cells[J]. J Invest Dermatol,1996,107(4): 647-653.
[17] Rapin I, Lindenbaum Y, Dickson DW, et al. Cockayne syndrome and xeroderma pigmentosum[J]. Neurology,2000,55(10): 1442-1449.
[18] Hu Z, Wei Q, Wang X, et al. DNA repair gene XPD polymorphism and lung cancer risk: a meta-analysis[J]. Lung Cancer,2004,46(1): 1-10.
[19] Benhamou S, Sarasin A. ERCC2/XPD gene polymorphisms and lung cancer: a HuGE review[J]. Am J Epidemiol,2005,161(1): 1-14.
[20] Sanyal S, Festa F, Sakano S, et al. Polymorphisms in DNA repair and metabolic genes in bladder cancer[J]. Carcinogenesis,2004,25(5): 729-734.
[21] Jeon HS, Kim KM, Park SH, et al. Relationship between XPG codon 1104 polymorphism and risk of primary lung cancer[J]. Carcinogenesis,2003,24(10): 1677-1681.
[22] Kumar R, Höglund L, Zhao C, et al. Single nucleotide polymorphisms in the XPG gene: determination of role in DNA repair and breast cancer risk[J]. Int J Cancer,2003,103(5): 671-675.
[23] Shen H, Sturgis EM, Khan SG, et al. An intronic poly (AT) polymorphism of the DNA repair gene XPC and risk of squamous cell carcinoma of the head and neck: a case-control study[J]. Cancer Res,2001,61(8): 3321-3325.
[24] Marín MS, López-Cima MF, García-Castro L, et al. Poly (AT) polymorphism in intron 11 of the XPC DNA repair gene enhances the risk of lung cancer[J]. Cancer Epidemiol Biomarkers Prev,2004,13(11 pt 1): 1788-1793.
[25] Hu Z, Wang Y, Wang X, et al. DNA repair gene XPC genotypes/haplotypes and risk of lung cancer in a Chinese population[J]. Int J Cancer,2005,115(3): 478-483.
[26] Lee GY, Jang JS, Lee SY, et al. XPC polymorphisms and lung cancer risk[J]. Int J Cancer,2005,115(5): 807-813.
[27] Chen P, Wiencke J, Aldape K, et al. Association of an ERCC1 polymorphism with adult-onset glioma[J]. Cancer Epidemiol Biomarkers Prev,2000,9(8): 843-847.
[28] Sturgis EM, Dahlstrom KR, Spitz MR, et al. DNA repair gene ERCC1 and ERCC2/XPD polymorphisms and risk of squamous cell carcinoma of the head and neck[J]. Arch Otolaryngol Head Neck Surg,2002,128(9): 1084-1088.
[29] Zhou W, Liu G, Park S, et al. Gene-smoking interaction associations for the ERCC1 polymorphisms in the risk of lung cancer[J]. Cancer Epidemiol Biomarkers Prev,2005,14(2): 491-496.