Meng-Dan Lan, Bi-Feng Yuan*, Yu-Qi Feng
Key Laboratory of Analytical Chemistry for Biology and Medicine (Ministry of Education), Department of Chemistry, Wuhan University, Wuhan 430072, China
Key words: Mass spectrometry analysis; Chemical derivatization; DNA modification; RNA modification
ABSTRACT A growing number of chemical modifications are being discovered in genomic DNA and RNA. To date, more than 150 chemical modifications have been identified in nucleic acids. These modifications do not change the sequence of DNA and RNA, but alter their structures and biochemical properties, and ultimately regulate the spatial and temporal expression of genes. Elucidating the functional roles of these modifications is vital to our understanding of living organisms. However, modifications in DNA and RNA generally occur at extremely low abundance in vivo. Sensitive and specific detection methods are therefore essential for deciphering their functional roles. Chemical derivatization in combination with mass spectrometry (MS) analysis has proved to be a promising strategy for the efficient analysis of these modifications. In the last several years, many chemical derivatization-MS-based analytical methods have been established for the sensitive and effective analysis of nucleic acid modifications. In this review, we summarize recent advances in deciphering modifications in DNA and RNA by chemical derivatization-MS analysis. We hope this review will stimulate future studies of DNA and RNA modifications.
In addition to the canonical nucleobases adenine (A), guanine (G), cytosine (C), thymine (T), and uracil (U), many chemical modifications have been identified in both DNA and RNA [1,2]. These chemical modifications do not change the sequence of DNA and RNA, but alter their structures and biochemical properties, and ultimately regulate the spatial and temporal expression of genes [3].
DNA cytosine methylation (5-methylcytosine, 5-mC) is the best-characterized epigenetic modification in genomic DNA [4]. 5-mC has been shown to be involved in diverse physiological functions [5]. In recent years, an increasing number of modifications have been discovered in genomic DNA. Recent studies showed that the ten-eleven translocation (TET) proteins are capable of oxidizing 5-mC to 5-hydroxymethylcytosine (5-hmC), 5-formylcytosine (5-foC), and finally 5-carboxylcytosine (5-caC) [6–8]. In addition to DNA cytosine methylation, DNA adenine methylation (N6-methyladenine, m6A) has also been discovered in eukaryotic cells [9,10]. Similar to 5-mC, these novel DNA modifications are considered to play critical roles in the regulation of multiple physiological processes [11,12].
RNA is the intermediate molecule that links genetic information from DNA to proteins. Given the important regulatory roles of DNA modifications, the recent discovery of reversible modifications on RNA has opened a new era of post-transcriptional gene regulation in eukaryotes [13]. RNA is particularly rich in modifications. Cellular RNA contains more than 150 structurally distinct modifications [14], many of which are considered to be dynamic and reversible and can fine-tune the structures and functions of RNA to influence gene expression [2,13].
Modifications in DNA and RNA generally occur at extremely low abundance in vivo [15,16]. For example, the content of 5-hmC in both DNA and RNA can be as low as several modifications per million nucleosides [17,18]. Sensitive and specific detection methods are therefore essential for dissecting the functional roles of these modifications. The development of new technologies has largely revolutionized the field of epigenetic modifications. Owing to its inherent sensitivity and selectivity, mass spectrometry (MS) has become one of the most prominent analytical techniques [19–21]. However, many DNA and RNA modifications cannot be analyzed favorably by MS, especially those of low abundance.
Fig. 1. Schematic diagram of the analysis of DNA and RNA modifications by the chemical derivatization-MS-based strategy. The blue circle, red pentagram and yellow pentagon represent the nucleobase, modification groups and derivatization reagents, respectively.
Chemical derivatization has proved to be a promising strategy for improving the detection performance of analytes in MS [22–24]. Chemical derivatization can alter the chemical and physical properties of analytes. Integrating chemical derivatization with MS analysis can improve the chromatographic separation and enhance the ionization efficiency during electrospray ionization (ESI)-MS analysis, which ultimately leads to improved detection performance [22]. In the last several years, our group and others have developed various methods for the sensitive and selective analysis of DNA and RNA modifications by the chemical derivatization-MS-based strategy (Fig. 1). In this review, we summarize recent advances in deciphering modifications in nucleic acids by chemical derivatization-MS analysis. We also discuss the merits and drawbacks of these methods and provide typical examples in which they were used to address biological questions.
Many DNA and RNA modifications cannot be readily observed by direct MS analysis, especially those of low abundance. In the last decade, the strategy of chemical derivatization in combination with MS analysis has greatly improved the detection of modifications in DNA and RNA and promoted functional studies of these modifications. Since the MS platforms used for the analysis of derivatives are inherently different, we categorize the methods by MS platform, i.e., LC/MS, MALDI/MS and GC/MS. The advantages and limitations of the different methods are discussed and summarized in Table 1.
2.1.1. DNA modifications
5-hmC is now viewed as the sixth base of the mammalian genome, besides A, C, T, G, and 5-mC [25]. The 5-hmC content in genomic DNA of mammalian cells is low, and the quantification of 5-hmC by LC/MS frequently suffers from ion suppression caused by the highly abundant canonical nucleosides. To address this issue, we developed a method that uses T4 β-glucosyltransferase to selectively add a glucosyl moiety to the hydroxymethyl group of 5-hmC, forming the more hydrophilic residue β-glucosyl-5-hydroxymethyl-2'-deoxycytidine (5-gmdC), which can be selectively enriched on NH2-silica via hydrophilic interaction prior to LC/MS analysis [26]. Ion suppression from normal nucleosides during MS analysis was thereby avoided, which contributed to the increased signal intensity of 5-hmC. With this method, 5-hmC was readily detected in the genomes of three cultured human cell lines as well as seven yeast strains, which was the first report of the existence of 5-hmC in the model organism yeast.
Active DNA demethylation through TET protein-catalyzed oxidation of 5-mC in mammals raised the possibility that a similar demethylation mechanism exists in plants. To achieve sensitive and quantitative analysis of the relatively low content of 5-foC and 5-caC in plant DNA, we used Girard’s reagent D (GirD), Girard’s reagent T (GirT), and Girard’s reagent P (GirP), which harbor easily ionized tertiary amine or quaternary ammonium groups, to react with the aldehyde group of 5-foC and the carboxyl group of 5-caC under mild conditions (Fig. 2A) [27]. With this highly sensitive detection method, 5-foC and 5-caC were distinctly detected in genomic DNA of Arabidopsis thaliana (Figs. 2B–E). It should be noted, however, that the derivatization of 5-foC and 5-caC had to be performed sequentially rather than in a single step, which makes the analytical procedure relatively complicated. Similarly, Hong et al. [28] determined 5-formyl-2'-deoxyuridine (5-foU) by derivatization with GirT, which harbors a pre-charged quaternary ammonium moiety. The combination of derivatization with LC/MS analysis allowed the quantification of 5-foU at a detection limit of 3-4 fmol, approximately 20-fold better than that of the direct analysis of native 5-foU.
Table 1 Summary of the advantages and limitations of different methods.
5-mC and its oxidation products (5-hmC, 5-foC and 5-caC) differ greatly in their abundance in DNA. Because direct and simultaneous quantification of these cytosine modifications is challenging, we developed 2-bromo-1-(4-dimethylaminophenyl)-ethanone (BDAPE) derivatization coupled with LC/MS analysis for the sensitive and simultaneous determination of all four cytosine modifications (5-mC, 5-hmC, 5-foC, and 5-caC) [29]. The derivatization reagent, which harbors a hydrophobic phenyl group and an easily chargeable tertiary amine group, can derivatize all four cytosine modifications simultaneously. BDAPE readily reacts with the 3-N and 4-N positions of cytosine to form a stable five-membered cyclic structure. The detection sensitivities of these modifications increased by 35- to 123-fold after BDAPE derivatization. With this method, we observed significant depletion of 5-hmC, 5-foC, and 5-caC in human colorectal carcinoma tissues compared to tumor-adjacent normal tissues, suggesting potential roles of these modifications in cancer formation. Since salts lower the efficiency with which BDAPE derivatizes the cytosine modifications, purification of the enzymatically digested nucleosides by SPE to remove salts was essential; this step, however, may introduce relatively large deviations.
In addition to BDAPE derivatization, Guo et al. [30] recently employed 4-(dimethylamino)benzoic anhydride to simultaneously derivatize the amino group of the four cytosine modifications (5-mC, 5-hmC, 5-foC, and 5-caC) in DNA. The limits of detection (LODs) ranged from 1.2 fmol to 2.5 fmol. With this method, they found that the contents of 5-foC and 5-caC increased in human breast cancer tissue compared with tumor-adjacent normal tissue. More recently, Hu et al. [31] developed a strategy of 6-methoxy-2-naphthyl glyoxal hydrate (MTNG) derivatization coupled with online solid-phase extraction (SPE) and LC/MS (SPE-LC/MS) for the analysis of 8-nitroguanine, a major mutagenic nucleobase modification generated by peroxynitrite. After derivatization, the sample was analyzed by online SPE-LC/MS, and the detection sensitivity increased approximately 10-fold. This method was then successfully used to explore the correlation between inflammation-related DNA damage and carcinogenesis. However, MTNG reacts with both 8-nitroguanine and other guanine compounds, so an excess of MTNG had to be added, and SPE was required to remove MTNG prior to MS analysis.
Fig. 2. (A) Schematic diagram of the determination of 5-foC and 5-caC in genomic DNA of plant samples using chemical derivatization with Girard’s reagents coupled with LC/MS analysis. Representative MRM chromatograms for the quantification of (B) 5-foC and (C) 5-caC in genomic DNA of Arabidopsis thaliana leaves, and of the derivatization products of the (D) 5-foC standard and (E) 5-caC standard. Copied with permission [27]. Copyright 2014, American Chemical Society.
2.1.2. RNA modifications
Active DNA demethylation in mammals can be achieved through oxidation of 5-mC by TET family proteins, generating 5-hmC, 5-foC, and 5-caC [11]. TET can also catalyze the formation of 5-hmC from 5-mC in RNA [17]. To explore the possible existence of 5-foC, the further oxidation product of 5-hmC, and to achieve simultaneous detection of 5-hmC and 5-foC in RNA, a strategy of oxidation-derivatization combined with MS (ODMS) was developed [32]. In this strategy, MnO2 was used to oxidize 5-hmC to 5-foC. The aldehyde group of 5-foC can be readily derivatized with the hydrazide moiety of dansylhydrazine (DNSH) to yield a hydrazone derivative bearing an easily chargeable tertiary amine, which increases the ionization efficiency of 5-foC during LC/MS analysis (Fig. 3A). With this ODMS strategy, we reported for the first time the presence of 5-foC in RNA of mammalian cells (Figs. 3B and C). Quantification showed that the levels of 5-foC in RNA were 9.0 ± 1.2 per 10^6 rG in HeLa cells and 8.5 ± 1.4 per 10^6 rG in 293T cells. The detection of 5-foC in cellular RNA, together with the presence of 5-hmC in RNA, suggested that TET family proteins may exert epigenetic regulation at both the DNA and RNA levels.
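Levels such as "modifications per 10^6 rG" are ratios of the measured amount of the modified nucleoside to the measured amount of a reference nucleoside. The following minimal Python sketch illustrates that normalization only; the function name and the input amounts are hypothetical and are not taken from ref. [32].

```python
def per_million(mod_amount_fmol: float, ref_amount_fmol: float) -> float:
    """Return the modification frequency per 10^6 reference nucleosides.

    Both amounts must be in the same unit (fmol here), e.g. as obtained
    from calibration curves of the modified and reference nucleosides.
    """
    if ref_amount_fmol <= 0:
        raise ValueError("reference amount must be positive")
    return mod_amount_fmol / ref_amount_fmol * 1e6


# Hypothetical measured amounts, chosen only for illustration.
foC_fmol = 4.5      # 5-foC found in the digest
rG_fmol = 5.0e5     # guanosine found in the same digest

level = per_million(foC_fmol, rG_fmol)
print(f"{level:.1f} 5-foC per 10^6 rG")
```

Expressing the result relative to a canonical nucleoside makes samples with different RNA input amounts directly comparable.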
Fig. 3. (A) Schematic diagram of the determination of 5-hmC and 5-foC in both DNA and RNA by oxidation-derivatization coupled with LC/MS analysis. (B) Extracted ion chromatograms of the 5-hydroxymethyl-2'-deoxycytidine standard (i) and 5-hmC detected in DNA of HeLa cells (ii) by the ODMS strategy, and endogenous 5-foC detected in DNA of HeLa cells by DNSH derivatization (iii). (C) Extracted ion chromatograms of the 5-hydroxymethylcytosine standard (i) and 5-hmC detected in RNA of HeLa cells (ii) by the ODMS strategy, and endogenous 5-foC detected in RNA of HeLa cells by DNSH derivatization (iii). Copied with permission [32]. Copyright 2016, Royal Society of Chemistry.
Fig. 4. (A) Chemical derivatization of the cytosine modifications in RNA. Extracted ion chromatograms of 5-mC, 5-hmC, 5-foC, and 5-caC before (B) and after (C) labeling by BDEPE under optimized conditions. Copied with permission [33]. Copyright 2016, Royal Society of Chemistry.
The discovery of 5-hmC and 5-foC in RNA indicated that 5-mC in RNA may undergo the same TET-mediated demethylation pathway as in DNA, generating 5-hmC, 5-foC, and 5-carboxylcytosine (5-caC). To explore the existence of 5-caC in RNA, we established 2-bromo-1-(4-diethylaminophenyl)-ethanone (BDEPE) derivatization coupled with LC/MS analysis for the sensitive and simultaneous determination of the oxidation products of 5-mC in RNA (Fig. 4A) [33]. The results demonstrated that the detection sensitivities of 5-mC, 5-hmC, 5-foC and 5-caC in RNA increased by 70- to 313-fold after BDEPE derivatization (Figs. 4B and C). Using this method, we confirmed the existence of 5-caC in mammalian RNA, indicating a possible demethylation pathway of 5-mC in RNA through oxidation by TET proteins. Using the same analytical strategy, we further demonstrated that the levels of 5-hmC, 5-foC, and 5-caC decreased significantly in both the DNA and RNA of mouse embryonic stem cells exposed to arsenic, cadmium, chromium, and antimony, which suggested a new mechanism of heavy metal toxicity through dysregulation of epigenetic modifications [34].
Beyond chemical derivatization-MS analysis alone, we recently developed a strategy of GirP derivatization combined with in-tube solid-phase microextraction (SPME) and LC/MS (SPME-LC/MS) analysis for the sensitive determination of DNA and RNA formylation (Fig. 5A) [35]. A monolith carrying negatively charged carboxyl groups enriches the positively charged derivatives and removes the highly abundant normal nucleosides, which further improved the detection performance for DNA and RNA formylation. Using this method, we were able to simultaneously detect six formylated nucleosides, including 5-foC and 5-formyl-2'-deoxyuridine (5-fodU) from DNA, and 5-foC, 5-foU, 2'-O-methyl-5-formylcytidine (5-foCm) and 2'-O-methyl-5-formyluridine (5-foUm) from RNA of cultured human cells and multiple mammalian tissues (Figs. 5B and C). The detection limits of these formylated nucleosides were improved by 307- to 884-fold. It is worth noting that 5-foU, 5-foCm and 5-foUm were discovered for the first time in cultured human cells and tissues. This method requires the preparation of a monolith carrying negatively charged groups, which is tedious and may limit its application in other laboratories.
Along with ESI-MS, inductively coupled plasma mass spectrometry (ICP-MS) is an alternative and complementary tool for the study of nucleic acid modifications. Wrobel et al. [36] proposed a strategy for the sensitive detection of 5-mC in RNA based on LC-ICP-MS. The strategy relies on derivatization of ribose with osmium (Os) through formation of a ternary complex between the cis-diol groups of ribose, K2OsO2(OH)4 and tetramethylethylenediamine. The detection limit obtained for 5-mC was 21 pmol/L, demonstrating sensitive quantification of 5-mC.
2.1.3. Free nucleoside/nucleotide modifications
Fig. 5. (A) Schematic illustration of the analytical procedure of GirP derivatization combined with in-tube SPME-LC/MS analysis for the sensitive determination of DNA and RNA formylation. (B) Extracted-ion chromatograms of GirP-labeled 5-foC and 5-foU from both DNA and RNA of human thyroid carcinoma tissue and d5-GirP-labeled nucleoside standards. (C) Extracted-ion chromatograms of GirP-labeled 5-foCm and 5-foUm from RNA of human thyroid carcinoma tissue. Copied with permission [35]. Copyright 2017, Elsevier.
Fig. 6. General procedure for the comprehensive profiling of modified nucleosides from biological fluids using metal oxide-based dispersive solid-phase extraction followed by stable isotope labeling and double neutral loss scan-mass spectrometry analysis. Copied with permission [37]. Copyright 2015, American Chemical Society.
Detection of endogenous modified nucleosides in biological fluids may serve as a non-invasive approach for disease diagnostics. We developed a strategy for the comprehensive profiling of modified nucleosides from biological fluids using metal oxide-based dispersive solid-phase extraction (DSPE) followed by stable isotope labeling and double neutral loss scan-mass spectrometry analysis (Fig. 6) [37]. Cerium dioxide (CeO2) was used to selectively capture ribose conjugates from complex biological samples under basic conditions. The enriched nucleosides were then derivatized with acetone and acetone-d6. The acetone- and acetone-d6-derivatized compounds were ionized under the same conditions but recorded separately in the mass spectra, which significantly improves the detection specificity and facilitates the identification of modified nucleosides. Using this method, we profiled the modified nucleosides in human urine and readily identified 49 ribose conjugates, 7 of which showed significant changes in content between healthy controls and lymphoma patients. Later, Li et al. [38] used a similar strategy to identify 52 modified nucleosides in urine samples. Since this analytical strategy depends on the enrichment of nucleosides that carry the cis-diol of ribose, 2'-O-methylated nucleosides cannot be captured and detected.
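The idea behind pairing light- and heavy-labeled signals can be illustrated with a short Python sketch. It assumes singly charged ions, assumes that all six deuterium atoms of acetone-d6 are retained in the label (so that a pair differs by 6 × (m_D − m_H), about 6.0377 Da), and uses hypothetical m/z lists and a hypothetical tolerance; it is not the data-processing procedure of ref. [37].

```python
# Monoisotopic masses of deuterium and protium in Da.
D_MINUS_H = 2.014102 - 1.007825
# Expected heavy/light spacing, assuming six retained deuterium atoms.
PAIR_SHIFT = 6 * D_MINUS_H


def find_label_pairs(light_mz, heavy_mz, tol=0.005):
    """Pair m/z values from the light and heavy scans that differ by PAIR_SHIFT.

    Assumes singly charged ions; tol is the allowed deviation in Da.
    """
    pairs = []
    for light in light_mz:
        for heavy in heavy_mz:
            if abs((heavy - light) - PAIR_SHIFT) <= tol:
                pairs.append((light, heavy))
    return pairs


# Hypothetical precursor m/z values from the two neutral loss scans.
light_scan = [285.1450, 308.1240, 299.1350]
heavy_scan = [291.1827, 305.1727, 320.0000]

pairs = find_label_pairs(light_scan, heavy_scan)
for light, heavy in pairs:
    print(f"candidate ribose conjugate: light {light:.4f} / heavy {heavy:.4f}")
```

Signals that appear in only one of the two scans, such as 308.1240 in this example, are not paired and would be treated as likely interferences.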
In addition to methyltransferase-mediated DNA and RNA methylation, premethylated nucleotides can potentially be incorporated into DNA and RNA during replication and transcription. To explore the possible existence of endogenous modified nucleotides, we established a method of N,N-dimethyl-p-phenylenediamine (DMPA) derivatization coupled with LC/MS for the sensitive and simultaneous determination of 10 nucleotides, including 5-methyl-2'-deoxycytidine monophosphate (5-Me-dCMP) and 5-methylcytidine monophosphate (5-Me-CMP) (Fig. 7A) [39]. After DMPA derivatization, the detection sensitivities of the nucleotides increased by 88- to 372-fold, which can be attributed to the introduction of a tertiary amine group and a hydrophobic moiety from DMPA. Using this method, we found that endogenous 5-Me-dCMP and 5-Me-CMP are widely present in cultured human cells, human tissues, and human urine samples (Fig. 7B). This study was the first report of the detection of endogenous 5-Me-dCMP and 5-Me-CMP in mammals. Very recently, we further established 8-(diazomethyl)quinoline derivatization in conjunction with LC/MS analysis for the determination of endogenous modified nucleoside triphosphates (NTPs) in mammalian cells and tissues [40]. The synthesized 8-(diazomethyl)quinoline reacts efficiently with the phosphate group of NTPs under mild conditions. The method allowed the sensitive detection of NTPs, with detection limits improved by 56- to 137-fold. With this method, 12 types of endogenous modified NTPs were distinctly detected in mammalian cells and tissues. These studies revealed the widespread existence of various modified NTPs in eukaryotes, which may provide a new source of modifications in DNA and RNA. However, 8-(diazomethyl)quinoline is relatively unstable because the diazo group degrades easily, so extra care is required when performing the derivatization reaction.
Fig. 7. (A) Chemical labeling of nucleotides by DMPA. “B” in nucleotides represents the nucleobase. (B) Extracted-ion chromatograms of DMPA-d4-labeled 5-Me-dCMP and 5-Me-CMP standards, and of DMPA-labeled 5-Me-dCMP and 5-Me-CMP from human urine, 293T cells and HeLa cells. Copied with permission [39]. Copyright 2017, American Chemical Society.
Pseudouridine (Ψ), the isomer of uridine in which the ribose is attached at C5 rather than N1 of the base, is a mass-silent modification in RNA, so conventional MS cannot distinguish Ψ from uridine. To address this issue, Patteson et al. [41] developed a method of 1-cyclohexyl-3-(2-morpholinoethyl)carbodiimide (CMC) derivatization combined with matrix-assisted laser desorption/ionization-MS (MALDI/MS) analysis for the determination of Ψ in RNA. After CMC derivatization, each Ψ exhibits a mass increase of 252 Da, which allows easy characterization of Ψ by MALDI/MS (Fig. 8). To determine the sequence location of Ψ, MALDI/MS analysis of RNase T1 digestion products was performed before and after CMC derivatization. However, CMC was also found to modify other uridine residues, which can be problematic when analyzing RNAs that contain many uridines. Later, the same group improved the CMC derivatization conditions by optimizing the reaction time, temperature, pH and the ratio of CMC to sample [42]. Under the optimized derivatization conditions, false positives caused by incomplete derivatization can be minimized. This approach can provide information on specific small RNAs that contain Ψ [43].
Fig. 8. Schematic illustration of CMC derivatization combined with MALDI/MS analysis for the determination of Ψ residues in RNA. After CMC derivatization, each Ψ residue exhibits a mass increase of 252 Da, which allows easy characterization of Ψ by MALDI/MS.
Specific cyanoethylation of N1 in Ψ has been known for decades [44]. In contrast to CMC derivatization, Ψ can be completely cyanoethylated while normal uridine is only about 5% modified, which enables a single-step derivatization without damaging the alkali-labile backbone of RNA. Using this derivatization strategy, Mengel-Jørgensen et al. [45] employed acrylonitrile to derivatize Ψ in tRNA. Cyanoethylation of Ψ leads to a mass increase of 53 Da, which can be easily characterized by MALDI/MS analysis. With this strategy, one Ψ in tRNATyrII from E. coli was successfully identified. MALDI/MS in combination with cyanoethylation has thus been demonstrated to be a useful complement to CMC derivatization with MS detection.
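Both labeling schemes are read out as a fixed mass increase per labeled residue (252 Da for CMC, 53 Da for cyanoethylation). A minimal Python sketch of this readout follows; it assumes singly charged ions, uses the nominal shifts quoted above, and the fragment masses and tolerance are hypothetical values chosen for illustration.

```python
CMC_SHIFT = 252.0         # Da per CMC adduct (nominal value from the text)
CYANOETHYL_SHIFT = 53.0   # Da per cyanoethyl group (nominal value from the text)


def count_adducts(mass_before, mass_after, shift, tol=0.5):
    """Estimate how many labels were added to a digestion fragment.

    Returns the integer number of adducts, or None when the observed mass
    difference is not a whole multiple of `shift` within `tol` Da.
    """
    delta = mass_after - mass_before
    n = round(delta / shift)
    if n < 0 or abs(delta - n * shift) > tol:
        return None
    return n


# Hypothetical RNase T1 fragment masses before and after derivatization.
print(count_adducts(1631.2, 1883.3, CMC_SHIFT))         # one CMC label
print(count_adducts(1631.2, 2135.5, CMC_SHIFT))         # two CMC labels
print(count_adducts(2254.4, 2307.4, CYANOETHYL_SHIFT))  # one cyanoethyl label
print(count_adducts(1631.2, 1731.2, CMC_SHIFT))         # shift does not match
```

Comparing the label count of each fragment with its known uridine content then indicates which fragments are candidates for carrying Ψ, subject to the side reactions with uridine discussed above.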
Gas chromatography-MS (GC/MS) has been widely used for the determination of trace amounts of compounds [46,47]. However, DNA and RNA modifications cannot be analyzed directly by GC/MS since these compounds are not volatile. Therefore, chemical derivatization is typically used to convert polar nucleobases/nucleosides into volatile derivatives before GC/MS analysis.
Singer et al. [48] developed a method for the detection of 5-mC in genomic DNA of calf thymus, salmon sperm and several mouse tissues. DNA was hydrolyzed into nucleobases with 88% formic acid at 180 °C, followed by bis(trimethylsilyl)trifluoroacetamide (BSTFA) derivatization. The resulting derivatives of dC and 5-mC can be easily separated without interference from dA, dG and T. As little as 1.6 pmol of 5-mC in DNA can be detected. Later, Romerio et al. [49] introduced two isotopically labelled internal standards, [2-13C]5-methylcytosine and [2-13C]cytosine, which provided more accurate quantification of 5-mC in genomic DNA. The method was successfully applied to the detection of 5-mC in DNA of peripheral blood mononuclear cells.
In 2012, our group developed a highly sensitive and reliable GC/MS method for the systematic study of the existence of 5-mC in yeast genomes [50]. The extracted DNA was hydrolyzed to purines and pyrimidines with 88% aqueous formic acid, followed by derivatization with BSTFA and chlorotrimethylsilane in acetonitrile. The results showed that the LOD of 5-mC was 0.8 pg (6.4 fmol). Using this method, we found that 5-mC was present in 19 yeast strains within the range of 0.014%-0.364%, indicating widespread DNA methylation in yeast genomes.
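The two LOD figures quoted for 5-mC (0.8 pg and 6.4 fmol) are related through the molar mass of the analyte. As a quick consistency check, the Python sketch below performs the conversion, assuming the LOD refers to the free 5-methylcytosine nucleobase (C5H7N3O, about 125.13 g/mol), since the DNA was hydrolyzed to nucleobases; the function name is illustrative.

```python
MW_5MC_BASE = 125.13  # g/mol, 5-methylcytosine nucleobase (C5H7N3O)


def pg_to_fmol(mass_pg: float, molar_mass: float) -> float:
    """Convert a mass in picograms to an amount in femtomoles.

    pg divided by g/mol gives pmol; multiplying by 1000 gives fmol.
    """
    return mass_pg / molar_mass * 1000.0


lod_fmol = pg_to_fmol(0.8, MW_5MC_BASE)
print(f"0.8 pg of 5-mC corresponds to about {lod_fmol:.1f} fmol")
```

The same conversion with the molar mass of a nucleoside instead of the nucleobase would give a smaller molar amount, so the analyte form should always be stated alongside a mass-based LOD.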
Recent advances reveal a dynamic view of DNA and RNA modifications, which increase the diversity of nucleic acids and, more importantly, add further layers to the regulation of physiological processes. Investigating the functions of DNA and RNA modifications provides valuable insights into the molecular mechanisms of diseases. Continued and rapid improvements in techniques are making the study of DNA and RNA modifications more accessible. In this review, we focused on the chemical derivatization-MS-based analytical strategy for deciphering DNA and RNA modifications.
The design and use of suitable derivatization reagents that achieve fast, efficient and specific labeling will be important for future studies of nucleic acid modifications by chemical derivatization-MS analysis. However, many challenges remain for the effective determination of DNA and RNA modifications. It should be noted that the limitations of derivatization-based MS include by-product formation and interference caused by excess derivatization reagents. Therefore, new derivatization reagents and reactions are still needed for the sensitive and selective detection of diverse nucleic acid modifications by MS. Analysis of the chemical structures of DNA and RNA modifications will provide useful guidance for preparing appropriate derivatization reagents. The development of new derivatization reactions may lead to the discovery of novel DNA and RNA modifications, which will provide valuable clues to the association of nucleic acid modifications with diseases.
In addition to deciphering modifications in DNA and RNA, chemical derivatization in combination with MS analysis can also be extended to other research fields, such as proteomics, metabolomics, and MS imaging.
Acknowledgments
This work was supported by the National Key R&D Program of China (No. 2017YFC0906800) and the National Natural Science Foundation of China (Nos. 21522507, 21672166, 21635006 and 21721005).
Chinese Chemical Letters, 2019, Issue 1