In a eukaryotic cell, the mRNA population constitutes approximately 1% of total RNA, with the number of transcripts varying from several thousand to several tens of thousands. Normally, the high abundance transcripts (several thousand mRNA copies per cell) of as few as 5-10 genes account for 20% of the cellular mRNA. The intermediate abundance transcripts (several hundred copies per cell) of 500-2000 genes constitute about 40-60% of the cellular mRNA. The remaining 20-40% of mRNA is represented by rare transcripts (from one to several dozen mRNA copies per cell) (Alberts et al., 1994). Such an enormous difference in abundance complicates large-scale transcriptome analysis, because the more abundant cDNAs are sequenced again and again.
cDNA normalization decreases the prevalence of high abundance transcripts and equalizes transcript concentrations in a cDNA sample, thereby dramatically increasing the efficiency of sequencing and rare gene discovery.
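The effect of abundance on sequencing effort can be illustrated with a simple sampling calculation. The sketch below assumes that reads are drawn at random and independently from the cDNA pool, so that a transcript with fractional frequency f is seen at least once in n reads with probability 1 - (1 - f)^n. The function name `reads_needed` and the cell parameters (300,000 mRNA molecules per cell, 10,000 distinct transcripts) are hypothetical values chosen for illustration only; they are not taken from the cited sources.

```python
import math


def reads_needed(frequency, confidence=0.95):
    """Random reads needed to observe a transcript at least once.

    frequency  -- fraction of all cDNA molecules belonging to the transcript
    confidence -- desired probability of seeing it at least once
    """
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - frequency)**n >= confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-frequency))


# Hypothetical numbers, used only to illustrate the trend.
MRNA_PER_CELL = 300_000        # assumed total mRNA molecules per cell
DISTINCT_TRANSCRIPTS = 10_000  # assumed number of different transcripts

# A single-copy transcript in a non-normalized sample.
before = reads_needed(1 / MRNA_PER_CELL)
# The same transcript in an idealized, perfectly equalized sample.
after = reads_needed(1 / DISTINCT_TRANSCRIPTS)

print(f"reads needed before normalization: {before}")
print(f"reads needed after ideal normalization: {after}")
```

Under these assumptions the required sequencing depth falls by roughly the ratio of the two frequencies. Real normalized samples are not perfectly equalized, so the actual gain is smaller.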
Normalization is used to raise the gene discovery rate of a cDNA library and to facilitate the identification and analysis of rare transcripts. This approach is essential for transcriptome sequencing and useful in other applications, such as functional screening, construction of specific RNA libraries, and Transcript End Sequence Profiling.
cDNA normalization using duplex-specific nuclease (DSN) is a highly efficient approach that can be applied to full-length-enriched cDNA (Zhulidov et al., 2004; Zhulidov et al., 2005). In the resulting cDNA, the abundance of different transcripts is equalized. This cDNA can be used for construction of cDNA libraries and for direct sequencing, including high-throughput sequencing on next-generation sequencing platforms (Roche/454, ABI/SOLiD or Illumina/Solexa).
The method is based on nucleic acid hybridization kinetics (Young and Anderson, 1985) and on the unique properties of DSN, which is specific to double-stranded (ds) DNA (Shagin et al., 2002).
DSN-normalization is performed prior to library cloning. The ds cDNA, flanked with known adapters, is denatured and then allowed to renature. During renaturation, abundant transcripts return to the ds form more readily than less frequent ones. Two fractions are thus formed: a ds fraction of abundant cDNA and a normalized single-stranded (ss) cDNA fraction. The ds cDNA fraction is then degraded by DSN.
Schematic outline of DSN-normalization.
Black lines represent abundant transcripts, grey lines – rare transcripts. Rectangles represent the adapter sequence and its complement.
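The equalizing effect of renaturation can be sketched with the textbook ideal second-order model of reassociation, in which the fraction of a sequence that is still single-stranded after time t is 1 / (1 + k * C0 * t), where C0 is its starting concentration and k is a rate constant. This is an illustrative simplification and not the published protocol: it assumes each transcript reanneals only with its own complement, it ignores length and sequence effects, and it treats all ds DNA as removed at a single time point. The function names, abundance values and the product k * t below are hypothetical.

```python
def single_stranded_fraction(c0, k, t):
    """Fraction of a sequence still single-stranded after time t.

    Ideal second-order reassociation: fraction = 1 / (1 + k * c0 * t).
    """
    return 1.0 / (1.0 + k * c0 * t)


def normalize(abundances, k, t):
    """Return the ss amount of each transcript left once ds DNA is removed.

    abundances -- dict mapping transcript name to starting concentration
                  (arbitrary units)
    """
    return {
        name: c0 * single_stranded_fraction(c0, k, t)
        for name, c0 in abundances.items()
    }


# Hypothetical starting pool spanning three orders of magnitude.
pool = {"abundant": 1000.0, "intermediate": 100.0, "rare": 1.0}

# Hypothetical kinetics: only the product k * t matters in this model.
remaining = normalize(pool, k=0.1, t=1.0)

for name, amount in remaining.items():
    print(f"{name}: {pool[name]:.1f} -> {amount:.3f}")

ratio_before = pool["abundant"] / pool["rare"]
ratio_after = remaining["abundant"] / remaining["rare"]
print(f"abundant/rare ratio: {ratio_before:.0f} before, {ratio_after:.1f} after")
```

In this model the remaining ss amount, C0 / (1 + k * C0 * t), approaches the ceiling 1 / (k * t) for abundant sequences, while rare sequences stay almost untouched. That is why the spread between abundant and rare transcripts narrows.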
DSN is an enzyme from the Red King (Kamchatka) crab that displays a strong preference for cleaving ds DNA over ss DNA and RNA, irrespective of the sequence length. DSN is stable at elevated temperatures (Shagin et al., 2002). Maximal DSN activity is observed at 60-65°C, and about 25% of the activity is retained after incubation at 70°C for 20 min. Owing to this thermostability, ds DNA can be degraded under the same conditions as cDNA renaturation, which prevent the formation of secondary structures and non-specific hybridization involving adapter sequences within the ss cDNA fraction.
The remaining normalized ss DNA is amplified by PCR. PCR primers and conditions are optimized to minimize the tendency of PCR to amplify shorter fragments more efficiently than longer ones. Normalized cDNA can then be used for library cloning or sequencing.
cDNA suitable for normalization can be prepared from total or poly(A)+ RNA and should contain known adapter sequences at both ends for PCR amplification. The quality of the RNA is crucial, especially when the goal is to construct a full-length-enriched cDNA library. The flanking sequences can be introduced at the cDNA ends by various means, for example, by adapter ligation or during cDNA synthesis using the template-switching approach.
DSN-normalization has been successfully applied to various animal and plant models (see Bogdanova et al., 2008 for review). The flexibility of this normalization procedure allows simple modifications for various purposes. Detailed protocols of DSN-normalization modifications are available in the book Nucleic Acids Hybridization: Modern Applications (Shcheglov et al., 2007) and in Current Protocols (Bogdanova et al., 2010).
Normalization significantly increases the gene discovery rate of a cDNA library.
Transcript distribution in standard and normalized cDNA libraries from Aplysia neurons: blue – unique sequences; green – non-unique sequences; yellow – all sequences.
Typical cDNA normalization result.
(A) Agarose/EtBr gel-electrophoresis of non-normalized (1) and normalized (2) human cDNA samples; (B, C) concentration of abundant transcripts in these samples revealed by Virtual Northern blot.
ACTB – β-actin; UBC – ubiquitin C; M, 1-kb DNA size markers (SibEnzyme); embr. – embryonic.