Plant Genetic and Epigenetic Research Based on Second-Generation Sequencing Technology

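Abstract

Following first-generation sequencing technologies such as the chain-termination method and the chemical sequencing method, second-generation sequencing technology, which achieves high-throughput parallel sequencing, was commercialized starting in 2004. Because its reads are short, it is well suited to deep sequencing of the small RNAs that occur naturally in organisms; as read lengths have gradually increased and algorithms for aligning short reads back to the genome have matured, it is now widely applied in biological research involving genome-wide analysis. In plant genetics, for example, whole-genome resequencing of different populations or natural mutants offers a new technical approach to forward mapping and genome-wide association analysis of various phenotypes, especially quantitative traits (QTL), and accelerates the identification of the genomic regions that harbor phenotype-associated genes. In epigenetics, histone-bound DNA is enriched with antibodies and then sequenced, or DNA is treated with bisulfite reagent and then sequenced, so that changes in differently modified histones along the chromosomes, as well as epigenetic modifications such as genome-wide DNA methylation, can be observed dynamically across developmental stages or growth environments. Sequencing of various RNAs has revealed a growing number of miRNAs and functional siRNAs that act in transcriptional or post-transcriptional regulation, and transcriptome sequencing has further shown that multiple splice variants are common at a single gene locus and that many non-coding regions of the genome also have potential transcriptional activity. In addition, as a method for studying protein-DNA interactions, ChIP-seq has been used to identify, one after another, the specific cis-elements bound by key transcription factors. Second-generation sequencing produces massive amounts of data, and the corresponding algorithms and software are continually being updated and optimized.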



Keywords: second-generation sequencing technology; high throughput; genetics; epigenetics



Abstract

Following first-generation sequencing technologies such as the chain-termination method and the chemical sequencing method, second-generation sequencing technology, which enables massively parallel sequencing, became commercially available in 2004. Because of its short read length, it is well suited to deep sequencing of the small RNAs that occur naturally in organisms. With increasing read length and maturing algorithms for aligning short reads to the genome, it is now widely applied to biological research involving genome-wide analysis. For example, in plant genetics, whole-genome resequencing of different populations or natural mutants provides a new method for forward mapping and genome-wide association studies of phenotypes, especially quantitative traits, and accelerates the identification of genomic regions containing phenotype-associated genes. In plant epigenetics, DNA bound by differently modified histones is enriched with specific antibodies, or DNA is treated with bisulfite, and then sequenced; in this way, changes in modified histones along the chromosomes and genome-wide DNA methylation can be observed dynamically across developmental stages or growth environments. RNA sequencing has uncovered a growing number of miRNAs and functional siRNAs involved in transcriptional or post-transcriptional regulation; transcriptome sequencing also shows that most gene loci produce multiple splice variants and that many non-coding regions of the genome have potential transcriptional activity. Finally, as a tool for studying protein-DNA interactions, ChIP-seq has been used to identify the specific cis-elements bound by many key transcription factors. Second-generation sequencing generates massive amounts of data, and the corresponding algorithms and software are continually being updated and optimized.



Keywords: second-generation sequencing technology, high-throughput, genetics, epigenetics