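The typeset final version of this paper has been published online first on CNKI (China National Knowledge Infrastructure). To read the full text, open the CNKI homepage and search for the paper title.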

Analysis of Codon Usage Bias in the Chloroplast Genomes of Three Verbascum Species

  • Abstract: To explore codon usage bias in the chloroplast genomes of Verbascum species and the main factors shaping it, and to provide a theoretical basis for developing and optimizing chloroplast genetic engineering in Verbascum, the chloroplast protein-coding sequences (CDS) of Verbascum thapsus, Verbascum phoeniceum, and Verbascum songaricum were downloaded from the NCBI database. Duplicated CDS and CDS shorter than 300 bp were removed with Geneious v.7.1.3, and the remaining sequences were used for subsequent analyses. Codon W 1.4.2, CUSP, and SPSS 26 were used to calculate the effective number of codons (ENC), relative synonymous codon usage (RSCU), the GC content at the first, second, and third codon positions (GC1, GC2, and GC3, respectively), and the mean GC content (GCall). Neutrality plot, PR2-plot, and ENC-plot analyses were then used to characterize codon usage patterns in the three chloroplast genomes and to screen for optimal codons. The results showed that the complete chloroplast genomes of the three species were 153338 bp, 153348 bp, and 153291 bp long, respectively. The ENC values of the protein-coding genes of all three species exceeded 35, indicating weak codon usage bias. GCall was 38.31%, 38.00%, and 38.00%, respectively, with GC1 (46.83%) > GC2 (39.63%) > GC3 (28.49%) in V. thapsus, GC1 (46.12%) > GC2 (38.40%) > GC3 (29.50%) in V. phoeniceum, and GC1 (46.07%) > GC2 (38.38%) > GC3 (29.56%) in V. songaricum, indicating that GC content is unevenly distributed across codon positions and differs among species. Neutrality plot, ENC-plot, and PR2-plot analyses indicated that natural selection was the main factor influencing codon usage bias in the chloroplast genomes of the three species. In addition, RSCU and ENC values were used to identify 11, 15, and 11 optimal codons in V. thapsus, V. phoeniceum, and V. songaricum, respectively. Nine optimal codons were shared by all three species (AUA, UCC, GCC, AAU, GAU, UGC, UGA, CGU, and AGU), and most of them end in A/U.
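The indices named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used Geneious, Codon W, CUSP, and SPSS. All function names here are hypothetical, the standard amino-acid assignments are assumed (bacterial/plastid translation table 11 uses the same assignments), and stop codons are treated as one synonymous family, which is an assumption. The expected-ENC curve is the standard formula of Wright (1990); whether the paper plotted GC3 or synonymous-only GC3s on the ENC-plot is not stated in the abstract.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (assumed; table 11 has the same assignments).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    """Split a CDS into codons, ignoring any trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def filter_cds(seqs, min_len=300):
    """Drop duplicated sequences and sequences shorter than min_len bp."""
    seen, kept = set(), []
    for seq in seqs:
        seq = seq.upper()
        if len(seq) >= min_len and seq not in seen:
            seen.add(seq)
            kept.append(seq)
    return kept


def gc_by_position(cds):
    """Return (GC1, GC2, GC3, GCall) in percent for one CDS."""
    codons = split_codons(cds)
    gc = [100 * sum(c[pos] in "GC" for c in codons) / len(codons)
          for pos in range(3)]
    return gc[0], gc[1], gc[2], sum(gc) / 3


def rscu(cds_list):
    """RSCU = observed codon count / mean count over its synonymous family."""
    counts = Counter(c for cds in cds_list for c in split_codons(cds)
                     if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # family absent from the input; RSCU undefined
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def expected_enc(gc3s):
    """Expected ENC under mutation pressure alone; gc3s is a fraction in [0, 1]."""
    return 2 + gc3s + 29 / (gc3s ** 2 + (1 - gc3s) ** 2)
```

For example, `gc_by_position("ATGGCCGCTGCA")` gives GC1 = 75%, GC2 = 75%, and GC3 = 50%. An RSCU above 1 means a codon is used more often than expected under equal usage within its synonymous family, and genes whose observed ENC falls well below `expected_enc` on an ENC-plot are usually read as being shaped by selection rather than by base composition alone.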

     
