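The typeset final version of this article has been published online first on CNKI (China National Knowledge Infrastructure). To read the full text, open the CNKI homepage and search for the paper title.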
基于丹参转录组密码子的使用偏好性分析
Analysis of Codon Usage Bias in Salviae miltiorrhizae Based on Transcriptome Data
Abstract: To characterize codon usage bias in the transcriptome of Salvia miltiorrhiza, high-quality coding sequences (CDS) were analyzed systematically using Codon W software and Perl scripts. The overall guanine-cytosine (GC) content of the coding sequences ranged from 25.74% to 71.69%, with an average of 48.61%. GC content was unevenly distributed across codon positions: the mean GC contents at the first (GC1) and third (GC3) codon positions were similar, and both were significantly higher than that at the second position (GC2). Neutrality-plot and ENC-plot analyses confirmed that natural selection is the core driving force shaping codon usage bias in S. miltiorrhiza, and correspondence analysis (COA) further clarified the regulatory role of genomic base composition in codon usage patterns. The codon adaptation index (CAI) ranged from 0.073 to 0.525 and the effective number of codons (ENC) from 24.18 to 61.00, indicating a weak overall codon usage bias. Relative synonymous codon usage (RSCU) analysis identified 18 optimal codons (e.g., GCA, GCU, AGA), all of which end in A or U except UUG. These results clarify the codon usage characteristics of the S. miltiorrhiza transcriptome and the dominant factors shaping them, providing a theoretical basis for germplasm identification, genetic engineering improvement, and optimization of heterologous expression vectors.
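To make two of the quantities reported in the abstract concrete, the following is a minimal Python sketch of position-specific GC content (GC1, GC2, GC3) and RSCU. It is an illustration only and not the authors' Codon W and Perl pipeline: the function names (`split_codons`, `gc_by_position`, `rscu`) are hypothetical, the input is assumed to be in-frame CDS under the standard genetic code, and stop codons are excluded from both calculations as a simplifying assumption. RSCU here is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally.

```python
from collections import Counter, defaultdict

# Standard genetic code in the conventional TCAG ordering (DNA alphabet).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split an in-frame CDS into sense codons (stops and ambiguous codons dropped)."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return [c for c in codons if CODON_TABLE.get(c, "*") != "*"]


def gc_by_position(cds):
    """Return (GC1, GC2, GC3) as fractions of G or C at each codon position."""
    codons = split_codons(cds)
    if not codons:
        return (0.0, 0.0, 0.0)
    return tuple(
        sum(c[pos] in "GC" for c in codons) / len(codons) for pos in range(3)
    )


def rscu(cds_list):
    """Return {codon: RSCU} pooled over all sequences in cds_list."""
    counts = Counter(c for cds in cds_list for c in split_codons(cds))
    families = defaultdict(list)  # amino acid -> its synonymous codons
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the data set
        for c in family:
            # observed count / (family total / number of synonymous codons)
            values[c] = counts[c] * len(family) / total
    return values


if __name__ == "__main__":
    toy_cds = "ATGGCAGCAGCTTAA"  # made-up sequence, not data from the study
    print(gc_by_position(toy_cds))
    print({k: round(v, 2) for k, v in rscu([toy_cds]).items()})
```

An RSCU value above 1 means the codon is used more often than expected under equal synonymous usage, and a value below 1 means it is used less often. Identifying optimal codons, as the abstract describes, additionally requires comparing usage between gene sets, which this sketch does not attempt.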