Background/objectives: Codon usage bias affects gene expression and translation efficiency across species. The effective number of codons (ENC) and GC content influence codon preference, often displaying unimodal or bimodal distributions. This study investigates the correlation between ENC and GC rankings across species and how their relationship affects codon usage distributions.
Methods: I analyzed nuclear-encoded genes from 17 species representing six kingdoms: one bacterium (Escherichia coli), three fungi (Saccharomyces cerevisiae, Neurospora crassa, and Schizosaccharomyces pombe), one archaeon (Methanococcus aeolicus), three protists (Rickettsia hoogstraalii, Dictyostelium discoideum, and Plasmodium falciparum), three plants (Musa acuminata, Oryza sativa, and Arabidopsis thaliana), and six animals (Anopheles gambiae, Apis mellifera, Polistes canadensis, Mus musculus, Homo sapiens, and Takifugu rubripes). Genes in all 17 species were ranked by GC content and by ENC, and the correlation between the two rankings was assessed. I also examined how adding or subtracting these rankings changed their overall distribution, using a new method that I call Two-Rank Order Normalization (TRON). TRON is defined as $\mathrm{TRON} = \left( \sum_{i=1}^{N} \left| \mathrm{GC\,rank}_i - \mathrm{ENC\,rank}_i \right| \right) / (N^2/3)$, where $N$ is the number of genes, $\mathrm{GC\,rank}_i$ is the GC rank of gene $i$, and $\mathrm{ENC\,rank}_i$ is the ENC rank of the same gene, with genes sorted by GC rank. The denominator, $N^2/3$, is the normalization factor because it is the expected value of the sum of $|\mathrm{GC\,rank} - \mathrm{ENC\,rank}|$ over all genes when GC rank and ENC rank are not correlated.
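The TRON calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names `rank_order` and `tron` are hypothetical, ranks are assigned in ascending order starting at 1, and ties are broken by input order, a choice the Methods do not specify.

```python
import random


def rank_order(values):
    """Return 1-based ascending ranks; ties are broken by input order (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks


def tron(gc_values, enc_values):
    """Sum of |GC rank - ENC rank| over all genes, divided by N^2 / 3."""
    if len(gc_values) != len(enc_values):
        raise ValueError("GC and ENC lists must describe the same genes")
    n = len(gc_values)
    gc_rank = rank_order(gc_values)
    enc_rank = rank_order(enc_values)
    total = sum(abs(g - e) for g, e in zip(gc_rank, enc_rank))
    return total / (n ** 2 / 3)


if __name__ == "__main__":
    # Synthetic values for illustration only; these are not data from the study.
    rng = random.Random(0)
    gc = [rng.random() for _ in range(2000)]
    enc_same = list(gc)                         # rankings agree perfectly
    enc_reversed = [-x for x in gc]             # rankings are exactly reversed
    enc_random = [rng.random() for _ in gc]     # rankings are unrelated
    print(tron(gc, enc_same), tron(gc, enc_reversed), tron(gc, enc_random))
```

Under this definition, identical rankings give a TRON of 0, fully reversed rankings give a value approaching 1.5 (the reversed sum is about N²/2), and unrelated rankings give a value near 1, which is what the N²/3 normalization is meant to achieve.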
Results: First, ENC and GC rankings are positively correlated (i.e., ENC increases as GC increases) in AT-rich species such as honeybees ($R^2$ = 0.60, slope = 0.78) and wasps ($R^2$ = 0.52, slope = 0.72), and negatively correlated (i.e., ENC decreases as GC increases) in GC-rich species such as humans ($R^2$ = 0.38, slope = -0.61) and rice ($R^2$ = 0.59, slope = -0.77). Second, the distributions of the rank difference (GC rank minus ENC rank) change from unimodal to bimodal as GC content increases across the 17 species. Third, the distributions of the rank sum (GC rank plus ENC rank) change from bimodal to unimodal as GC content increases across the 17 species. Fourth, the slopes of the GC-versus-ENC correlations in all 17 species are negatively correlated with TRON ($R^2$ = 0.98) (see Graphic Abstract).
Conclusions: The correlation between ENC rank and GC rank differs among species, shaping codon usage distributions in opposite ways depending on whether a species' nuclear-encoded genes are AT-rich or GC-rich. Understanding these patterns might provide insights into translation efficiency, epigenetics mediated by CpG DNA methylation, epitranscriptomics of RNA modifications, RNA secondary structures, evolutionary pressures, and potential applications in genetic engineering and biotechnology.