Assessing ER–chromosome GC convergence across phyla

ERs evolve within the same genomic environment as their host chromosomes and may therefore reflect shared genomic constraints. To evaluate this, we tested whether the GC content of each ER is closer to that of its own chromosome than to that of chromosomes from other genomes.

For each ER, we identified the largest chromosome of its genome and calculated the absolute GC difference, |GC_ER − GC_chr| (matched). These matched differences were compared to a null distribution of unmatched differences, generated by pairing each ER with a randomly selected chromosome from a different genome.
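The pairing step can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the input layout (a list of `(genome_id, er_gc)` tuples and a dictionary of largest-chromosome GC values), the function name `gc_differences`, and the fixed seed are all assumptions made for the example.

```python
import random

def gc_differences(ers, chrom_gc, seed=0):
    """Return (matched, unmatched) lists of absolute GC differences.

    ers      : list of (genome_id, er_gc) tuples (hypothetical layout)
    chrom_gc : dict mapping genome_id -> GC of that genome's largest chromosome
    seed     : seed for the random pairing (an assumption, for reproducibility)
    """
    rng = random.Random(seed)
    genomes = sorted(chrom_gc)
    if len(genomes) < 2:
        raise ValueError("need at least two genomes to build unmatched pairs")
    matched, unmatched = [], []
    for genome_id, er_gc in ers:
        # Matched: the ER against the largest chromosome of its own genome.
        matched.append(abs(er_gc - chrom_gc[genome_id]))
        # Unmatched: the ER against a chromosome drawn from a different genome.
        other = rng.choice([g for g in genomes if g != genome_id])
        unmatched.append(abs(er_gc - chrom_gc[other]))
    return matched, unmatched

# Toy values, invented for illustration only.
chrom_gc = {"gA": 40.0, "gB": 65.0, "gC": 52.0}
ers = [("gA", 41.5), ("gB", 63.0), ("gC", 50.0)]
matched, unmatched = gc_differences(ers, chrom_gc)
```

Each ER contributes exactly one matched and one unmatched value under this scheme, so the two samples have equal size.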

Matched and unmatched GC differences were compared using a one-sided Mann–Whitney U test (alternative hypothesis: matched < unmatched). Effect sizes were quantified using Cliff’s delta, which measures the magnitude of separation between the two distributions; negative values indicate that matched differences tend to be smaller than unmatched ones. Analyses were performed globally and for each phylum containing at least 100 ERs.
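A minimal, standard-library sketch of the two statistics is given below. It assumes a normal approximation with tie and continuity corrections for the p-value, which may differ from the routine actually used; in practice a library function such as `scipy.stats.mannwhitneyu` with `alternative="less"` would typically be called instead. The function name `mann_whitney_less` is hypothetical.

```python
import math

def mann_whitney_less(x, y):
    """One-sided Mann-Whitney U test of x < y (normal approximation).

    Returns (U, p, delta): U is the statistic for sample x, p is the
    lower-tail p-value, and delta is Cliff's delta = 2U/(n1*n2) - 1.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0       # average of ranks i+1 .. j
        t = j - i                         # size of this tie group
        tie_term += t**3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0  # counts pairs with x > y (ties as 0.5)
    mean = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        raise ValueError("all values are tied; the test is undefined")
    z = (u + 0.5 - mean) / math.sqrt(var)   # continuity-corrected lower tail
    p = 0.5 * math.erfc(-z / math.sqrt(2))  # P(Z <= z) for a standard normal
    delta = 2.0 * u / (n1 * n2) - 1.0
    return u, p, delta

# Toy input: every x lies below every y, so U = 0 and delta = -1.
u, p, delta = mann_whitney_less([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With the matched differences passed as `x` and the unmatched differences as `y`, a small p-value and a negative delta both point to matched differences being the smaller of the two.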

Summary table

| Scope | Median GC difference (matched) | Median GC difference (unmatched) | U | p | Significance | Cliff’s delta | Effect strength |
|---|---|---|---|---|---|---|---|
| Global | 3.00717 | 10.9297 | 555649966.5 | 0 (underflow) | significant | -0.628723 | large |
| Actinomycetota | 2.2472 | 4.37875 | 1030527 | 2.03e-88 | significant | -0.380502 | moderate |
| Bacillota | 2.984165 | 4.84961 | 38293239 | 1.45e-310 | significant | -0.300550 | moderate |
| Bacteroidota | 2.28225 | 7.457735 | 74544 | 1.03e-61 | significant | -0.559854 | large |
| Campylobacterota | 2.56675 | 4.09323 | 83109 | 1.58e-16 | significant | -0.302027 | moderate |
| Cyanobacteriota | 1.06515 | 4.810915 | 47887.5 | 3.18e-58 | significant | -0.594510 | large |
| Deinococcota | 1.3014 | 3.25806 | 16720 | 2.12e-08 | significant | -0.302767 | moderate |
| Pseudomonadota | 3.23165 | 7.20218 | 419902480.5 | 0 (underflow) | significant | -0.440626 | moderate |
| Spirochaetota | 1.37855 | 4.7081 | 512120 | 8.16e-146 | significant | -0.542345 | large |
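The U and Cliff’s delta columns can be cross-checked against each other. The sketch below assumes that delta was derived as 2U/(n_m · n_u) − 1, with U computed for the matched sample, and that the matched and unmatched samples are the same size; neither assumption is stated in the source, and the resulting sample size is an inferred figure rather than a reported one.

```python
import math

# Values from the Actinomycetota row (delta rounded to six decimals).
u = 1030527.0
delta = -0.380502

# Rearranging delta = 2U/(n_m * n_u) - 1 gives the implied number of pairs.
pairs = 2.0 * u / (1.0 + delta)

# If both samples have the same size, that size is the square root of pairs.
n_equal = math.sqrt(pairs)
```

Here `n_equal` lands very close to a whole number (about 1824), which is what one would expect if each ER contributes one matched and one unmatched difference.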

Global GC similarity

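Matched GC differences were significantly smaller than unmatched differences (p reported as 0 because of floating-point underflow, i.e. below the smallest representable value), with a large effect size (Cliff’s delta = -0.629).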

Phylum-level GC similarity

Phylum: Actinomycetota

GC content similarity

Matched GC differences were significantly smaller than unmatched differences (p = 2.0e-88), with a moderate effect size (Cliff’s delta = -0.381).

Phylum: Bacillota

GC content similarity

Matched GC differences were significantly smaller than unmatched differences (p = 1.5e-310), with a moderate effect size (Cliff’s delta = -0.301).

Phylum: Bacteroidota

GC content similarity

Matched GC differences were significantly smaller than unmatched differences (p = 1.0e-61), with a large effect size (Cliff’s delta = -0.560).

Phylum: Campylobacterota

GC content similarity

Matched GC differences were significantly smaller than unmatched differences (p = 1.6e-16), with a moderate effect size (Cliff’s delta = -0.302).

Phylum: Cyanobacteriota

GC content similarity

Matched GC differences were significantly smaller than unmatched differences (p = 3.2e-58), with a large effect size (Cliff’s delta = -0.595).

Phylum: Deinococcota

GC content similarity

Matched GC differences were significantly smaller than unmatched differences (p = 2.1e-08), with a moderate effect size (Cliff’s delta = -0.303).

Phylum: Pseudomonadota

GC content similarity

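Matched GC differences were significantly smaller than unmatched differences (p reported as 0 because of floating-point underflow, i.e. below the smallest representable value), with a moderate effect size (Cliff’s delta = -0.441).

Phylum: Spirochaetota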

GC content similarity

Matched GC differences were significantly smaller than unmatched differences (p = 8.2e-146), with a large effect size (Cliff’s delta = -0.542).