Normalizing GC content relative to the host chromosome (ΔGC) reveals a pronounced pattern: small ERs vary widely in GC content, whereas large ERs cluster tightly around ΔGC ≈ 0, consistent with earlier observations that large ERs tend to converge toward the host chromosome in base composition (Harrison et al., 2010). In contrast, normalizing ER size by chromosome size (%chr) changes the structure of the data far less, but we retain it to provide a standardized frame of reference across genomes that differ widely in chromosome length (Maddamsetti et al., 2025).
These trends are consistent with our similarity analyses, which showed strong GC similarity between ERs and their host chromosomes but no systematic relationship between ER size and chromosome size.
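The two normalizations can be illustrated with a minimal sketch. It assumes that ΔGC is the GC content of the ER minus that of its host chromosome (in percentage points) and that %chr is ER length expressed as a percentage of chromosome length; the exact definitions used in the analysis may differ. The `Replicon` record and the function names are hypothetical and serve only as illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Replicon:
    """Hypothetical record pairing one ER with its host chromosome."""
    er_gc: float      # GC content of the ER, in percent
    chr_gc: float     # GC content of the host chromosome, in percent
    er_length: int    # ER length in bp
    chr_length: int   # host chromosome length in bp


def delta_gc(r: Replicon) -> float:
    # Assumed definition: ER GC minus chromosome GC (percentage points).
    return r.er_gc - r.chr_gc


def pct_chr(r: Replicon) -> float:
    # Assumed definition: ER length as a percentage of chromosome length.
    return 100.0 * r.er_length / r.chr_length


def log10_pct_chr(r: Replicon) -> float:
    # log10 transform of %chr, useful when ER sizes span orders of magnitude.
    return math.log10(pct_chr(r))
```

Under these assumptions, an ER of 50 kb with 48.0% GC on a 5 Mb chromosome with 50.5% GC has ΔGC = -2.5 and %chr = 1.0, which is 0 on the log10 scale.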
For each eligible taxonomic group, we provide two interactive plots combining a scatter plot with kernel density estimates (KDE): one with %chr on a linear scale and one with log10-transformed %chr. All output files are organized below.
Harrison PW et al. 2010. Trends in Microbiology 18:141–148.
Maddamsetti R et al. 2025. Nature Communications 16:371.