AAPPS Bulletin Vol.13 No.2 April 2003
Articles
Genomes Are Large Systems with Small-system Statistics: Segmental Duplications in the Growth of Microbial Genomes
Li-Ching Hsieh, Liaofu Luo, and H. C. Lee
National Central University, Taiwan; Inner Mongolia University, China; Université de Montréal, Canada
abstract---
We show that textual analysis of microbial genomes reveals telling footprints of the early evolution
of the genomes. The frequencies of word occurrence of random DNA sequences considered as
texts in their four nucleotides are expected to obey Poisson distributions. We find that, for
words shorter than nine letters, the average width of the distributions for complete microbial genomes
is many times that of a Poisson distribution. We interpret this phenomenon as follows: the genome
is a large system that possesses the statistical characteristics of a much smaller “random”
system, and certain textual statistical properties of genomes we now see are remnants of those of
their ancestral genomes, which were much shorter than the genomes are now. This interpretation
suggests a simple biologically plausible model for the growth of genomes: the genome first grows
randomly to an initial length of approximately one thousand nucleotides (1k nt), or about one
thousandth of its final length, and thereafter grows mainly by random segmental duplication. We
show that, with duplicated segments averaging about 25 nt, the generated model sequences
possess statistical properties characteristic of present-day genomes. Both the initial length and
the duplicated segment length support an RNA world at the time duplication began. Random
segmental duplication would greatly enhance the ability of a genome to use its hard-to-acquire
codes repeatedly, and a genome that practiced it would have evolved enormously faster than those
that did not.
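
The growth model and the width comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the uniform segment-length distribution (chosen only so that the mean is about 25 nt), the equal base probabilities, the insertion of each copy at a random position, and the reduced final length are all assumptions made for illustration. The "width ratio" is the standard deviation of the k-letter word counts divided by the square root of their mean, which is close to 1 when the counts are Poisson-like.

```python
import itertools
import math
import random
import statistics
from collections import Counter

ALPHABET = "ACGT"


def random_sequence(length, rng):
    """Random sequence with equal base probabilities (an assumption)."""
    return "".join(rng.choices(ALPHABET, k=length))


def grow_by_duplication(seed, final_length, mean_segment, rng):
    """Grow `seed` to `final_length` by random segmental duplication.

    Each step copies a randomly chosen segment and inserts the copy at a
    random position. Segment lengths are drawn uniformly from
    1 .. 2*mean_segment - 1 so that their mean is `mean_segment`; this
    distribution is an illustrative assumption, not taken from the paper.
    """
    seq = list(seed)
    while len(seq) < final_length:
        seg_len = rng.randint(1, 2 * mean_segment - 1)
        # Clip so the segment fits in the sequence and we stop at final_length.
        seg_len = min(seg_len, len(seq), final_length - len(seq))
        start = rng.randrange(len(seq) - seg_len + 1)
        segment = seq[start:start + seg_len]
        insert_at = rng.randrange(len(seq) + 1)
        seq[insert_at:insert_at] = segment
    return "".join(seq)


def word_counts(seq, k):
    """Occurrence counts of all 4**k words of length k (zeros included)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get("".join(w), 0)
            for w in itertools.product(ALPHABET, repeat=k)]


def width_ratio(seq, k):
    """Std. dev. of the word counts divided by the Poisson width sqrt(mean)."""
    counts = word_counts(seq, k)
    return statistics.pstdev(counts) / math.sqrt(statistics.fmean(counts))


if __name__ == "__main__":
    rng = random.Random(2003)
    # 1k nt random seed, as in the abstract; the final length is kept far
    # below a real genome so that this pure-Python sketch runs quickly.
    seed = random_sequence(1000, rng)
    model = grow_by_duplication(seed, 50_000, 25, rng)
    control = random_sequence(len(model), rng)
    for k in (2, 3, 4):
        print(k, round(width_ratio(control, k), 2),
              round(width_ratio(model, k), 2))
```

In this sketch the purely random control should give ratios near 1, while the duplicated sequence should give a clearly larger ratio, because the word-count fluctuations of the short seed are amplified as the sequence grows.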