Free keywords:
-
Abstract:
Bacterial insertion sequences (ISs) are selfish genetic elements that are able to spread inside a host genome by independent replication and transposition. In the very compact bacterial genomes, transposition frequently leads to massive fitness defects from insertion into genes essential for host survival. Many studies suggest that, for long-term persistence in a bacterial population, ISs rely on horizontal gene transfer (HGT) to constantly invade new host genomes. Thus, in general, the proportion of infected genomes stays low, and only some genomes carry few or many copies of IS-elements. Such a distribution, however, has been shown only for a few IS-families. Here I show that IS copy number distributions fall into at least two very distinct categories. I studied the distribution of IS-elements from eight different IS-families (IS1, IS110, IS1341, IS200, IS21, IS3, IS5, ISAs1) in a set of 300 E. coli genomes. Some families show an L-shaped distribution, with a high proportion of uninfected genomes and few genomes carrying multiple copies. Three IS-families showed a Poisson-like distribution, with the majority of genomes carrying some copies and a total absence of genomes with high copy numbers (more than seven). The L-shaped distributions are the result of high fitness costs per IS copy and strong HGT. The Poisson-like distributions, however, can presumably only be explained with additional parameters such as transposase downregulation or fitness benefits.
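
The contrast between the two distribution shapes can be illustrated with a minimal sketch. It is not the method used in the study: the function name `describe_copy_numbers`, the choice of summary statistics, and the toy copy-number lists are all assumptions made for illustration. The sketch compares the observed fraction of uninfected genomes with the fraction a Poisson distribution of the same mean would predict, and reports the variance-to-mean ratio (about 1 for a Poisson distribution, much larger for an L-shaped one).

```python
import math
from statistics import mean, pvariance


def describe_copy_numbers(counts):
    """Summarise IS copy numbers (one non-negative integer per genome).

    Hypothetical helper for illustration only; not taken from the study.
    """
    m = mean(counts)
    # Fraction of genomes carrying no copy of the IS family.
    zero_observed = sum(1 for c in counts if c == 0) / len(counts)
    # P(X = 0) for a Poisson distribution with the same mean.
    zero_poisson = math.exp(-m)
    # Variance-to-mean ratio; undefined when no genome carries a copy.
    dispersion = pvariance(counts) / m if m > 0 else float("nan")
    return {
        "mean": m,
        "zero_observed": zero_observed,
        "zero_poisson": zero_poisson,
        "dispersion": dispersion,
    }


# Invented toy data, NOT measurements from the 300 E. coli genomes.
l_shaped = [0, 0, 0, 0, 0, 0, 0, 0, 1, 11]      # mostly uninfected, one high-copy genome
poisson_like = [1, 2, 2, 3, 3, 3, 4, 4, 2, 1]   # most genomes carry a few copies

for name, counts in [("L-shaped", l_shaped), ("Poisson-like", poisson_like)]:
    print(name, describe_copy_numbers(counts))
```

In the toy L-shaped list, the observed fraction of uninfected genomes is far above the Poisson expectation and the variance greatly exceeds the mean, whereas the toy Poisson-like list has no excess of uninfected genomes and no high-copy tail.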