Speakers

Eugene Korotkov
Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Russia

Title: Search for dispersed repeats in bacterial genomes

Abstract:

We have developed a de novo method for the identification of dispersed repeats based on the use of random position-weight matrices (PWMs) and an iterative procedure (IP). The resulting algorithm (the IP method) allows detection of dispersed repeats for which the average number of substitutions per nucleotide between any two repeats (x) is less than or equal to 1.5. The IP method made it possible to detect families of dispersed repeats in bacterial genomes that have not been previously reported. We applied this method to find dispersed repeats in the genomes of E. coli and nine other bacterial species and identified some repeat families comprising over 10³ repeats with lengths between 400 and 600 bases. In E. coli, the identified repeat families occupy about half of the genome. Such extensive repeat families could not be detected in the E. coli genome using the RED, RECON, or Repeat_masker programs, but only by the IP method, which could find de novo repeat families with x ≤ 1.5, whereas the other programs could do so only for x ≤ 1.0. Since families of dispersed repeats were found not only in the genome of E. coli but also in nine other bacterial species, it is possible that the detected repeat families are involved in the creation of the liquid crystal structure within bacterial DNA through interactions between repeats within a family.
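For readers unfamiliar with position-weight matrices, the following is a minimal illustrative sketch of how a PWM can score windows of a DNA sequence. It is not the IP method described in the abstract: the function names, the pseudocount, the uniform background frequency of 0.25, and the score threshold are all assumptions chosen for illustration, and the PWM here is built from a few aligned example sequences rather than generated at random and refined iteratively.

```python
import math

ALPHABET = "ACGT"


def build_pwm(aligned, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sequences.

    Assumption: uniform background and a fixed pseudocount (illustrative only).
    Returns one dict per position, mapping base -> log2 odds score.
    """
    length = len(aligned[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so no frequency is zero.
        counts = {base: pseudocount for base in ALPHABET}
        for seq in aligned:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {b: math.log2(counts[b] / total / background) for b in ALPHABET}
        )
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(pwm, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical toy data: three aligned 4-base sequences and a short target.
pwm = build_pwm(["ACGT", "ACGT", "ACGA"])
hits = scan(pwm, "TTACGTTT", threshold=2.0)
```

In this toy setting, `hits` lists the start positions whose windows resemble the aligned examples. A de novo search of the kind described above would additionally have to refine the matrix itself from the hits it finds, which this sketch does not attempt.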

Biography:

Eugene Korotkov is a Professor at the Department of Applied Mathematics of the Moscow Engineering Physics Institute and a Principal Investigator in the Bioinformatics Department of the Bioengineering Centre, Russian Academy of Sciences. He graduated from the National Nuclear Research University (MEPI), Department of Experimental and Theoretical Physics, in 1974. From 1980 to 1995, Korotkov worked at the N. N. Semenov Institute of Chemical Physics, where he began developing mathematical algorithms to study symbolic sequences