Person: Korotkov, Evgeny Vadimovich
Organizational units
Organizational unit
Institute for Laser and Plasma Technologies
The strategic goal of the LaPlas Institute is to become a leading scientific school and a core of innovation development in laser, plasma, radiation, and accelerator technologies, with unique educational programs that are in demand on the Russian and global markets for educational services.
Last name
Korotkov
First name
Evgeny Vadimovich
Search results
Showing 1 - 10 of 10
- Publication (open access): Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes (2019)
  Suvorova, Y. M.; Skryabin, K. G.; Korotkova, M. A.; Korotkov, E. V.
  © The Author(s) 2019. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
  A new mathematical method for detecting potential reading frameshifts in protein-coding sequences (cds) was developed. The algorithm is adjusted to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm, and it requires no preliminary training. Using the developed method, cds from the Arabidopsis thaliana genome were analysed. In total, the algorithm found 9,930 sequences containing one or more potential reading frameshifts, which is ∼21% of all analysed sequences of the genome. The Type I and Type II error rates were estimated as 11% and 30%, respectively. Similar results were obtained for the genomes of Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Rattus norvegicus and Xenopus tropicalis. The algorithm was also tested on 17 bacterial genomes, and the results were compared with previously obtained data on potential reading frameshifts in these genomes. The study discusses the possibility that the reading frameshift is a relatively frequent mutation that could participate in the creation of new genes and proteins.
- Publication (open access): Developing mathematical method for multi alignment of DNA sequences with weak similarity (2019)
  Korotkov, E. V.; Korotkova, M. A.
  © 2019 Published under licence by IOP Publishing Ltd.
  A new mathematical method is proposed for constructing multiple alignments of weakly similar amino acid or nucleotide sequences. The method represents a multiple alignment as a position-weight matrix (PWM) and uses global dynamic programming. An optimization procedure for the PWM is developed with the aim of finding the multiple alignment with maximum statistical significance. The method can find multiple alignments of sequences in which the number of nucleotide or amino acid substitutions per position exceeds 2.5. The developed approach is applied to obtain a multiple alignment of promoter sequences 600 DNA bases long.
- Publication (open access): Search for highly divergent tandem repeats in amino acid sequences (2021)
  Rudenko, V.; Korotkov, E.
  © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
  We report a Method to Search for Highly Divergent Tandem Repeats (MSHDTR) in protein sequences, which considers pairwise correlations between adjacent residues. MSHDTR was compared with previously developed methods for finding tandem repeats (TRs) in amino acid sequences, such as T-REKS and XSTREAM. Those methods focus on TRs with significant sequence similarity, whereas MSHDTR detects repeats that have diverged significantly during evolution by accumulating deletions, insertions, and substitutions. Applying MSHDTR to the Swiss-Prot databank revealed over 15 thousand TR-containing amino acid sequences that were difficult to find using the other methods. Among the detected TRs, the most represented were those with consensus lengths of two and seven residues; these TRs were subjected to cluster analysis, and classes of patterns were identified. All TRs detected in this study have been combined into a databank accessible over the WWW.
- Publication (open access): Detection of highly divergent tandem repeats in the rice genome (2021)
  Kamionskaya, A. M.; Korotkov, E. V.; Korotkova, M. A.
  © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
  Currently, there is a lack of bioinformatics approaches for identifying highly divergent tandem repeats (TRs) in eukaryotic genomes. Here, we developed a new mathematical method to search for TRs. It uses a novel algorithm for constructing multiple alignments based on the generation of random position weight matrices (RPWMs), and we applied it to detect TRs 2 to 50 nucleotides long in the rice genome. The RPWM method could find highly divergent TRs in the presence of insertions or deletions. A comparison with other TR identification methods showed that RPWM could detect TRs in which the average number of base substitutions per nucleotide (x) was between 1.5 and 3.2, whereas the T-REKS and TRF methods could not detect divergent TRs with x > 1.5. Applied to the rice genome, the RPWM method revealed that TRs occupy 5% of the genome and that most of them are 2 or 3 bases long. Using RPWM, we also found a correlation of TRs with dispersed repeats and transposons, suggesting that some transposons originated from TRs. Thus, the RPWM algorithm is an effective tool for finding highly divergent TRs in genomes.
- Publication (open access): A mathematical method for the classification of promoter sequences from the A.thaliana genome (2020)
  Kamionskaya, A. M.; Korotkova, M. A.; Korotkov, E. V.
  © Published under licence by IOP Publishing Ltd.
  A mathematical method for creating classes of promoter sequences has been developed. The method was used to calculate the classes of promoter sequences from the A. thaliana genome. A total of 16 statistically significant classes of promoter sequences were obtained, with class sizes ranging from 8,000 to 100 promoters. The classes obtained allow us to identify potential promoter sequences in various genomes with the number of false positives not exceeding 103 per genome.
- Publication (metadata only): Search for Tandem Repeats in the First Chromosome from the Rice Genome (2020)
  Kamionskaya, A. M.; Korotkov, E. V.; Korotkova, M. A.
  © 2020, Springer Nature Switzerland AG.
  Using the RPWM method, we searched for tandem repeats 2 to 50 nucleotides long in the rice genome. We compared the effectiveness of the RPWM method with that of Mreps, T-REKS, Tandem Repeats Finder and ATR Hunter. About 70% of the tandem repeats found by RPWM could not be detected by the other algorithms. The correlation of dispersed repeats and transposons with tandem repeats was also studied. We hypothesize that some of the dispersed repeats and transposons originated from tandem repeats.
- Publication (metadata only): Multiple alignment of promoter sequences from the human genome (2020)
  Kamionskaya, A. M.; Korotkov, E. V.; Korotkova, M. A.
  © 2020.
  A new algorithm for multiple alignment of nucleotide sequences, MAHDS, has been developed. Using this algorithm, a statistically significant multiple alignment of promoter sequences from the human genome was created for the first time. Based on the constructed alignments, 25 classes of promoter sequences were created, each containing more than 100 sequences. The promoter classes can be used to search for promoter sequences in eukaryotic genomes.
- Publication (metadata only): Search for highly divergent SINE repeats in the rice genome (2020)
  Suvorova, Y. M.; Kamionskaya, A. M.; Korotkov, E. V.
  © 2020.
  In this article, we present a new method for finding highly divergent copies of SINE elements in a genome. The method uses the correlation of neighboring symbols both when constructing a positional weight matrix for a sequence of interest and in the genome scanning procedure. This makes it possible to increase the alphabet size and, accordingly, the resolving power of the method. Using it, we found new copies of SINE repeats in the rice genome that had not been annotated before. The method was tested and compared with the RepeatMasker program, and false positives were evaluated.
- Publication (metadata only): New Method for Potential Fusions Detection in Protein-Coding Sequences (2019)
  Suvorova, Yulia M.; Korotkov, Eugene V.
  Gene fusion is known to be one of the mechanisms of new gene formation. Most bioinformatics methods for studying fused genes are based on sequence similarity search. However, if the ancestral sequences were lost during evolution or changed too much, the fusion cannot be detected this way. Previously, we developed a method for finding triplet periodicity (TP) change points in protein-coding sequences (CDS) and showed a possible relation of this phenomenon to gene formation by fusion. In this study, we improved the TP change point detection method and studied the genes of six eukaryotic genomes. At a type I error probability of 2%-3%, TP change points were found in 20%-40% of genes. Further analysis showed that about 30% of the TP change points can be explained by amino acid repeats. Another 30% may be fused genes for which an alignment was detected by the BLAST program. We believe that the rest may be fused genes whose ancestral sequences have been lost. The method is more sensitive to TP changes and found two to three times more significant TP change points than our previous method.
- Publication (metadata only): A Database of Potential Reading Frame Shifts in Coding Sequences from Different Eukaryotic Genomes (2019)
  Suvorova, Y. M.; Pugacheva, V. M.; Korotkov, E. V.
  © 2019, Pleiades Publishing, Inc.
  A new data bank containing potential reading frame shifts was developed. A new mathematical method based on a genetic algorithm and dynamic programming was used to search for potential reading frame shifts. The data bank includes the coordinates of potential reading frame shifts for coding sequences of 76 eukaryotic genomes from the Ensembl genome browser version 86. The database is located at: http://victoria.biengi.ac.ru/cgi-bin/frameshift/index.cgi. Across all the analyzed genomes, approximately 23% of the coding sequences have a reading frame shift. Type I and type II errors are at levels of approximately 11% and 30%, respectively. A Web server for finding potential reading frame shifts, located at http://victoria.biengi.ac.ru/fsfinder, was developed simultaneously with the data bank. The server can be used to search for potential reading frame shifts in newly defined coding sequences.
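
Several of the abstracts above rely on the triplet periodicity of coding sequences. As a rough illustration of that idea only, the sketch below counts bases by codon position in a 3 x 4 matrix and scores the periodicity with mutual information. It is not the authors' algorithm, which combines dynamic programming with a genetic algorithm; the function names `triplet_matrix` and `mutual_information` and the toy sequences are assumptions made for this example.

```python
import math

BASES = "ACGT"


def triplet_matrix(seq):
    """Count bases by codon position: rows are positions 0-2, columns are A, C, G, T."""
    counts = [[0] * 4 for _ in range(3)]
    for i, base in enumerate(seq.upper()):
        if base in BASES:  # skip ambiguous symbols such as N
            counts[i % 3][BASES.index(base)] += 1
    return counts


def mutual_information(counts):
    """Mutual information (bits) between codon position and base identity."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(counts[r][c] for r in range(3)) for c in range(4)]
    mi = 0.0
    for r in range(3):
        for c in range(4):
            n = counts[r][c]
            if n:
                mi += (n / total) * math.log2(n * total / (row_sums[r] * col_sums[c]))
    return mi


# Toy data (hypothetical): a perfectly periodic sequence and the same
# sequence with one base deleted at index 30.
periodic = "ATG" * 20
shifted = periodic[:30] + periodic[31:]

upstream = triplet_matrix(shifted[:30])    # before the deletion
downstream = triplet_matrix(shifted[30:])  # after the deletion
```

In this toy case the upstream matrix has all of its A counts in row 0, while the downstream matrix has T in row 0: the deletion rotates the rows of the matrix. A change of this kind in the phase of the triplet periodicity is the signal that frameshift and change-point searches look for, although real coding sequences are far noisier and need a statistical significance estimate.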