
Studies in Informatics and Control
Vol. 20, No. 2, 2011

Numerical Representations Involved in DNA Repeats Detection Using Spectral Analysis

Petre G. Pop, Alin Voina

Abstract

Sequence repeats are the simplest form of regularity, and their detection is important in biology and medicine because it can be used for phylogenetic studies and disease diagnosis. A major difficulty in identifying repeats is that the repeat units can be of unknown length, exact or imperfect, and in tandem or dispersed. Many of the methods for detecting repeated sequences belong to the field of digital signal processing (DSP). These methods rely on a transformation whose main goal is to map the symbolic domain into the numeric domain without adding structure to the symbolic sequence beyond that inherent in it. The choice of numerical representation for genomic signals is therefore very important. This paper presents the results obtained by using different numerical representations (including two novel ones) and spectral analysis to isolate the position and length of DNA repeats in short sequences containing microsatellites and in long sequences with alpha DNA repeats.
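As a rough illustration of the symbolic-to-numeric mapping followed by spectral analysis, the sketch below uses binary indicator sequences (one per base), a commonly used representation that is assumed here for illustration and is not necessarily one of those evaluated in the paper. The function names `indicator_sequences`, `power_spectrum`, and `dominant_period` are hypothetical, and a naive DFT stands in for an FFT to keep the example dependency-free.

```python
import cmath


def indicator_sequences(dna):
    """Map a DNA string to four binary indicator sequences, one per base.

    Assumption: this binary-indicator mapping is only one possible
    numerical representation, chosen here for illustration.
    """
    dna = dna.upper()
    return {base: [1.0 if s == base else 0.0 for s in dna] for base in "ACGT"}


def power_spectrum(dna):
    """Sum the squared DFT magnitudes of the four indicator sequences."""
    n = len(dna)
    spectrum = [0.0] * n
    for x in indicator_sequences(dna).values():
        for k in range(n):
            # Naive O(n^2) DFT coefficient at frequency index k.
            coeff = sum(
                x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
            )
            spectrum[k] += abs(coeff) ** 2
    return spectrum


def dominant_period(dna):
    """Estimate the repeat period from the strongest non-DC spectral peak."""
    n = len(dna)
    spectrum = power_spectrum(dna)
    # Skip k = 0 (the DC term) and search only up to n/2, because the
    # spectrum of a real-valued signal is symmetric about n/2.
    k_peak = max(range(1, n // 2 + 1), key=lambda k: spectrum[k])
    return n / k_peak


if __name__ == "__main__":
    # A synthetic exact tandem repeat of the trinucleotide "CAG".
    print(dominant_period("CAG" * 10))
```

This handles only an exact tandem repeat that spans the whole sequence. Locating the position of a repeat inside a longer sequence would need a windowed variant (a spectrogram), and imperfect repeats or harmonics of the fundamental period would need additional handling.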

Keywords

genomic signal processing, sequence repeats, DNA representations, Fourier analysis, spectrograms.
