Because the rice genome has been fully sequenced, genome-wide searches for specific sequence features are of high importance. Palindromic sequences are important DNA motifs involved in the regulation of various cellular processes and are a potential source of genetic instability. To search for and study long palindromic regions in the rice genome, the R statistical programming language was used. All palindromes, defined as identical inverted repeats with spacer DNA, could be analyzed and sorted according to their frequency, size, GC content, compact index, etc. The results showed that the overall palindrome frequency in the rice genome was high (nearly 51,000 palindromes), with the highest and lowest numbers of palindromes found on chromosomes 1 and 12, respectively. Palindrome number explained rice chromosome expansion well (R² > 92%). The average GC content of the palindromic sequences was 42.1%, indicating AT-richness and hence the low complexity of these sequences. The results also showed that the compact index of palindromes differed among chromosomes (highest in chromosome 8, at 43.2 per cM, and lowest in chromosome 3, at 34.5 per cM). Palindrome identification could be applied to the development of a molecular marker system that facilitates genetic studies such as the evaluation of genetic variation and gene mapping, and could also serve as a useful tool in population structure analysis and genome evolution studies. Based on these results, it can be concluded that the rice genome is rich in long palindromic sequences that triggered most variation during evolution.
Availability: The R scripts used to construct the palindrome sequence library are available upon request.
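
Illustration: The definition used above (identical inverted repeats separated by spacer DNA) and the GC content measure can be sketched in a few lines. The sketch below is written in Python, not R, and is not the authors' script. The function names, the fixed arm length `min_arm`, and the spacer limit `max_spacer` are assumptions chosen for illustration, and no attempt is made to extend a hit to its maximal arm length or to handle ambiguous bases.

```python
from typing import List, NamedTuple

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Palindrome(NamedTuple):
    start: int      # 0-based position of the left arm
    arm: int        # length of each arm
    spacer: int     # number of bases between the two arms
    sequence: str   # left arm + spacer + right arm


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def find_palindromes(seq: str, min_arm: int = 4,
                     max_spacer: int = 3) -> List[Palindrome]:
    """Find inverted repeats with arms of exactly min_arm bases.

    Hypothetical helper: a left arm is paired with any downstream
    window equal to its reverse complement, allowing a spacer of
    0..max_spacer bases between the two arms.
    """
    seq = seq.upper()
    n = len(seq)
    hits: List[Palindrome] = []
    for start in range(n - 2 * min_arm + 1):
        target = reverse_complement(seq[start:start + min_arm])
        for spacer in range(max_spacer + 1):
            right_start = start + min_arm + spacer
            right_end = right_start + min_arm
            if right_end > n:
                break  # longer spacers would run past the sequence end
            if seq[right_start:right_end] == target:
                hits.append(Palindrome(start, min_arm, spacer,
                                       seq[start:right_end]))
    return hits


if __name__ == "__main__":
    for hit in find_palindromes("GAACCTTC", min_arm=3, max_spacer=2):
        print(hit, round(gc_content(hit.sequence), 1))
```

Hits collected this way could then be tallied per chromosome and sorted by size or GC content, in the spirit of the analysis summarized above; a genome-scale search would need a more efficient strategy than this quadratic scan.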