The biology of bacterial cells is, in general, based on information encoded on circular chromosomes. Regulation of chromosome replication is an essential process that mostly takes place at the origin of replication (oriC), a locus that is unique per chromosome. Identifying large numbers of oriC sequences is a prerequisite for systematic studies that could lead to insights into oriC functioning as well as to the identification of novel drug targets for antibiotic development. Current methods for identifying oriC sequences rely on chromosome-wide nucleotide disparities and are therefore limited to fully sequenced genomes, leaving a large number of genomic fragments unstudied. Here, we present gammaBOriS (Gammaproteobacterial oriC Searcher), which identifies oriC sequences on gammaproteobacterial chromosomal fragments by employing motif-based machine learning methods. Using gammaBOriS, we created BOriS DB, which currently contains 25,827 gammaproteobacterial oriC sequences from 1,217 species, making it the largest available database of oriC sequences to date. Furthermore, we present gammaBOriTax, a machine learning-based approach for taxonomic classification of oriC sequences, which was trained on the sequences in BOriS DB. Finally, we extracted the motifs relevant for the identification and classification decisions of the models. Our results suggest that machine learning sequence classification approaches can offer great support in functional motif identification.
Before every cell division, bacteria need to duplicate their genetic material to ensure that this information can be faithfully passed on to both daughter cells. This essential process, called DNA replication, initiates in a highly regulated manner at chromosomal sites called oriC and is coordinated with many other cellular processes 1,2. With notable exceptions such as Vibrionales, bacteria usually contain one or multiple copies of a single chromosome, which carries a single oriC sequence 3,4.
Since many different proteins need to bind to and act upon oriC for initiation to occur, oriC contains many protein binding sites and DNA motifs 5,6. While there is a high level of variation between oriC sequences of different organisms, there are also nearly universally occurring DNA motifs in oriC sequences 7-9. Central among these are short 9 bp DNA motifs called DnaA boxes, which act as binding sites for the initiator protein DnaA and exhibit differing protein binding characteristics depending on their exact sequence. Starting from these motifs, DnaA polymerizes and spreads across multiple DnaA boxes and DnaA trio motifs 10, which then, in interplay with the protein IHF 11, leads to double helix unwinding at a closely positioned AT-rich region called the DNA unwinding element (DUE), so that the replication machinery can be loaded onto the DNA 12,13. As oriC contains binding sites for proteins that relay information on the status of the cell, it can be considered a biological information compiler and processor 14,15. Taken together, these properties make oriC sequ...