“…The statistical and machine-learning-based methods are probably the most frequently used approaches to protein domain prediction, with examples including DGS (Wheelan et al., 2000), DomCut (Suyama and Ohara, 2003), Armadillo (Dumontier et al., 2005), PPRODO (Sim et al., 2005), DOMPro (Cheng et al., 2006), DomNet (Yoo et al., 2008), DROP (Ebina et al., 2011), DOBO (Eickholt et al., 2011), PRODOM (Servant et al., 2002), ADDA (Heger et al., 2005) and EVEREST (Portugaly et al., 2006). In the DGS, DomCut and Armadillo programs, statistical regularities observed in Protein Data Bank (PDB) structures, including the domain size distribution and residue propensities, are used to predict domain linkers and boundaries.…”
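
To make the idea of a residue-propensity predictor concrete, the following is a minimal sketch and not the implementation of DGS, DomCut or Armadillo. It assumes a per-residue index defined as the negative log-ratio of a residue's frequency in linkers to its frequency in domains, averaged over a sliding window, with troughs below a threshold reported as candidate linkers. The function names, the window size, the threshold and the four-letter frequency tables are all hypothetical and chosen only for illustration; real propensities would be estimated from annotated PDB structures.

```python
import math

# Hypothetical toy frequencies (NOT derived from PDB data): P and S are
# assumed to be enriched in linkers, A and L in domains.
TOY_LINKER_FREQ = {"P": 0.4, "S": 0.3, "A": 0.2, "L": 0.1}
TOY_DOMAIN_FREQ = {"P": 0.1, "S": 0.2, "A": 0.3, "L": 0.4}


def linker_index(linker_freq, domain_freq):
    """Per-residue index: -ln(f_linker / f_domain).

    Residues enriched in linkers get negative values.
    """
    return {aa: -math.log(linker_freq[aa] / domain_freq[aa])
            for aa in linker_freq}


def smooth_profile(sequence, index, window=3):
    """Average the index over a centred window (truncated at the ends).

    Residues missing from the index contribute 0.0.
    """
    half = window // 2
    profile = []
    for i in range(len(sequence)):
        lo = max(0, i - half)
        hi = min(len(sequence), i + half + 1)
        scores = [index.get(aa, 0.0) for aa in sequence[lo:hi]]
        profile.append(sum(scores) / len(scores))
    return profile


def predict_linkers(profile, threshold=0.0):
    """Return half-open (start, end) ranges where the profile < threshold."""
    regions = []
    start = None
    for i, score in enumerate(profile):
        if score < threshold:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(profile)))
    return regions


index = linker_index(TOY_LINKER_FREQ, TOY_DOMAIN_FREQ)
sequence = "LLLLLLPSPSPLLLLLL"  # toy sequence with a P/S-rich stretch
profile = smooth_profile(sequence, index, window=3)
regions = predict_linkers(profile, threshold=0.0)
print(regions)  # prints [(6, 11)]
```

On this toy input the P/S-rich stretch at positions 6 to 10 (0-based) is the only region whose smoothed score falls below zero. A domain size distribution, which the sketch omits, could be used in the same spirit to down-weight candidate boundaries that would produce implausibly short or long domains.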