“…Contigs annotated with known micro-organisms, and probable parasites such as viruses and fungi, were first filtered out from the data as exogenous material, as described in Johansson et al (2013), and this exogenous material has now been made available in NCBI database (Bioproject: PRJNA412141). The remaining contigs were then aligned (BLASTx, e-value threshold 10⁻³) to (a) the predicted gene sets of published ant genomes available on the Fourmidable database, including the closest species such as Camponotus floridanus, Lasius niger, as well as more distantly related species such as Cerapachys biroi, Linepithema humile, Solenopsis invicta, Vollenhovia emeryi, Wasmannia auropunctata, Pogonomyrmex barbatus, Monomorium pharaonis, Harpegnathos saltator, Acromyrmex echinatior and Atta cephalotes (Wurm et al, 2011; Wurm et al, 2009; Bonasio et al, 2010; Suen et al, 2011; Smith et al, 2011a; Smith et al, 2011b; Nygaard et al, 2011; Gadau et al, 2012; Oxley et al, 2014; Schrader et al, 2014; Konorov et al, 2017), (b) the honey bee genome (Weinstock et al, 2006), (c) the predicted gene sets of Nasonia vitripennis, Tribolium castaneum (Wurm et al, 2009), and Drosophila melanogaster (Gramates et al, 2017), (d) as well as non-redundant (NR) protein datasets available in the NCBI database (Updated 2015, Updated 2017) for insect species. A minimum requirement of 70% amino acid identity, and at least 100 bases (33 amino acids) alignment length was used.…”
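
The hit-filtering criteria quoted above (e-value threshold of 10⁻³, at least 70% amino acid identity, and an alignment of at least 33 amino acids) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the BLASTx results were written in the standard 12-column tabular format (`-outfmt 6`), which the excerpt does not state, and it treats all three thresholds as inclusive. The names `Hit`, `parse_hits`, `passes_thresholds` and `filter_hits`, and the contig and protein identifiers in the example, are hypothetical.

```python
import csv
from typing import Iterable, Iterator, List, NamedTuple

# Thresholds taken from the quoted methods text.
MAX_EVALUE = 1e-3       # BLASTx e-value threshold
MIN_IDENTITY = 70.0     # minimum percent amino acid identity
MIN_ALN_LEN_AA = 33     # minimum alignment length (about 100 bases)


class Hit(NamedTuple):
    """Subset of the fields in one tabular BLAST hit (hypothetical helper type)."""
    query: str
    subject: str
    pident: float
    aln_len: int
    evalue: float


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse tab-separated hits, ASSUMING the default 12-column layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        yield Hit(row[0], row[1], float(row[2]), int(row[3]), float(row[10]))


def passes_thresholds(hit: Hit) -> bool:
    """Return True if the hit meets all three criteria (inclusive bounds assumed)."""
    return (
        hit.evalue <= MAX_EVALUE
        and hit.pident >= MIN_IDENTITY
        and hit.aln_len >= MIN_ALN_LEN_AA
    )


def filter_hits(lines: Iterable[str]) -> List[Hit]:
    """Keep only the hits that satisfy the thresholds."""
    return [hit for hit in parse_hits(lines) if passes_thresholds(hit)]


# Made-up example rows; identifiers and values are illustrative only.
EXAMPLE = "\n".join([
    "contig_1\tprot_A\t85.0\t120\t18\t0\t1\t360\t1\t120\t1e-50\t200",  # passes
    "contig_2\tprot_B\t65.0\t150\t52\t0\t1\t450\t1\t150\t1e-30\t150",  # identity too low
    "contig_3\tprot_C\t90.0\t20\t2\t0\t1\t60\t1\t20\t5e-4\t40",        # alignment too short
    "contig_4\tprot_D\t75.0\t33\t8\t0\t1\t99\t1\t33\t0.5\t30",         # e-value too high
])

if __name__ == "__main__":
    for hit in filter_hits(EXAMPLE.splitlines()):
        print(hit.query, hit.subject, hit.pident, hit.aln_len, hit.evalue)
```

In BLASTx tabular output the alignment length is reported in amino acids, so the sketch compares it against 33 rather than 100. In practice the e-value cut-off would normally be applied at search time and the identity and length filters afterwards.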