Effective identification of species using short DNA fragments (DNA barcoding and DNA metabarcoding) requires reliable sequence reference libraries of known taxa. Both taxonomically comprehensive coverage and content quality are important for sufficient accuracy. For aquatic ecosystems in Europe, reliable barcode reference libraries are particularly important if molecular identification tools are to be implemented in biomonitoring and reports in the context of the EU Water Framework Directive (WFD) and the Marine Strategy Framework Directive (MSFD). We analysed gaps in the two most important reference databases, Barcode of Life Data Systems (BOLD) and NCBI GenBank, with a focus on the taxa most frequently used in WFD and MSFD. Our analyses show that coverage varies strongly among taxonomic groups, and among geographic regions. In general, groups that were actively targeted in barcode projects (e.g. fish, true bugs, caddisflies and vascular plants) are well represented in the barcode libraries, while others have fewer records (e.g. marine molluscs, ascidians, and freshwater diatoms). We also found that species monitored in several countries often are represented by barcodes in reference libraries, while species monitored in a single country frequently lack sequence records. A large proportion of species (up to 50%) in several taxonomic groups are only represented by private data in BOLD. Our results have implications for the future strategy to fill existing gaps in barcode libraries, especially if DNA metabarcoding is to be used in the monitoring of European aquatic biota under the WFD and MSFD. For example, missing species relevant to monitoring in multiple countries should be prioritized. We also discuss why a strategy for quality control and quality assurance of barcode reference libraries is needed and recommend future steps to ensure full utilization of metabarcoding in aquatic biomonitoring.