Methods of transcript assembly and reduction filters are compared for recovery of reference gene sets of human, pig and plant, including longest coding sequence with EvidentialGene, longest transcript with CD-HIT, and most RNA-seq with TransRate. EvidentialGene methods are the most accurate in recovering reference genes, and maintain accuracy for alternate transcripts and paralogs. In comparison, filtering large over-assemblies by longest-RNA measures, or by most-RNA-seq expression measures, discards a large portion of accurate models, especially alternates and paralogs. Accuracy of protein calculations is compared, with errors found in popular methods, as is accuracy of transcript assemblers. Gene reconstruction accuracy depends upon the underlying measurements, where protein criteria, including homology among species, have the strength of evolutionary biology that other criteria lack. EvidentialGene provides a gene reconstruction algorithm that is consistent with genome biology.

Accurate gene set reconstruction 2019 October p. 1

Some results of this comparison are obvious: the longest transcript filter has longer transcripts, the longest CDS filter recovers longer proteins, and the most RNA-seq filter yields greater expression measures, compared to the others. The underlying question is which approach returns the most accurate gene information, consistent with efficient reduction to levels at which external evidence can be applied. Where results of these reduction filters differ, the one with greater biological information and phylogenetic validation is presumed to be of more interest and utility to biologists.

Proteins are evolutionarily conserved, functionally understandable biological information. The biological meaning of coding genes is in their coding sequence, so that discrepancies in CDS versus RNA quality measures favor the CDS measure.
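As an illustration of how the three reduction criteria can disagree, the following is a minimal sketch on invented toy data. It is not the algorithm of EvidentialGene, CD-HIT or TransRate: the `Transcript` record, the `pick_per_locus` helper, the locus labels and all values are hypothetical, and keeping a single model per locus is a simplification that ignores the retention of alternates and paralogs discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    tid: str           # transcript identifier (hypothetical)
    locus: str         # locus / cluster label (hypothetical grouping)
    rna_len: int       # full transcript length, in bases
    cds_len: int       # coding sequence length, in bases
    expression: float  # assumed expression measure, e.g. TPM


def pick_per_locus(transcripts, key):
    """Keep, for each locus, the transcript with the largest key value.

    Ties keep the first transcript seen. Returns {locus: transcript id}.
    """
    best = {}
    for t in transcripts:
        current = best.get(t.locus)
        if current is None or key(t) > key(current):
            best[t.locus] = t
    return {locus: t.tid for locus, t in best.items()}


# Invented over-assembly: three candidate models at geneA, one at geneB.
toy = [
    Transcript("t1", "geneA", rna_len=3000, cds_len=900, expression=5.0),
    Transcript("t2", "geneA", rna_len=2200, cds_len=1500, expression=2.0),
    Transcript("t3", "geneA", rna_len=1800, cds_len=1200, expression=40.0),
    Transcript("t4", "geneB", rna_len=1000, cds_len=600, expression=1.0),
]

longest_rna = pick_per_locus(toy, key=lambda t: t.rna_len)
longest_cds = pick_per_locus(toy, key=lambda t: t.cds_len)
most_expr = pick_per_locus(toy, key=lambda t: t.expression)

print("longest RNA:", longest_rna)
print("longest CDS:", longest_cds)
print("most RNA-seq:", most_expr)
```

In this toy case each criterion selects a different model at geneA, which is the situation where the choice of underlying measurement matters.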
RNA-seq expression measures have technical imprecisions, with less direct biological meaning when these qualities deviate from coding sequence quality. The correspondence of protein-related quality measures, including protein size and homology, to biological protein recovery, via proteomics experiments, is known to be well above the correspondence of expression quality measures (Tress et al. 2017).

This report details use of these three filters to select accurate and complete gene sets from supersets of gene models that contain many accurate genes, plus redundant and less accurate models. Accurate coding sequence translation is also discussed, as is the value of several self-referential quality measures for accurate gene set reconstruction. Not considered here are chromosomal evidence, details of homology and external evidence, or methods of non-coding gene validation. Those are important for accurate gene set reconstruction, and can be applied to the limited-palette results of self-referential draft gene sets. Self-referential gene set reconstruction, when done properly, is an efficient, data-intensive, first step i...