Motivation: The amount of sequencing data from high-throughput sequencing technologies grows at a pace exceeding the one predicted by Moore's law. One of the basic requirements is to efficiently store and transmit such huge collections of data. Despite significant interest in designing FASTQ compressors, they are still imperfect in terms of compression ratio or decompression resources.

Results: We present Pseudogenome-based Read Compressor (PgRC), an in-memory algorithm for compressing the DNA stream, based on the idea of building an approximation of the shortest common superstring over high-quality reads. Experiments show that PgRC wins in compression ratio over its main competitors, SPRING and Minicom, by up to 18 and 21 percent on average, respectively, while being at least comparably fast in decompression.

Availability: PgRC can be downloaded from https://github.com/kowallus/PgRC.

Contact: tomasz.kowalski@p.lodz.pl

ORCOM (Grabowski et al., 2015) scans the input reads and distributes them into buckets. Its key concept, however, is to use so-called minimizers (Roberts et al., 2004) for the bucket labels. A minimizer of length p for a read R of length m is the lexicographically smallest of the (m − p + 1) p-mers of R. A canonical minimizer, which is actually used by ORCOM, is a minimizer taken over the read and its reverse-complemented form (a sketch of this computation is given at the end of this section). Two reads with a large overlap are likely to share the same (canonical or non-canonical) minimizer and thus land in the same bucket. The contents of each bucket are compressed separately, with sorting of the reads by their minimizer's position, careful modeling of mismatches and other minor improvements, combined with arithmetic coding or PPMd (context-based) compression applied to several resulting data streams. The compression ratio achieved by ORCOM on a 134 Gbp human genome sequencing dataset was 0.317 bits per base, improving on BEETL's result of 0.518 bits per base.

Mince (Patro and Kingsford, 2015) is a related algorithm, but its distribution of reads into buckets is based on the number of shared k-mers. More precisely, a read R is assigned to the bucket which maximizes the number of k-mers of R occurring in any read the bucket contains. Its compression ratio is in most cases a few percent higher than ORCOM's (see, e.g., the extensive comparisons in (Liu et al., 2018)), but it is less efficient in terms of time and memory usage.

FaStore (Roguski et al., 2018) also follows the ORCOM approach, but improves its compression ratio (typically by a factor of about 1.2), mostly thanks to re-distribution of reads from the buckets and assembling reads into contigs; in other words, it allows merging of similar clusters of reads. FaStore also boasts good performance (decompression speed exceeding 100 MB/s, and even 250 MB/s in one of the modes, using 8 threads) and several lossy modes for the quality and header streams.

HARC (Chandak et al., 2018a) resigns from disk-based bucketing in favor of succinct in-memory hash tables. Its basic idea is to find maximum overlaps between reads and create consensus sequences, using majority voting ...
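To make the minimizer-based bucket labeling concrete, the following minimal C++ sketch computes a canonical minimizer of a read: the lexicographically smallest p-mer taken over the read and its reverse complement. It only illustrates the definition given above; the parameter value, the naive quadratic scan and the handling of non-ACGT symbols are choices made here for brevity, not ORCOM's actual implementation.

    #include <algorithm>
    #include <iostream>
    #include <string>

    // Reverse complement of a DNA string (A<->T, C<->G).
    std::string reverseComplement(const std::string& seq) {
        std::string rc(seq.rbegin(), seq.rend());
        for (char& c : rc) {
            switch (c) {
                case 'A': c = 'T'; break;
                case 'T': c = 'A'; break;
                case 'C': c = 'G'; break;
                case 'G': c = 'C'; break;
                default: break;  // leave N and other symbols untouched
            }
        }
        return rc;
    }

    // Lexicographically smallest p-mer of a read of length m
    // (there are m - p + 1 candidates).
    std::string minimizer(const std::string& read, std::size_t p) {
        std::string best = read.substr(0, p);
        for (std::size_t i = 1; i + p <= read.size(); ++i)
            best = std::min(best, read.substr(i, p));
        return best;
    }

    // Canonical minimizer: minimizer over the read and its
    // reverse-complemented form, used as an ORCOM-style bucket label.
    std::string canonicalMinimizer(const std::string& read, std::size_t p) {
        return std::min(minimizer(read, p),
                        minimizer(reverseComplement(read), p));
    }

    int main() {
        std::string read = "ACGTTGCATGACG";
        std::cout << canonicalMinimizer(read, 4) << '\n';  // bucket label
    }

Reads sharing a large overlap tend to produce the same label, so hashing or sorting on this string groups them into the same bucket.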
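The majority-voting step mentioned for HARC can be illustrated in a similarly small sketch. It assumes the reads have already been placed at offsets on a common layout (in HARC this placement results from its overlap search); the data structures and the example offsets below are purely illustrative and are not HARC's code.

    #include <algorithm>
    #include <array>
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    // Build a consensus sequence by majority voting over reads that have
    // already been placed at offsets on a common layout.
    std::string majorityConsensus(
            const std::vector<std::pair<std::size_t, std::string>>& placed) {
        std::size_t length = 0;
        for (const auto& [offset, read] : placed)
            length = std::max(length, offset + read.size());

        const std::string alphabet = "ACGTN";
        std::vector<std::array<int, 5>> counts(length, std::array<int, 5>{});
        for (const auto& [offset, read] : placed)
            for (std::size_t i = 0; i < read.size(); ++i) {
                auto symbol = alphabet.find(read[i]);
                if (symbol != std::string::npos)
                    ++counts[offset + i][symbol];
            }

        std::string consensus(length, 'N');
        for (std::size_t pos = 0; pos < length; ++pos) {
            int best = 0;  // pick the most frequent symbol at this column
            for (int s = 1; s < 5; ++s)
                if (counts[pos][s] > counts[pos][best]) best = s;
            consensus[pos] = alphabet[best];
        }
        return consensus;
    }

    int main() {
        // Three overlapping reads; one base of the second read disagrees
        // with the others and is outvoted.
        std::vector<std::pair<std::size_t, std::string>> reads = {
            {0, "ACGTAC"}, {2, "GTGCGT"}, {4, "ACGTTT"}};
        std::cout << majorityConsensus(reads) << '\n';  // ACGTACGTTT
    }

Once such a consensus is formed, each read can be encoded compactly as its offset in the consensus plus its few mismatching positions.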