Conspectus
In biological systems, the storage and transfer of genetic information
rely on sequence-controlled nucleic acids such as DNA and RNA. It
has been realized for quite some time that this property is not only
crucial for life but could also be very useful in human applications.
For instance, DNA has been actively investigated as a digital storage
medium over the past decade. Indeed, the “hard-disk of life”
is an obvious choice and a highly optimized material for storing data.
After decades of nucleic acid research, technological tools for
parallel synthesis and sequencing of DNA have become readily available.
Consequently, it has already been demonstrated that different types
of documents (e.g., texts, images, videos, and industrial data) can
be stored in chemically synthesized DNA libraries. However, DNA is
subject to biological constraints, and its molecular structure cannot
be easily varied to match technological needs. In fact, DNA is not
the only macromolecule that enables data storage. In recent years,
it has been demonstrated that a wide variety of synthetic polymers
can also be used for such a purpose. Indeed, modern polymer synthesis
allows the preparation of synthetic macromolecules with precisely
controlled monomer sequences. Altogether, about a dozen synthetic
digital polymers have already been described, and many more can be
foreseen. Of these, sequence-defined poly(phosphodiester)s
are among the most promising options. These polymers are prepared
by stepwise phosphoramidite chemistry, as are chemically synthesized
oligonucleotides. However, they are constructed with non-natural building
blocks and therefore share almost no structural characteristics with
nucleic acids, except phosphate repeat units. Still, they contain
readable digital messages that can be deciphered by nanopore sequencing
or mass spectrometry sequencing. In this Account, we describe our
recent research efforts in synthesizing and sequencing optimal abiological
digital poly(phosphodiester)s. A major advantage of these polymers
over DNA is that their molecular structure can easily be varied to
tune their properties. During the last 5 years, we have engineered
the molecular structure of these polymers to adjust crucial parameters
such as the storage density, storage capacity, erasability, and readability.
Consequently, high-capacity poly(phosphodiester) chains, containing hundreds
of bits per chain, can now be synthesized and efficiently sequenced using
a routine mass spectrometer. Furthermore, sequencing data can be automatically
decoded with the help of dedicated software. This new type of coded
matter can also be edited using practical physical triggers such as
light and organized in space by programmed self-assembly. All of these
recent improvements are summarized and discussed herein.
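
To make the idea of a digital polymer concrete, the following minimal Python sketch shows how a short text could be written as a binary monomer sequence and read back. It is an illustration only, not the decoding software mentioned above: the two monomer labels "A" and "B", the 8-bit ASCII encoding, and the function names `encode` and `decode` are all assumptions introduced here.

```python
# Hypothetical two-monomer alphabet: one monomer stands for bit 0, the other for bit 1.
# "A" and "B" are placeholder labels, not the names of real building blocks.
MONOMER_FOR_BIT = {"0": "A", "1": "B"}
BIT_FOR_MONOMER = {monomer: bit for bit, monomer in MONOMER_FOR_BIT.items()}


def encode(text: str) -> str:
    """Convert ASCII text into a monomer sequence, 8 monomers per character."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("ascii"))
    return "".join(MONOMER_FOR_BIT[bit] for bit in bits)


def decode(chain: str) -> str:
    """Convert a monomer sequence (e.g. as read by sequencing) back into text."""
    bits = "".join(BIT_FOR_MONOMER[monomer] for monomer in chain)
    if len(bits) % 8 != 0:
        raise ValueError("chain length must be a multiple of 8 monomers")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


chain = encode("Hi")
print(chain)          # ABAABAAAABBABAAB
print(decode(chain))  # Hi
```

Under these assumptions, each character costs eight monomers, so a chain carrying hundreds of bits would hold a few dozen characters.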