Chemical information theory and molecular structure searching have long been used as computational aids
to researchers in the pharmaceutical field to estimate molecular structure−property relationships and to
assist in drug design. Tailored to these and other specific applications, such endeavors have been expensive
to develop and typically are very specialized. Often, they are not readily available and are not a part of the
open literature. Because the number of chemicals in commercial use is growing daily (with over 18 million
molecular species now catalogued by Chemical Abstracts Service), there is a need among engineers in the
chemical process industries for predictive structure−property algorithms. The most common and useful
methods are group contribution methods, which require only the chemical structure of interest as input.
Unfortunately, each group contribution method typically has its own fragment library and specialized rules,
making such models difficult to automate for general use by the engineering community. This work, which
has culminated in the creation of the Molecular Structure Disassembly Program (MOSDAP) software, is
focused on combining and improving upon the best published methods in four areas: (1) lexicographical
entry of structures, (2) prescreening methods, (3) abstract representation of molecular structures, and (4)
structure manipulation routines. Additional features, such as a custom modification of the published Ullmann
substructure search algorithm specific to molecular graphs and an exact cover procedure to resolve structural
ambiguities, have been added to address specific problems encountered in group contribution methods.
At present, most of the popular published group contribution methods can be automated using MOSDAP as
a general engine for converting line notation (e.g., SMILES strings) into corresponding sets of
functional groups and/or features.
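
To illustrate the kind of structural ambiguity an exact cover procedure addresses, the sketch below enumerates every way to assign each atom of a molecule to exactly one functional group. It is a minimal illustration under stated assumptions and is not MOSDAP's implementation: the function name `exact_covers`, the `Match` type, the atom numbering, and the four-group fragment library for ethanol are all hypothetical, and the substructure matches are written out by hand rather than produced by a substructure search.

```python
from typing import FrozenSet, Iterator, List, Tuple

# A candidate match: (group name, indices of the atoms it would consume).
# Both the type and the names below are hypothetical, for illustration only.
Match = Tuple[str, FrozenSet[int]]


def exact_covers(atoms: FrozenSet[int], matches: List[Match]) -> Iterator[List[Match]]:
    """Yield every selection of matches that covers each atom exactly once."""
    if not atoms:
        yield []  # nothing left to cover: one valid (empty) completion
        return

    def candidates(atom: int) -> List[Match]:
        # Matches that contain this atom and use only still-uncovered atoms.
        return [m for m in matches if atom in m[1] and m[1] <= atoms]

    # Branch on the uncovered atom with the fewest candidate matches.
    pivot = min(atoms, key=lambda a: len(candidates(a)))
    for match in candidates(pivot):
        for rest in exact_covers(atoms - match[1], matches):
            yield [match] + rest


# Ethanol (SMILES "CCO") with heavy atoms numbered 0 (CH3), 1 (CH2), 2 (OH).
# Assumed fragment library: CH3, CH2, OH, and a combined CH2OH group.
ETHANOL_ATOMS = frozenset({0, 1, 2})
ETHANOL_MATCHES: List[Match] = [
    ("CH3", frozenset({0})),
    ("CH2", frozenset({1})),
    ("OH", frozenset({2})),
    ("CH2OH", frozenset({1, 2})),
]

if __name__ == "__main__":
    for cover in exact_covers(ETHANOL_ATOMS, ETHANOL_MATCHES):
        print(sorted(name for name, _ in cover))
```

With this assumed library, two decompositions of ethanol are valid: CH3 + CH2 + OH and CH3 + CH2OH. A group contribution method needs a rule to choose between them, for example preferring the cover with the largest groups.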