1979
DOI: 10.1021/ac50039a023

Influence of errors and matching criteria upon the retrieval of binary coded low resolution mass spectra

Abstract: The influence of coding errors and the matching criterion being used upon the performance of a retrieval system for binary coded spectra has been investigated. A file of 9628 mass spectra, including 773 doublets, originating from an MSDC collection, was used. The spectra were reduced to 120 binary coded m/e values, selected by using the information content as a criterion. It is concluded that the performance of the retrieval primarily depends on the extent of errors occurring in the coded spectra and is hardly…
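The coding and retrieval steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the intensity threshold, the use of binary entropy as the "information content" measure, the Hamming distance as the matching criterion, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def binary_entropy(p):
    """Shannon information (in bits) of a binary feature that is 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def select_positions(spectra, n_keep, threshold=1.0):
    """Keep the n_keep m/e positions whose presence/absence is most informative.

    spectra: list of dicts mapping m/e -> intensity (hypothetical input format).
    threshold: assumed intensity cutoff above which a peak is coded as 1.
    """
    counts = {}
    for spectrum in spectra:
        for mz, intensity in spectrum.items():
            if intensity >= threshold:
                counts[mz] = counts.get(mz, 0) + 1
    total = len(spectra)
    # Rank by descending entropy; break ties by ascending m/e for determinism.
    ranked = sorted(counts, key=lambda mz: (-binary_entropy(counts[mz] / total), mz))
    return sorted(ranked[:n_keep])


def encode(spectrum, positions, threshold=1.0):
    """Reduce a spectrum to a tuple of 0/1 flags, one per selected m/e position."""
    return tuple(1 if spectrum.get(mz, 0.0) >= threshold else 0 for mz in positions)


def hamming(a, b):
    """Number of positions at which two binary codes differ (assumed matching criterion)."""
    return sum(x != y for x, y in zip(a, b))


def retrieve(query_code, library_codes):
    """Return library entry names ordered from best to worst match."""
    return sorted(
        library_codes,
        key=lambda name: (hamming(query_code, library_codes[name]), name),
    )
```

In this sketch a coding error corresponds to a flipped bit in `query_code`, so each error changes the Hamming distance to the correct library entry by exactly one. That gives one way to see why the error rate in the coded spectra would matter for retrieval.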

Cited by 18 publications
(10 citation statements)
References 15 publications
“…One aspect of spectral comparison algorithms that is often overlooked is the assumption, and oftentimes mathematical requirement, that any variance in the relative abundance of peaks within a replicate spectrum is randomly or independently variable at each m/z value. Such a mathematical requirement has been assumed since the first use of computational approaches to background-subtraction or spectral deconvolution into discrete component spectra, whether using simultaneous linear equations or matrix theory. By default, deconvolution algorithms explicitly assume unit correlation among the absolute signals of fragments as a function of time, or scan number, and they implicitly assume that any unexplained variance in a given scan at a specific m/z value is random. Furthermore, to have statistical validity, most measures of spectral similarity and dissimilarity between questioned and reference spectra also require independent variance, i.e., no correlation, in the relative abundance at each m/z value within replicate spectra. As an example of this reliance, a recent and extremely effective approach to spectral comparisons uses combined unequal variance t-tests at each m/z value to compare questioned and known spectra. Combining the results of independent t-tests explicitly requires independent variability of each t-test to enable the computation of random match probabilities. However, as indicated elsewhere, and as we will show below, replicate spectra still contain strong correlations in the normalized abundances of peaks, so the different m/z values are not independently variable.…”
Section: The Random and Nonrandom Variance Of Replicate Spectra (mentioning)
confidence: 99%
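The independence assumption discussed in this statement can be checked directly on replicate spectra. The following is a minimal sketch, not the citing authors' method: it normalizes each replicate to its base peak and computes the Pearson correlation between the abundances at two m/z values. The input format and the function names are hypothetical.

```python
import math


def normalize_to_base_peak(spectrum):
    """Scale a spectrum (dict of m/z -> abundance) so the largest peak equals 100."""
    base = max(spectrum.values())
    return {mz: 100.0 * value / base for mz, value in spectrum.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def peak_correlation(replicates, mz_a, mz_b):
    """Correlation of normalized abundances at mz_a and mz_b across replicates."""
    normalized = [normalize_to_base_peak(r) for r in replicates]
    return pearson(
        [r[mz_a] for r in normalized],
        [r[mz_b] for r in normalized],
    )
```

A coefficient near zero for every pair of m/z values would be consistent with independent variance. Values near +1 or -1 indicate the kind of correlation that, according to the statement, undermines combining per-m/z tests as if they were independent.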
“…The numbers of compounds used in the training sets for the classes are shown in Table II. Cross validation was used to determine the number of statistically significant components for [table fragment interleaved by extraction: 46 1,1-dichloroethene 8 1-chloro-2-methylbenzene 47 trichloroethene 9 1-chloro-4-methylbenzene 48 tetrachloroethene 10 p-chlorostyrene 49 bromoethane 11 1,1-dichloroethane 50 1,2-dibromoethane 12 1,1,1,2-tetrachloroethane 51 1-chloropropane 13 1,2,3-trichloropropane 52 2-chloropropane 14 3-chloropropene 53 1,2-dichloropropane 15 2-chlorobutane 54 1,3-dichloropropane 16 1,3-dichlorobutane 55 1-bromo-3-chloropropane 17 1,4-dichlorobutane 56 1,2-dibromopropane 18 1,4-dichloro-2-butene (cis) 57 2,3-dichlorobutane 19 3,4-dichlorobutene 58 tetrahydrofuran 20 1,4-dioxane 59 benzaldehyde 21 1-chloro-2,3-epoxypropane 60 1-bromo-1-chloroethane 22] (a) Numbers in parentheses are numbers of compounds used in training sets. (b) Alkaenes includes alkanes and alkenes.…”
Section: Theoretical Background (mentioning)
confidence: 99%
“…While Eckschlager is primarily concerned with the mathematical aspects of information theory (04) and how it can express the "information efficiency" of analytical methods (02), Dijkstra's interest in applying information theory is to improve analytical methods. For example, the coding of spectral data in mass spectrometry (07, 08) and infrared spectrometry (03, 05) is significantly improved with the aid of information theory. Also, in the area of chromatography, the information content of TLC identification procedures can be determined (01).…”
Section: Optimization (mentioning)
confidence: 99%