In this paper, we consider methods for computing exact approximations of the probability distributions of statistics. As exact approximations, we consider ∆-exact distributions: the difference between a ∆-exact distribution and the exact distribution does not exceed a predefined, arbitrarily small value ∆ that sets the accuracy of the approximation. We consider the first and second multiplicity methods, which use statistical characteristics of samples. The first multiplicity method is based on the properties of the components of the first multiplicity vector, which are nonnegative integer solutions of a linear equation relating the frequencies of the alphabet signs to the sample size. The second multiplicity method is based on solving a system of linear equations that relate the sample size and the alphabet cardinality to the number of alphabet signs that have equal frequency in the sample. For both methods, we give expressions that estimate the computational complexity of calculating exact approximations of distributions for any sample parameters. For an approximation accuracy of 10⁻⁵ and a computing resource performing 10¹⁸ operations per second, we calculated the sample parameters for which each method can produce the exact approximations of distributions with the available resource and the declared accuracy. From these parameters we formed, for each method, the region of sample parameters in which the exact approximations can be calculated. We compared these regions with one another and with the so-called region of uncertainty, which is bounded above by a sample size at most five times the alphabet cardinality.
By comparing the regions of sample parameters in which the exact approximations of the distributions can be calculated, we compared the calculation methods themselves. The second multiplicity method allows calculations for all values of the alphabet cardinality from 2 to 256, whereas the first multiplicity method does not allow calculations for an alphabet cardinality above 73. The parameter region of the samples suitable for calculation of the limit approximations of the distributions by the second multiplicity method contains the entire parameter region of the samples suitable for calculation by the first multiplicity method, and is more than 52 times larger. From this comparison, it is proved that, with the same computing resource, the second multiplicity method can calculate the exact approximations for a greater number of samples, and for samples with larger parameters, than the first multiplicity method. Hence, to calculate the exact approximations of the probability distributions of statistics, we choose the second multiplicity method. The practical significance of the research is that it makes it possible to determine the maximal values of the sample parameters for which the current technological level of computer systems allows calculation of the exact approximations of the distributions; these approximations give a minimal loss of criteria efficiency compared with the limit approximations otherwise used for such sample parameters. The scientific novelty of the research is the comparative analysis of methods of exact approximation for sample parameters at which the exact distributions cannot be calculated because of their high computational complexity.
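The abstract does not state the linear equations explicitly, so the following Python sketch rests on one plausible reading and is not the authors' algorithm. It assumes that the first multiplicity vector is the vector of sign frequencies (v_1, ..., v_N), satisfying v_1 + ... + v_N = n. It also assumes that the second multiplicity vector (mu_0, ..., mu_n) counts the signs that occur exactly j times, so that the sum of mu_j equals N and the sum of j * mu_j equals n. All function names are hypothetical. Counting the admissible vectors of each kind gives a rough idea of why the second method may need far fewer terms; these counts are only a proxy, not the complexity expressions given in the paper.

```python
from collections import Counter
from math import comb


def second_multiplicity_vector(sample, alphabet):
    """Return mu, where mu[j] is the number of alphabet signs seen exactly j times.

    Assumed definition; under it sum(mu) == len(alphabet) and
    sum(j * mu[j]) == len(sample).
    """
    freq = Counter(sample)
    mu = [0] * (len(sample) + 1)
    for sign in alphabet:
        mu[freq[sign]] += 1  # Counter returns 0 for signs absent from the sample
    return mu


def count_first_multiplicity_vectors(n, N):
    """Count nonnegative integer solutions of v_1 + ... + v_N = n (stars and bars)."""
    return comb(n + N - 1, N - 1)


def count_second_multiplicity_vectors(n, N):
    """Count nonnegative integer solutions of sum(mu_j) = N, sum(j * mu_j) = n.

    This equals the number of partitions of n into at most N parts, which
    (by conjugation) equals the number of partitions of n into parts of
    size at most N; the latter is what the loop below counts.
    """
    ways = [1] + [0] * n
    for part in range(1, N + 1):
        for m in range(part, n + 1):
            ways[m] += ways[m - part]
    return ways[n]


if __name__ == "__main__":
    n, N = 40, 8  # hypothetical sample size and alphabet cardinality
    print(count_first_multiplicity_vectors(n, N))
    print(count_second_multiplicity_vectors(n, N))
```

Under these assumptions, the number of second multiplicity vectors grows much more slowly with n and N than the number of first multiplicity vectors. This is consistent with the abstract's conclusion that the second method covers a larger parameter region on the same computing resource.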