In this paper we define and examine the power of the conditional-sampling oracle in the context of distribution-property testing. The conditional-sampling oracle for a discrete distribution µ takes as input a subset S ⊂ [n] of the domain, and outputs a random sample i ∈ S drawn according to µ, conditioned on S (and independently of all prior samples). The conditional-sampling oracle is a natural generalization of the ordinary sampling oracle, in which S always equals [n].
We show that with the conditional-sampling oracle, testing uniformity, testing identity to a known distribution, and testing any label-invariant property of distributions are easier than with the ordinary sampling oracle. On the other hand, we also show that for some distribution properties the sample complexity remains near-maximal even with conditional sampling.
[...] tested with a constant number of samples (compared to Θ(√n) unconditional samples even for uniformity [10,3]). The most general result of this paper (Section 6) is that any label-invariant property of distributions (a symmetric property in the terminology of [15]) can be tested using poly(log n) conditional samples. On the other hand, there are properties for which testing remains almost as hard as possible even with conditional samples: we show a property of distributions that requires at least Ω(n) conditional samples to test (Section 8).
Another feature that makes conditional samples interesting is that, in contrast to testers using ordinary samples, which are non-adaptive by definition, adaptivity (and the algorithmic aspect of testing) plays an important role in the conditional-sampling model.
For instance, the aforementioned task of testing uniformity, while still possible with a much better sample complexity than in the traditional model, cannot be done non-adaptively with a constant number of samples (see Section 7.2).
Before we move to some motivating examples, let us address the concern of whether arbitrary conditioning is realistic. While the examples below do relate to arbitrary conditioning, sometimes one would like the conditioning to be more restricted, in the sense of being describable by fewer than the n bits required to describe the conditioning set S. In fact, many of our algorithms require less than that. For example, the adaptive uniformity test takes only unconditional samples and samples conditioned on a constant-size set, so the description size per sample is in fact O(log n), as there are n^{O(1)} possibilities. The adaptive general label-invariant property tester takes only samples conditioned on dyadic intervals of [n], so here the description size is O(log n) as well. The non-adaptive tests do require general conditioning, as they pick uniformly random sets of prescribed sizes.
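The conditional-sampling oracle described above can be illustrated with a minimal Python sketch. The names `make_cond_oracle`, `cond_sample`, and `dyadic_interval` are hypothetical, the domain is taken to be 0..n-1 as a stand-in for [n], and returning a uniform element of S when µ(S) = 0 is a convention assumed here for the sketch, not something stated in the text.

```python
import random


def make_cond_oracle(mu, rng=None):
    """Build a conditional-sampling oracle for the distribution mu.

    mu is a list of n non-negative probabilities; index i stands for
    domain element i (domain 0..n-1 here, an assumption of this sketch).
    """
    rng = rng or random.Random()

    def cond_sample(S):
        # Each call draws a fresh sample, independent of earlier calls.
        elems = sorted(S)
        weights = [mu[i] for i in elems]
        if sum(weights) == 0:
            # Assumed convention: if mu(S) = 0, return a uniform element of S.
            return rng.choice(elems)
        # Sample i in S with probability mu[i] / mu(S).
        return rng.choices(elems, weights=weights, k=1)[0]

    return cond_sample


def dyadic_interval(n, level, index):
    """The index-th dyadic interval of length n / 2**level (n a power of two).

    The pair (level, index) describes the set in O(log n) bits.
    """
    length = n >> level
    return range(index * length, (index + 1) * length)


if __name__ == "__main__":
    n = 8
    mu = [1 / n] * n                          # uniform distribution
    oracle = make_cond_oracle(mu, random.Random(0))
    print(oracle(range(n)))                   # ordinary sample: S is the whole domain
    print(oracle({2, 5}))                     # sample conditioned on a constant-size set
    print(oracle(dyadic_interval(n, 1, 1)))   # sample conditioned on a dyadic interval
```

Taking S to be the whole domain recovers the ordinary sampling oracle, while the dyadic-interval helper shows how a conditioning set can be specified with far fewer than n bits.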
For a property P and a sub-property P′, we say that P is P′-partially testable with q queries if there exists an algorithm that distinguishes, with high probability, inputs in P′ from inputs ε-far from P, using q queries. Some natural properties require many queries to test, but can be partitioned into a small number of subsets for which they are partially testable with very few queries, sometimes even a number independent of the input size.
For properties over {0,1}, the notion of being thus partitionable ties in closely with Merlin-Arthur proofs of Proximity (MAPs) as defined independently in [14]; a partition into r partially-testable properties is the same as a Merlin-Arthur system where the proof consists of the identity of one of the r partially-testable properties, giving a 2-way translation to an O(log r) size proof.
Our main result is that for some low-complexity properties a partition as above cannot exist, and moreover that for each of our properties there does not exist even a single sub-property featuring both a large size and a query-efficient partial test, in particular improving the lower bound set in [14]. For this we use neither the traditional Yao-type arguments nor the more recent communication complexity method, but open up a new approach for proving lower bounds.
First, we use entropy analysis, which allows us to apply our arguments directly to 2-sided tests, thus avoiding the cost of the conversion in [14] from 2-sided to 1-sided tests.
Broadly speaking, we use "distinguishing instances" of a supposed test to show that a uniformly random choice of a member of the sub-property has "low entropy areas", ultimately leading to it having a low total entropy and hence having a small base set.
Additionally, to have our arguments apply to adaptive tests, we use a mechanism of "rearranging" the input bits (through a decision tree that adaptively reads the entire input) to expose the low entropy that would otherwise not be apparent.
We also explore the possibility of a connection in the other direction, namely whether the existence of a good partition (or MAP) can lead to a relatively query-efficient standard property test. We provide some preliminary results concerning this question, including a simple lower bound on the possible trade-off.
Our second major result is a positive trade-off result for the restricted framework of 1-sided proximity oblivious tests. This is achieved through the construction of a "universal tester" that works the same for all properties admitting the restricted test. Our tester is closely related to the notion of sample-based testing (for a non-constant number of queries) as defined by Goldreich and Ron in [13]. In particular, it partially resolves an open problem raised in [13].
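The notion of partial testability used in this abstract can be written out symbolically. The following is a sketch under stated assumptions: "with high probability" is instantiated as 2/3, dist denotes normalized Hamming distance, and A^x denotes a hypothetical q-query algorithm A with oracle access to the input x. None of these choices is fixed by the text above.

```latex
% Assumptions of this sketch: success probability 2/3, normalized Hamming
% distance, and A^x a q-query algorithm with oracle access to the input x.
\[
x \in P' \;\Longrightarrow\; \Pr[A^{x} \text{ accepts}] \ge \tfrac{2}{3},
\qquad
\operatorname{dist}(x, P) > \varepsilon \;\Longrightarrow\; \Pr[A^{x} \text{ rejects}] \ge \tfrac{2}{3}.
\]
% A partition into r partially testable parts gives a proof naming one part.
\[
P = P_1 \cup \dots \cup P_r
\quad\Longrightarrow\quad
\text{proof} = i \in \{1, \dots, r\},
\qquad
|\text{proof}| = \lceil \log_2 r \rceil = O(\log r).
\]
```

Under this reading, the verifier receives the index i as the proof and runs the partial test for P_i, which is where the O(log r) proof size comes from.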
We study the query complexity of testing for properties defined by read-once formulas, as instances of massively parametrized properties, and prove several testability and non-testability results. First we prove the testability of any property accepted by a Boolean read-once formula involving any bounded-arity gates, with a number of queries exponential in ε, doubly exponential in the arity, and independent of all other parameters. When the gates are limited to being monotone, we prove that there is an estimation algorithm that outputs an approximation of the distance of the input from satisfying the property. For formulas only involving And/Or gates, we provide a more efficient test whose query complexity is only quasipolynomial in ε. On the other hand, we show that such testability results do not hold in general for formulas over non-Boolean alphabets; specifically, we construct a property defined by a read-once arity-2 (non-Boolean) formula over an alphabet of size 4, such that any 1/4-test for it requires a number of queries depending on the formula size. We also present such a formula over an alphabet of size 5 that additionally satisfies a strong monotonicity condition.
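To make the objects in this abstract concrete, here is a minimal Python sketch of a Boolean read-once And/Or formula, its evaluation, and a brute-force computation of the distance of an input from satisfying it. The nested-tuple encoding and the function names are hypothetical, and the exhaustive search reads the whole input, so it only illustrates the quantity being estimated and is not one of the query-efficient tests the abstract refers to.

```python
from itertools import product

# Hypothetical encoding: ("var", i) is input bit i; ("and", c1, c2, ...)
# and ("or", c1, c2, ...) are gates applied to child formulas.


def evaluate(formula, x):
    """Evaluate an And/Or formula on the bit assignment x."""
    kind = formula[0]
    if kind == "var":
        return bool(x[formula[1]])
    values = [evaluate(child, x) for child in formula[1:]]
    if kind == "and":
        return all(values)
    if kind == "or":
        return any(values)
    raise ValueError(f"unknown gate: {kind}")


def is_read_once(formula, seen=None):
    """Check that every variable appears at most once in the formula."""
    seen = set() if seen is None else seen
    if formula[0] == "var":
        if formula[1] in seen:
            return False
        seen.add(formula[1])
        return True
    return all(is_read_once(child, seen) for child in formula[1:])


def distance_to_satisfying(formula, x):
    """Fraction of bits of x that must change to satisfy the formula.

    Brute force over all 2**len(x) assignments; returns None if the
    formula has no satisfying assignment of this length.
    """
    best = None
    for y in product([0, 1], repeat=len(x)):
        if evaluate(formula, y):
            d = sum(a != b for a, b in zip(x, y))
            best = d if best is None else min(best, d)
    return None if best is None else best / len(x)


if __name__ == "__main__":
    f = ("and", ("or", ("var", 0), ("var", 1)), ("var", 2))
    print(is_read_once(f))
    print(evaluate(f, [0, 1, 1]))
    print(distance_to_satisfying(f, [0, 0, 0]))
```

An ε-test for the property only has to distinguish inputs at distance 0 from inputs at distance greater than ε, which is what makes a query count independent of the formula size plausible for the Boolean case.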