Embedded system testing, especially long-term reliability testing, of flash memory solutions such as embedded multi-media card, secure digital card and solid-state drive involves strategic decision making related to test sample size to achieve high test coverage. The test sample size is the number of flash memory devices used in a test. Earlier, there were physical limitations on the testing period and the number of test devices that could be used. Hence, decisions regarding the sample size depended on the experience of human testers owing to the absence of well-defined standards. Moreover, a lack of understanding of the importance of the sample size resulted in field defects due to unexpected user scenarios. In worst cases, users finally detected these defects after several years. In this paper, we propose that a large number of potential field defects can be detected if an adequately large test sample size is used to target weak features during long-term reliability testing of flash memory solutions. In general, a larger test sample size yields better results. However, owing to the limited availability of physical resources, there is a limit on the test sample size that can be used. In this paper, we address this problem by proposing a self-adaptive reliability testing scheme to decide the sample size for effective long-term reliability testing.