Textual Entailment (TE) is the task of recognizing entailment, paraphrase, and contradiction relations in a given text pair. The goal of textual entailment research is to develop a core inference component that can be applied to various domains such as question answering (QA). We examined several rank correlations over the data and system results of the NTCIR-10 RITE-2 task to identify correlations between datasets and evaluation metrics. We also constructed RITE4QA datasets in the RITE-2 task under a QA scenario in order to assess the applicability of RITE systems to QA. We find that datasets created from different sources and in different ways can hardly predict each other. However, the system ranking on the dataset consisting of expert-made artificial pairs has a moderate correlation with the ranking on QA metrics. Both RITE metrics and QA metrics are stable within their own subtasks.
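As an illustration of the kind of rank correlation analysis described here, the following is a minimal sketch that compares two system rankings with Kendall's tau. The choice of Kendall's tau (the tau-a variant, which ignores tie corrections) is an assumption, since the text does not name the specific coefficients used. The function name `kendall_tau`, the system names, and all scores are hypothetical and are not taken from the RITE-2 results.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two mappings of system name -> score.

    Only systems present in both mappings are compared. Pairs tied in
    either mapping count as neither concordant nor discordant.
    """
    systems = sorted(set(scores_a) & set(scores_b))
    if len(systems) < 2:
        raise ValueError("need at least two systems in common")

    concordant = 0
    discordant = 0
    for s, t in combinations(systems, 2):
        # The sign of the product tells whether both metrics order
        # the pair (s, t) the same way.
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for four systems under two different metrics
# (illustrative values only, not results from the task).
rite_metric = {"sysA": 0.8, "sysB": 0.7, "sysC": 0.6, "sysD": 0.5}
qa_metric = {"sysA": 0.4, "sysB": 0.5, "sysC": 0.3, "sysD": 0.2}

print(round(kendall_tau(rite_metric, qa_metric), 3))  # prints 0.667
```

A value near 1 means the two metrics rank the systems almost identically, a value near 0 means one ranking says little about the other, and a value near -1 means the rankings are reversed.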