Pretrained contextualized language models such as BERT have achieved impressive results on various natural language processing benchmarks. Benefiting from multiple pretraining tasks and large-scale training corpora, pretrained models can capture complex syntactic word relations. In this paper, we use the deep contextualized language model BERT for the task of ad hoc table retrieval. We investigate how to encode table content given the table structure and the input length limit of BERT. We also propose an approach that incorporates features from prior literature on table retrieval and jointly trains them with BERT. In experiments on public datasets, we show that our best approach outperforms the previous state-of-the-art method and BERT baselines by a large margin under different evaluation metrics.
CCS CONCEPTS • Information systems → Content analysis and feature selection; Retrieval models and ranking; • Computing methodologies → Search methodologies.
The standardized generalized dimensionality discrepancy measure and the standardized model-based covariance are introduced as tools to critique dimensionality assumptions in multidimensional item response models. These tools are grounded in a covariance theory perspective and associated connections between dimensionality and local independence. Relative to their precursors, they allow for dimensionality assessment in a more readily interpretable metric of correlations. A simulation study demonstrates the utility of the discrepancy measures' application at multiple levels of dimensionality analysis, and compares them to factor analytic and item response theoretic approaches. An example illustrates their use in practice.
Approaches to dimensionality assessment based on item response theory (IRT) have typically been framed as either confirmatory approaches to evaluate unidimensionality or exploratory approaches that seek to determine the multidimensional structure. Such procedures are well suited to situations where assessment purposes and design imply the use of a unidimensional model. However, this strategy leaves something to be desired in situations where substantive theory, prior research, and assessment design dictate that performance depends on multiple aspects of proficiency and the analyst wishes to employ a multidimensional IRT (MIRT) model. This paper focuses on this situation, namely, where the analyst has specified the model a priori, including the number of latent variables and the pattern of dependence of the item responses on those latent variables (e.g. Drawing from terminology in factor analysis, we refer to such situations as confirmatory MIRT modeling. Recent advances in procedures for assessing fit in IRT, broadly conceptualized (Maydeu-Olivares, 2013), hold promise for applications in MIRT.
The current work focuses on one such aspect of examining data-model fit, namely, in terms of the assumed dimensionality of the model. To make this more concrete, consider the multidimensional normal-ogive IRT model (e.g., McDonald, 1999) for dichotomous observables (i.e., scored item responses) which specifies the probability of an observed value of 1 (corresponding to a particular response) as
$$P(X_{ij} = 1 \mid \theta_i, a_j, d_j) = F(a_j \theta_i + d_j), \tag{1}$$
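As an illustration of Equation (1), the following minimal sketch evaluates the response probability for one examinee and one item. It assumes that F is the standard normal distribution function, as is conventional for normal-ogive models, and that a_j θ_i denotes the inner product of the item's discrimination parameters with the examinee's latent variables. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function (assumed form of F)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def response_probability(theta, a, d):
    """P(X_ij = 1 | theta_i, a_j, d_j) = F(a_j theta_i + d_j).

    theta: latent variable values for examinee i (one per dimension)
    a:     discrimination parameters for item j (one per dimension)
    d:     location (intercept) parameter for item j
    """
    if len(theta) != len(a):
        raise ValueError("theta and a must have the same number of dimensions")
    # Inner product of discriminations and latent variables, plus the intercept.
    z = sum(a_m * theta_m for a_m, theta_m in zip(a, theta)) + d
    return normal_cdf(z)


# Hypothetical two-dimensional example: the item depends only on the first
# latent variable, because its second discrimination parameter is fixed at zero.
p = response_probability(theta=[0.5, -1.0], a=[1.2, 0.0], d=-0.3)
print(round(p, 4))
```

In a confirmatory specification of this kind, fixing a discrimination parameter at zero encodes the assumption that the item does not depend on that latent variable, which is the pattern of dependence the dimensionality critique is meant to evaluate.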