Filtering the passive scalar transport equation in the large-eddy simulation (LES) of turbulent transport gives rise to the closure term corresponding to the unresolved scalar flux. Understanding and respecting the statistical features of subgrid-scale (SGS) flux is a crucial point in robustness and predictability of the LES. In this work, we investigate the intrinsic nonlocal behavior of the SGS passive scalar flux through studying its two-point statistics obtained from the filtered direct numerical simulation (DNS) data for passive scalar transport in homogeneous isotropic turbulence (HIT). Presence of long-range correlations in true SGS scalar flux urges to go beyond the conventional local closure modeling approaches that fail to predict the non-Gaussian statistical features of turbulent transport in passive scalars. Here, we propose an appropriate statistical model for microscopic SGS motions by taking into account the filtered Boltzmann transport equation (FBTE) for passive scalar. In FBTE, we approximate the filtered equilibrium distribution with an α-stable Lèvy distribution that essentially incorporates a power-law behavior to resemble the observed nonlocal statistics of SGS scalar flux. Generic ensemble-averaging of such FBTE lets us formulate a continuum level closure model for the SGS scalar fluxappearing in terms of fractional-order Laplacian that is inherently nonlocal. Through a data-driven approach, we infer the optimal version of our SGS model using the high-fidelity data for the two-point correlation function between the SGS scalar flux and filtered scalar gradient, and sparse linear regression. In an a priori test, the optimal fractional-order model yields a promising performance in reproducing the probability distribution function (PDF) of the SGS dissipation of the filtered scalar variance compared to its true PDF obtained from the filtered DNS data.