The presence and evolution of defects introduced during manufacturing play a vital role in the failure mechanisms of engineering materials. In particular, the collective behavior of dislocations at the mesoscale leads to avalanches, strain bursts, intermittent energy spikes, and nonlocal interactions that produce anomalous features across time and length scales, directly affecting plasticity and the nucleation of voids and cracks. Discrete Dislocation Dynamics (DDD) simulations are often used at the mesoscale, but their cost and complexity increase dramatically with simulation time. To better understand how these anomalous features propagate to the continuum, we develop a probabilistic model for dislocation motion constructed from the position statistics obtained from DDD simulations. We obtain the continuous limit of discrete dislocation dynamics through a probability density function (PDF) for the dislocation motion, and propose a nonlocal transport model for the PDF. We then develop a machine-learning framework to learn the parameters of the nonlocal operator with a power-law kernel. This connects the anomalous nature of DDD to the origin of its corresponding nonlocal operator at the continuum and facilitates the integration of dislocation dynamics into multi-scale, long-time material failure simulations.