To study transport in complex environments, it is essential to determine the physical mechanism underlying diffusion and to characterize its nature and parameters precisely. This task is often hampered by data consisting of short trajectories (due either to brief recordings or to prior trajectory segmentation) with limited localization precision. In this paper, we propose a machine learning method based on a random forest architecture that associates single trajectories with the underlying diffusion mechanism with high accuracy. In addition, the algorithm determines the anomalous exponent with a small error, thus inherently classifying the motion as normal or anomalous (sub- or super-diffusion). The method provides highly accurate outputs even when working with very short trajectories and in the presence of experimental noise. We further demonstrate the application of transfer learning to experimental and simulated data not included in the training/test dataset. This allows for a full, high-accuracy characterization of experimental trajectories without the need for any prior information.

In recent decades, research in biophysics has devoted considerable effort to developing experimental techniques that allow the visualization of biological processes one molecule at a time [1-4]. These efforts have been mainly driven by the concept that ensemble averaging hides important features that are relevant for cellular function. As might be expected, experiments performed with these techniques have revealed a large heterogeneity in the behavior of biological molecules, thus fully justifying the use of these refined tools.

Besides, experiments performed using single particle tracking [3] have revealed that even chemically identical molecules in biological media can display very different behaviors, as a consequence of the complex environment where diffusion takes place.
For example, this heterogeneity is reflected in the broad distribution of dynamic parameters, such as the diffusion coefficient, obtained from distinct individual trajectories of the same molecular species; the spread is well above what stochastic uncertainty alone would produce. Typically, the trajectories are analyzed by quantifying the (time-averaged) mean square displacement (tMSD) as a function of the time lag $\tau$ [5]:

$$\mathrm{tMSD}(\tau) = \frac{1}{T-\tau}\int_0^{T-\tau}\left[x(t+\tau)-x(t)\right]^2\,\mathrm{d}t,$$

where $x(t)$ is the particle position at time $t$ and $T$ is the total duration of the trajectory. The calculation of this quantity, which is expected to scale linearly with $\tau$ for a Brownian walker in a homogeneous environment, has provided ubiquitous evidence of anomalous behaviors in biological systems, characterized by an asymptotic nonlinear scaling of the tMSD curve, $\mathrm{tMSD}(\tau) \sim \tau^{\alpha}$. Further experiments have shown that the anomalous exponent $\alpha$ can vary from particle to particle (figure 1(a)) as a consequence of molecular interactions, and that the same particle can experience such changes in space and time [6]. Several methods have been