Characterizing the equilibrium ensemble of folding pathways, including their relative probability, is one of the major challenges in protein folding theory today. Although this information is in principle accessible via all-atom molecular dynamics simulations, it is difficult to compute in practice because protein folding is a rare event and the affordable simulation length is typically not sufficient to observe an appreciable number of folding events, unless very simplified protein models are used. Here we present an approach that allows for the reconstruction of the full ensemble of folding pathways from simulations that are much shorter than the folding time. This approach can be applied to all-atom protein simulations in explicit solvent. It does not use a predefined reaction coordinate but is based on partitioning the state space into small conformational states and constructing a Markov model between them. A theory is presented that allows for the extraction of the full ensemble of transition pathways from the unfolded to the folded configurations. The approach is applied to the folding of a PinWW domain in explicit solvent where the folding time is two orders of magnitude larger than the length of individual simulations. The results are in good agreement with kinetic experimental data and give detailed insights about the nature of the folding process which is shown to be surprisingly complex and parallel. The analysis reveals the existence of misfolded trap states outside the network of efficient folding intermediates that significantly reduce the folding speed. D iscovering the mechanism by which proteins fold into their native 3D structure remains an intriguing problem (1, 2). Essential questions are: How does an ensemble of denatured molecules find the same native structure, starting from different conformations? Are there particular sequences in which the structural elements of a protein are formed (3-5)? Are there multiple parallel routes by which protein structure formation can proceed (6, 7)?Full answers to these questions require one to characterize the ensemble of folding pathways, including their relative probabilities. In principle, this detailed information is accessible via molecular dynamics (MD) simulations which, when used in concert with experimental evidence, are becoming an increasingly accepted tool to understanding structural details that are not easily accessible via the experimental observables (8). MD simulations with atomistic models of proteins have been used to study the dynamics of small proteins with folding times in the microsecond range (9-13). However, even though MD simulations make the full spatiotemporal detail accessible to observation, the characterization of the pathway ensemble is computationally difficult: A brute-force approach would start simulations from an equilibrium of unfolded structures, say A, and simulate until they relax into a set of folded state B. The analysis would then only be comprised of those trajectory segments that leave A and relax to B without...