Since speech is a continuous stream with no systematic boundaries between words, how do pre-verbal infants manage to discover words? One proposed solution is that they use the transitional probability between adjacent syllables, which drops at word boundaries. Here, we tested the limits of this mechanism by increasing the size of the word unit to four syllables, and its automaticity by testing sleeping neonates. Using markers of statistical learning in neonates' EEG, compared with adults' behavioral performance in the same task, we confirmed that statistical learning is automatic enough to be efficient even in sleeping neonates. But we also revealed that (1) successfully tracking transitional probabilities in a sequence is not sufficient to segment it; (2) prosodic cues, even as subtle as subliminal pauses, are enough to restore segmentation abilities; and (3) adults' and neonates' capacities are remarkably similar despite their differences in maturation and expertise. Finally, we observed that learning increased the similarity of neural responses across infants, providing a new neural marker to monitor learning. Thus, from birth, infants are equipped with adult-like tools that allow them to extract small, coherent word-like units from auditory streams by combining statistical analyses with prosodic cues.
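To make the statistical cue concrete, the following is a minimal sketch of how transitional probabilities between adjacent syllables could be estimated from a stream, and how a drop in probability could be used as a candidate word boundary. It illustrates only the cue itself, not the EEG analysis or the stimuli used in the study. The three four-syllable "words", their order, the function names, and the 0.75 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate TP(a -> b) = count(a immediately followed by b) / count(a with a successor)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the final syllable has no successor
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

def segment(syllables, threshold):
    """Split a non-empty stream wherever the TP between neighbours falls below threshold."""
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            # Low predictability: treat this position as a candidate word boundary.
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Hypothetical four-syllable words (not the study's stimuli).
WORDS = {
    "A": ["tu", "pi", "ro", "da"],
    "B": ["go", "la", "bu", "ki"],
    "C": ["ba", "do", "ti", "fe"],
}
# Hypothetical presentation order with no immediate repetition of a word.
order = ["A", "B", "C", "A", "C", "B", "A", "B", "A", "C", "B", "C", "A"]
stream = [syllable for name in order for syllable in WORDS[name]]

tps = transitional_probabilities(stream)
recovered = segment(stream, threshold=0.75)
```

In this toy stream every syllable belongs to exactly one word, so within-word transitions are fully predictable while transitions across word boundaries are not. An idealized learner with perfect counts can therefore recover the units; the findings summarized above suggest that human learners tracking the same statistics do not necessarily segment four-syllable units without additional prosodic cues.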