Extracting statistical regularities from the environment is a primary learning mechanism that might support language acquisition. While it is known that infants are sensitive to transition probabilities between syllables in continuous speech, the format of the encoded representation remains unknown. Here we used electrophysiology to investigate how 31 full-term neonates process an artificial language built by the random concatenation of four pseudo-words, and which information they retain. We used neural entrainment as a marker of the regularities the brain tracks in the stream during learning. We then compared the event-related potentials (ERPs) to different triplets to further explore the format of the information kept in memory. After only two minutes of familiarization with the artificial language, we observed significant neural entrainment at the word rate over left temporal electrodes compared to a random stream, demonstrating that sleeping neonates automatically and rapidly extracted the word pattern. In the test phase, ERPs differed significantly between triplets that did and did not start with the correct first syllable, but no difference was associated with later violations in transition probabilities, revealing a change in the representation format between segmentation and memory processes. Although transition probabilities were used to segment the stream, the retained representation relied on the syllables' ordinal position, still without a complete representation of the words at this age. Our results revealed a two-step learning strategy, probably involving different brain regions.
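
To make the notion of transition probability concrete, the following minimal Python sketch builds a syllable stream by randomly concatenating four pseudo-words and estimates the probability of each syllable given the preceding one. It is an illustration only, not the stimulus-generation or analysis code of the study: the syllables in `WORDS`, the function names `build_stream` and `transition_probabilities`, the stream length, and the choice to allow immediate word repetitions are all assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical tri-syllabic pseudo-words (illustrative only; not the study's stimuli).
WORDS = [("pa", "bi", "ku"), ("ti", "bu", "do"), ("go", "la", "tu"), ("da", "ro", "pi")]


def build_stream(words, n_words, seed=0):
    """Concatenate n_words pseudo-words drawn uniformly at random.

    Assumption: immediate repetitions of the same word are allowed.
    """
    rng = random.Random(seed)
    stream = []
    for _ in range(n_words):
        stream.extend(rng.choice(words))
    return stream


def transition_probabilities(stream):
    """Estimate P(next syllable | current syllable) from adjacent-pair counts."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Every syllable except the last one starts exactly one adjacent pair.
    first_counts = Counter(stream[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


stream = build_stream(WORDS, n_words=600)
tp = transition_probabilities(stream)

# Within a word, each syllable is always followed by the same syllable.
within_word = tp[("pa", "bi")]
# At a word boundary, the next syllable depends on which word is drawn next.
across_boundary = max(p for (a, _), p in tp.items() if a == "ku")

print(f"within-word TP: {within_word:.2f}")
print(f"highest across-boundary TP after 'ku': {across_boundary:.2f}")
```

In such a stream, within-word transitions are fully predictable while transitions across word boundaries are not, and this drop in predictability is the cue assumed to support segmentation.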