Background: Alcohol consumption during pregnancy, even at low doses, may damage the fetus. Pregnant women tend to underreport their alcohol consumption, which creates a need for sensitive and specific biomarkers. Among these, phosphatidylethanol (PEth) has emerged because of its high specificity and because it can be measured in both maternal and neonatal blood. The aim of this study is to systematically review the last 20 years of literature to describe the state of the art, the limitations, and the prospects of PEth for estimating alcohol consumption during pregnancy. Materials and methods: A systematic search adhering to PRISMA guidelines was performed in the databases PubMed, SCOPUS, and Web of Science using “MeSH” and “free-text” protocols, with time limits 1 January 2002–1 March 2022. The inclusion criteria were as follows: PEth used for detecting alcohol consumption during pregnancy, quantified in blood through liquid chromatography coupled to mass spectrometry, and full texts in the English language. Opinion papers, editorials, and narrative reviews were excluded. Results: Sixteen papers were included in the present review (0.81% of total retrieved records). All the included records were original articles: seven prospective cohort/longitudinal studies, six cross-sectional studies, two observational-descriptive studies, and one retrospective study. All studies assayed PEth in at least one biological matrix; seven studies quantified PEth in maternal blood, seven studies in newborn blood, and only two studies in both maternal and neonatal blood. In several included papers, PEth proved more sensitive than self-reports for identifying pregnant women with an active alcohol intake, and its diagnostic efficiency improved as maternal alcohol intake increased. Conclusions: Further studies, performed on wider and well-stratified populations, are needed to draw any definitive conclusion.
PEth is a promising marker for monitoring alcohol use in pregnancy; however, its use is currently limited mainly by the absence of a globally agreed interpretative cut-off, the paucity of data on its sensitivity and specificity, and the lack of standardization regarding the diagnostic efficiency of the different isoforms.