This study tested the assumption that the Bayley Scales of Infant and Toddler Development, Fourth Edition (Bayley-4) functions similarly for boys and girls and across four age groups. The analysis used the Bayley-4 American norming sample of 1,700 children ages 0–42 months (3.5 years), which was 50% boys and 50% girls. Fifty-three percent of the children were identified as White, 22.1% as Hispanic, 12.5% as Black, 8.5% as other, and 4.0% as Asian. A confirmatory factor analysis demonstrated that the three-factor structure of cognitive, language, and motor abilities fit the data well (comparative fit index = .99, root-mean-square error of approximation = .08, standardized root-mean-square residual = .02) and fit significantly better than the two- and one-factor models. Correlations between the latent factors ranged from moderate (r = .73) to large (r = .81). Measurement and structural invariance were tested across boys and girls and across four age groups (0–5, 6–13, 14–25, and 26–42 months). Residual invariance was supported for girls and boys, and intercept invariance was supported for the four age groups. The measurement invariance results suggest that the Bayley-4 is not biased with respect to gender or these age groups and that Bayley-4 scores can be used for group comparisons and decision making. The structural invariance findings suggested some differences across gender and age groups. The relations among the cognitive, language, and motor factors and the factor variances were equal across girls and boys but differed significantly across the four age groups. Girls had significantly higher latent means on all three factors, but these differences were small to negligible.