Early identification of autism, followed by appropriate intervention, has the potential to improve outcomes for autistic individuals. Numerous screening instruments have been developed for children under 3 years of age. Level 1 screeners are used in large-scale screening to detect at-risk children in the general population; Level 2 screeners are designed to distinguish children with signs of autism from those with other developmental problems. The focus here is the evaluation of Level 2 screeners. However, given the contributions of Level 1 screeners and the need to understand how they might interface with Level 2 screeners, we briefly review Level 1 screeners and consider instrument characteristics and system variables that may constrain their effectiveness. The examination of Level 2 screeners focuses on five instruments associated with published evaluations in peer-reviewed journals. Key criteria encompass the traditional indices of test integrity such as test reliability (inter-rater, test-retest) and construct validity, including concurrent and predictive validity, sensitivity (SE), and specificity (SP). These evaluations reveal limitations, including inadequate sample sizes, reliability issues, and limited involvement of independent researchers. Also lacking are comparative test evaluations under standardized conditions, which hinders interpretation of differences in discriminative performance across instruments. Practical considerations that constrain the use of such instruments, such as training requirements for test administration and the time needed to administer the test, are canvassed. Published Level 2 screener short forms are reviewed and, as a consequence of that evaluation, future directions for assessing the discriminative capacity of items and measures are suggested.
Suggested priorities for future research include recruiting large and diverse samples to permit robust appraisals of Level 2 items and scales across the 12–36 month age range, focusing more closely on precise operationalization of items and response coding to enhance reliability, continuing to explore potentially discriminating items at the younger end of the targeted age range, and unraveling the complexities of developmental trajectories in autistic infants. Finally, we emphasize that screening efficacy depends on clinicians' and researchers' ability not only to develop screening tests but also to negotiate the complex organizational systems within which screening procedures must be implemented.