Fault identification in photovoltaic (PV) arrays is a contemporary research topic, motivated by the rising penetration of PV systems in modern electrical grids. This work aims to define an optimal machine learning (ML) structure for the automatic detection and diagnosis of common PV array faults, both permanent (arc fault, line-to-line, maximum power point tracking unit failure, and open-circuit faults) and temporary (shading), under a wide range of climate datasets, fault impedances, and shading scenarios. To find the best-fit ML structure, three distinct ML classifiers are compared: Decision Tree (DT) with different splitting criteria, K-Nearest Neighbors (KNN) with different distance metrics and weighting functions, and Support Vector Machine (SVM) with different kernel functions and multi-class classification approaches. In addition, Bayesian optimization is adopted to assign the optimal hyperparameters to the fault classifiers. To investigate the performance of these classifiers, both simulation and experimental case studies are carried out and presented.

INDEX TERMS Photovoltaic array faults, machine learning, decision tree, nearest neighbors, support vector machine, Bayesian optimization.