P-wave first-motion polarity is the most useful information for determining earthquake focal mechanisms, particularly for smaller earthquakes. Algorithms have been developed to determine P-wave first-motion polarity automatically, but the performance of conventional algorithms remains below that of human experts. In this study, we develop a convolutional neural network (CNN) model that determines the P-wave first-motion polarity of observed seismic waveforms, under the condition that P-wave arrival times determined by human experts are known in advance. To train and test the CNN model, we use about 130 thousand waveforms sampled at 250 Hz and about 40 thousand waveforms sampled at 100 Hz, observed in the San-in and northern Kinki regions of western Japan; the San-in region provides three to four times as many waveforms as the northern Kinki region. First, we train separate CNN models on the 250 Hz and the 100 Hz waveform data from both regions. The accuracies of these models are 97.9% for the 250 Hz data and 95.4% for the 100 Hz data. Next, to examine the regional dependence, we divide the waveform data sets by observation region, train new CNN models on the data from one region, and test them on the data from the other region. We find that the accuracy is generally high (95%) and that the regional dependence is within about 2%. This suggests that there is almost no need to retrain the CNN model for each region. We also find that the accuracy is significantly lower when fewer than 10 thousand waveforms are used for training, and that the CNN models perform a few percentage points better with 250 Hz data than with 100 Hz data. Maps of the polarities determined by human experts and by the CNN models suggest that the CNN models outperform the human experts.
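The cross-region test described above (train on one region, test on the other, compare accuracies) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the traces are synthetic, the region sizes and noise levels are invented, and the classifier is a simple amplitude-sum stand-in rather than the CNN developed in this study. Only the evaluation protocol mirrors the text.

```python
import math
import random

ARRIVAL = 32  # sample index of the P-wave arrival, assumed known in advance


def make_trace(polarity, noise, rng, n=64):
    """Synthetic trace: Gaussian noise plus a half-sine pulse of the given sign."""
    trace = [rng.gauss(0.0, noise) for _ in range(n)]
    for k in range(8):
        trace[ARRIVAL + k] += polarity * math.sin(math.pi * (k + 1) / 9)
    return trace


def make_region(n_traces, noise, seed):
    """Return (trace, polarity) pairs; polarity is +1 (up) or -1 (down)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_traces):
        polarity = rng.choice([1, -1])
        data.append((make_trace(polarity, noise, rng), polarity))
    return data


def predict(trace, window):
    """Stand-in classifier (not a CNN): sign of the summed amplitude after arrival."""
    return 1 if sum(trace[ARRIVAL:ARRIVAL + window]) >= 0 else -1


def accuracy(data, window):
    """Fraction of traces whose predicted polarity matches the label."""
    return sum(predict(t, window) == p for t, p in data) / len(data)


def train(data):
    """'Training' only selects the window length that best fits the training set."""
    return max(range(1, 9), key=lambda w: (accuracy(data, w), w))


def split(data, fraction=0.8):
    """Split a data set into training and held-out test parts."""
    cut = int(len(data) * fraction)
    return data[:cut], data[cut:]


def cross_region_accuracy(regions):
    """Train on each region's training split, test on every region's test split."""
    results = {}
    for train_name, (train_set, _) in regions.items():
        window = train(train_set)
        for test_name, (_, test_set) in regions.items():
            results[(train_name, test_name)] = accuracy(test_set, window)
    return results


# Hypothetical sizes and noise levels; one region is given more traces than the other.
regions = {
    "san_in": split(make_region(400, noise=0.2, seed=1)),
    "northern_kinki": split(make_region(120, noise=0.4, seed=2)),
}
results = cross_region_accuracy(regions)
for (train_name, test_name), acc in sorted(results.items()):
    print(f"train={train_name:15s} test={test_name:15s} accuracy={acc:.3f}")
```

The gap between the within-region entries and the cross-region entries of `results` corresponds to the regional dependence discussed in the text. Replacing `train` and `predict` with a real model would leave the evaluation loop unchanged.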