Cross-band facial recognition is a difficult task, even for the most robust matching algorithms. Inherent factors such as camera effects (blur, noise, and sampling) and variation in pose and illumination are known to degrade algorithm performance. Because cross-band matching algorithms are still in the early stages of development, it is currently unclear whether their performance is superior to that of human observers performing the same task. In this paper, we present findings from a pilot study analyzing the ability of an ensemble of human observers to perform the 1:N cross-band facial identification task on degraded facial images, where the probe and gallery images were captured in different spectral bands (visible, SWIR, MWIR, and LWIR). Results from our 11-alternative forced choice perception study indicate that: 1) a group of observers familiar with even a subset of the subjects in a gallery set is, on average, able to perform the task with higher probability (p > 0.15) than a group of observers with no prior exposure, and 2) task performance for both the familiar and unfamiliar groups increased by 1.5-3.4% when multi-spectral probe images were matched against galleries of 24-bit color facial images rather than 8-bit monochrome facial images. For the SWIR case, however, we observed a 9.1% increase in performance with 24-bit color images vs. 8-bit monochrome images. Results from this study can be leveraged in future work directly comparing the cross-band matching performance of humans vs. algorithms.