Interest drives our focus of attention and plays an important role in social communication. Given its relevance for many activities (e.g., learning, entertainment), a system able to automatically detect someone's interest has several potential applications. In this paper, we analyze the physiological and behavioral patterns associated with visual interest and present a method for the automatic recognition of interest, curiosity, and their most relevant appraisals, namely coping potential, novelty, and complexity. We conducted an experiment in which participants watched images and micro-videos while multimodal signals were recorded: facial expressions, galvanic skin response (GSR), and eye gaze. After watching each stimulus, participants self-reported their level of interest, curiosity, coping potential, perceived novelty, and complexity. Results showed that, when dynamics were taken into consideration, interest was associated with facial Action Units other than smiling, especially the inner brow raiser and eyelid tightener. Participants also produced longer saccades when watching interesting stimuli. However, correlations of appraisals with specific facial Action Units and eye gaze were generally stronger than those we found for interest. We trained Random Forest regression models to detect the level of interest, curiosity, and appraisals from multimodal features. The recognition models for appraisals, both unimodal and multimodal, generally outperformed those for interest, in particular for static images. In summary, our study suggests that automatic appraisal detection may be a suitable way to detect subtle emotions such as interest, for which prototypical expressions do not exist.
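
The Random Forest regression setup described above can be sketched as follows. This is an illustrative example only, not the authors' actual pipeline: the feature dimensions, the synthetic target, and the cross-validation setup are hypothetical stand-ins for the multimodal features (facial Action Units, GSR, eye gaze) and self-reported ratings used in the study.

```python
# Hypothetical sketch of a Random Forest regression model predicting a
# continuous self-report rating (e.g., interest or an appraisal such as
# novelty) from multimodal features. Data here are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_features = 200, 12  # e.g., AU activations + GSR + gaze statistics
X = rng.normal(size=(n_trials, n_features))
# Synthetic rating that depends on one feature, plus noise.
y = 0.5 * X[:, 0] + rng.normal(scale=0.1, size=n_trials)

model = RandomForestRegressor(n_estimators=100, random_state=0)
# Cross-validated R^2 estimates how well the features predict the rating.
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(round(scores.mean(), 3))
```

In practice, one such model would be trained per target variable (interest, curiosity, coping potential, novelty, complexity), either on a single modality or on concatenated multimodal features.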