Emotions are multimodal processes that play a crucial role in our everyday lives. Recognizing emotions is becoming more critical in a wide range of application domains such as healthcare, education, human-computer interaction, Virtual Reality, intelligent agents, entertainment, and more. Facial macro-expressions or intense facial expressions are the most common modalities in recognizing emotional states. However, since facial expressions can be voluntarily controlled, they may not accurately represent emotional states. Earlier studies have shown that facial micro-expressions are more reliable than facial macro-expressions for revealing emotions. They are subtle, involuntary movements responding to external stimuli that cannot be controlled. This paper proposes using facial micro-expressions combined with brain and physiological signals to more reliably detect underlying emotions. We describe our models for measuring arousal and valence levels from a combination of facial micro-expressions, Electroencephalography (EEG) signals, galvanic skin responses (GSR), and Photoplethysmography (PPG) signals. We then evaluate our model using the DEAP dataset and our own dataset based on a subject-independent approach. Lastly, we discuss our results, the limitations of our work, and how these limitations could be overcome. We also discuss future directions for using facial micro-expressions and physiological signals in emotion recognition.