Abstract-Various sensing systems have been exploited to monitor in-person interactions, one of the most important indicators of mental health. However, existing solutions either require deploying in-situ infrastructure or fail to provide detailed information about a person's involvement during interactions. In this paper, we use smartphones and on-body sensors to monitor in-person interactions without relying on any in-situ infrastructure. Using state-of-the-art smartphones and on-body sensors, we implement a multi-modal system that collects a battery of features to better monitor in-person interactions. In addition, unlike existing work that monitors interactions based only on data collected from one person, we emphasize that in-person interactions intrinsically involve multiple participants, and we therefore aggregate information from nearby people to identify more interaction details. Our evaluation shows that our solution accurately detects various in-person interactions and provides insights absent from existing systems.