Smart sensing devices are equipped with an array of sensors, including locomotion sensors, which enable continuous and passive monitoring of human activities for ambient assisted living. As a result, sensor-based human activity recognition has gained significant popularity in the past few years, and many successful research studies have been conducted in this area. However, accurate real-time recognition of in-the-wild human activities remains a fundamental challenge, because a person's physical activity patterns are adversely affected by their behavioral context. Moreover, it is essential to infer a user's behavioral context along with the physical activity to enable context-aware and knowledge-driven applications in real time. Therefore, this work presents "C2FHAR", a novel approach for coarse-to-fine human activity recognition in-the-wild, which explicitly models the user's behavioral contexts together with activities of daily living to learn and recognize fine-grained human activities. To address real-time activity recognition challenges, the proposed scheme uses a multi-label classification model to identify in-the-wild human activities at two different levels, i.e., coarse-grained or fine-grained, depending on the real-time use case. The proposed scheme is validated with extensive experiments using heterogeneous sensors, which demonstrate its efficacy.