In this article we explore the role of pre-reflective, embodied, and interactive intentionality in joint musical performance. Combining insights from phenomenology and current theories in cognitive science, we present a case study based on qualitative interviews with the Danish String Quartet (DSQ). A total of 12 hours of interviews was recorded, drawing on ethnography-related methodologies during tours with the DSQ in Denmark and England in 2012 and 2013, focusing mainly on their experience of perception, intentionality, absorption, selfhood and intersubjectivity. The analysis emerging from our data suggests that expert musicians' experience of collective music-making is rooted in the dynamical patterns of perception and action that co-constitute the sonic environment(s) in which they are embedded, and that the role of attention and other reflective processes should therefore be reconsidered. In putting forward our view on ensemble cohesion, we challenge Keller's and Seddon and Biasutti's influential positions, maintaining that the cognitive processes at play in such intersubjective contexts are grounded in the concrete (inter)actions of the players, and are not reducible to processes and structures 'in the head'. We argue that this is a significant step forward from more traditional accounts of joint musical performance, which often rely on mental representations as principal explanatory tools and thereby downplay the embodied and participatory dimension of music-making, and we conclude that ensemble performance can take place without attention either to shared goals or to the other ensemble musicians. Finally, we suggest that researchers who want to understand what it is like to play with other musicians must shift their focus from Joint Musical Attention (JMA) to Joint Musical Experience (JME), facilitating the development of more ecologically valid models of collective musical performance.