The microservices architectural style aims to improve software maintenance and scalability by decomposing applications into independently deployable components. A common criticism of this style is the risk of increased response times due to inter-service communication, especially when services are very fine-grained. Locality‐aware placement of microservices onto the underlying hardware can help keep response times low. However, the complex graphs of invocations originating from users' calls largely depend on the specific workload (e.g., the length of an invocation chain could depend on the input parameters). Therefore, many existing approaches are not suitable for modern infrastructures where application components can be dynamically redeployed to take into account user expectations. This paper contributes to overcoming the limitations of static or off‐line techniques by presenting a big data pipeline that dynamically collects tracing data from running applications. These data are used to identify a given number of microservice groups whose deployment keeps the response times of the most critical operations low under a defined workload. The results, obtained under different working conditions and with different infrastructure configurations, are presented and discussed to draw the main considerations about the general problem of defining the boundaries, granularity, and optimal placement of microservices on the underlying execution environment. In particular, they show that knowing how a specific workload impacts the constituent microservices of an application helps achieve better performance by effectively lowering response time (e.g., up to a reduction) through the exploitation of locality‐driven clustering strategies for deploying groups of services.
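
To make the idea of locality-driven grouping concrete, the following is a minimal sketch and not the pipeline presented in this paper. It assumes that tracing data have already been reduced to a list of (caller, callee) pairs, and it substitutes a simple greedy agglomerative merge for whatever clustering strategy is actually used. The names `cluster_services` and `remote_calls`, as well as the service names in the usage example, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cluster_services(traces, k):
    """Group services into k clusters by greedily merging the two groups
    that exchange the most calls (an illustrative heuristic only)."""
    calls = Counter()   # undirected call counts between service pairs
    services = set()
    for caller, callee in traces:
        services.update((caller, callee))
        if caller != callee:
            calls[frozenset((caller, callee))] += 1

    # Start with one group per service.
    groups = [frozenset([s]) for s in sorted(services)]

    def traffic(pair):
        # Total number of calls crossing the boundary between two groups.
        a, b = pair
        return sum(calls[frozenset((x, y))] for x in a for y in b)

    while len(groups) > k:
        a, b = max(combinations(groups, 2), key=traffic)
        groups = [g for g in groups if g not in (a, b)] + [a | b]
    return groups


def remote_calls(traces, groups):
    """Count calls whose caller and callee are placed in different groups."""
    home = {s: i for i, g in enumerate(groups) for s in g}
    return sum(1 for caller, callee in traces if home[caller] != home[callee])


# Hypothetical workload: each tuple is one observed invocation.
traces = (
    [("gateway", "orders")] * 5
    + [("orders", "payments")] * 4
    + [("gateway", "catalog")] * 1
    + [("catalog", "search")] * 3
)
groups = cluster_services(traces, k=2)
print(groups, remote_calls(traces, groups))
```

Under this hypothetical workload, the heuristic co-locates the services that call each other most often, so only the infrequent gateway-to-catalog invocation crosses a group boundary. A different workload over the same services could yield a different grouping, which is why the grouping is derived from collected traces rather than fixed statically.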