Hardware architectures based on multiple cores, cache memory and branch prediction usually preclude the application of classical methods for determining execution time bounds for real-time tasks. As such bounds are fundamental to the design of real-time systems, Measurement-Based Probabilistic Timing Analysis (MBPTA) has been employed instead. A common choice is to derive the probabilistic Worst-Case Execution Time (pWCET) using Extreme Value Theory (EVT), a branch of statistics used to estimate the probability of rare events more extreme than those observed. However, pWCET estimates are usually reported for controlled or simulated environments. In this paper we apply MBPTA on a real multi-core platform, namely the Raspberry Pi 3B, taking into consideration possible interference from the operating system and concurrent activities. The results indicate that although EVT is effective, it does not always produce adequate models and coherent pWCET estimates. As MBPTA is primarily called for when classical methods are not applicable, as is the case for the studied platform, the results reported in this paper highlight risks and vulnerabilities in applying MBPTA-EVT for pWCET inference.