With the advent of multicore processors, parallelization has raised software performance to new heights. However, parallelization has also introduced new problems. One serious issue is the performance bottleneck caused by cache misses or resource starvation, which is hard to detect in application software, especially when the software's behavior changes dynamically. Performance monitors are usually employed to detect such bottlenecks. Nevertheless, monitors introduce their own computation and communication overheads, especially in embedded multicore systems. In this work, we estimate the effects of monitor overheads on different types of applications, such as CPU-bound and I/O-bound tasks. Further, we give suggestions on the number and type of monitors to use for such embedded multicore applications. Besides reducing monitor overheads, we also aim for accuracy and immediacy of the monitored information. Through a real-world example, namely a digital video recording system, we demonstrate how different monitoring periods affect the tradeoff between the accuracy and the immediacy of the monitored information.