Thanks to the availability of wearable devices such as GoPro cameras, smartphones, and smart glasses, we now have access to a plethora of videos captured from the first-person perspective. Surveillance cameras and Unmanned Aerial Vehicles (UAVs) also offer tremendous amounts of video data recorded from top-down and oblique viewpoints. Egocentric and surveillance vision have been studied extensively but separately in the computer vision community. The relationship between these two domains, however, remains unexplored. In this study, we make the first attempt in this direction by addressing two basic yet challenging questions. First, given a set of egocentric videos and a top-view video, does the top-view video contain all or some of the egocentric viewers? In other words, have these videos been shot in the same environment at the same time? Second, if so, how can we identify the egocentric viewers in the top-view video? These problems become even more challenging when the videos are not temporally aligned. We model each view using a graph, and compute the assignment and time delays in an iterative, alternating fashion using spectral graph matching and time-delay estimation. Such an approach handles the temporal misalignment between the egocentric videos and the top-view video.
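
The abstract only sketches the alternation; as a rough illustration (not the authors' implementation), the Python sketch below shows one way such an alternation could look. The `build_affinity` callback, the per-viewer/per-target feature signals, and `max_lag` are hypothetical placeholders; the matching step follows the standard spectral relaxation (principal eigenvector of a pairwise affinity matrix followed by greedy discretization), and delays are re-estimated by maximizing normalized cross-correlation between matched signals.

```python
import numpy as np

def spectral_assignment(M, n_viewers, n_targets):
    """One round of spectral graph matching: relax the one-to-one
    assignment to the principal eigenvector of the pairwise affinity
    matrix M (rows/cols index candidate viewer-target pairs, assumed
    symmetric and non-negative), then greedily discretize under
    mutual-exclusion constraints."""
    w, v = np.linalg.eigh(M)
    x = np.abs(v[:, -1])                     # principal eigenvector = soft scores
    assignment = {}                          # viewer index -> target index
    active = x.copy()
    while active.max() > 0:
        k = int(active.argmax())
        i, a = divmod(k, n_targets)          # decode candidate index (i, a)
        assignment[i] = a
        # Zero out candidates that conflict with picking (i, a).
        for j in range(n_viewers):
            active[j * n_targets + a] = 0.0
        active[i * n_targets: (i + 1) * n_targets] = 0.0
    return assignment

def estimate_delay(sig_ego, sig_top, max_lag):
    """Time-delay estimation by maximizing normalized cross-correlation
    between an egocentric feature signal and a top-view target signal."""
    best_lag, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        a = sig_ego[max(0, lag): len(sig_ego) + min(0, lag)]
        b = sig_top[max(0, -lag): len(sig_top) + min(0, -lag)]
        n = min(len(a), len(b))
        if n < 2:
            continue
        c = np.corrcoef(a[:n], b[:n])[0, 1]
        if np.isfinite(c) and c > best_corr:
            best_corr, best_lag = c, lag
    return best_lag

def match_views(ego_sigs, top_sigs, build_affinity, max_lag=50, n_iters=5):
    """Alternate between (1) spectral matching given the current delays
    and (2) per-pair delay re-estimation given the current assignment."""
    n, m = len(ego_sigs), len(top_sigs)
    delays = np.zeros((n, m), dtype=int)
    for _ in range(n_iters):
        M = build_affinity(ego_sigs, top_sigs, delays)   # user-supplied
        assignment = spectral_assignment(M, n, m)
        for i, a in assignment.items():
            delays[i, a] = estimate_delay(ego_sigs[i], top_sigs[a], max_lag)
    return assignment, delays
```

In this reading, the assignment step explains away who-is-who given the current temporal alignment, and the delay step refines the alignment given the current assignment, which is why the alternation can cope with unsynchronized recordings.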