A fully automatic initialization approach for 3D-model-based vehicle tracking has been developed, based on Edge-Element and Optical-Flow association. A complete automatic initialization and tracking system incorporating this approach achieves results comparable to those obtained in earlier experiments based on semi-interactive initialization, provided the assessment criteria are roughly equivalent. Experience with a large test sample (about 15 minutes of inner-city traffic video) is discussed in detail.
Abstract. An experimental comparison of the 'Edge-Element Association (EEA)' and 'Marginalized Contour (MCo)' approaches for 3D-model-based vehicle tracking in traffic scenes is complicated by the different shape and motion models with which they were originally implemented. It is shown that the steering-angle motion model originally associated with EEA allows more robust tracking than the angular-velocity motion model originally associated with MCo. Details of the shape models can also make a difference, depending on the resolution of the images. Performance differences due to the choice of motion and shape model can outweigh the differences due to the choice of the tracking algorithm. Tracking failures of the two approaches, however, usually do not happen at the same frames, which can lead to insights into the relative strengths and weaknesses of the two approaches.
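To make the distinction between the two motion models concrete, the following is a minimal sketch of one prediction step for each. It assumes a generic kinematic bicycle model for the steering-angle case and a constant-turn-rate model for the angular-velocity case; the function names, the `wheelbase` default, and the simple Euler integration are illustrative assumptions, not the formulation used in the abstracted paper.

```python
import math

def step_steering_angle(x, y, theta, v, steer, dt, wheelbase=2.7):
    """One Euler step of a kinematic bicycle model (assumed form).

    The heading changes only in proportion to the speed v, so a
    stationary vehicle cannot rotate on the spot.
    """
    x_new = x + v * math.cos(theta) * dt
    y_new = y + v * math.sin(theta) * dt
    theta_new = theta + (v * math.tan(steer) / wheelbase) * dt
    return x_new, y_new, theta_new

def step_angular_velocity(x, y, theta, v, omega, dt):
    """One Euler step with an independent angular velocity (assumed form).

    The heading changes at rate omega regardless of the speed v.
    """
    x_new = x + v * math.cos(theta) * dt
    y_new = y + v * math.sin(theta) * dt
    theta_new = theta + omega * dt
    return x_new, y_new, theta_new
```

Under these assumptions, the steering-angle model couples rotation to forward motion, whereas the angular-velocity model lets a stopped vehicle drift in orientation. This coupling is one plausible reason for a robustness difference, although the abstract itself does not state the cause.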
Motris, an integrated system for model-based tracking research, has been designed modularly to study the effects of algorithmic variations on tracking results. Motris attempts to avoid introducing bias into the relative assessment of alternative approaches. Such a bias may be caused by differences of implementation and parameterization if the component approaches are evaluated in separate testing environments. Tracking results are evaluated automatically on a significant test sample in order to quantify the effects of different combinations of alternatives. The Motris system environment thus allows an in-depth comparison between the so-called 'Edge-Element Association' approach documented in Haag and Nagel (1999) and the more recent 'Expectation-Maximization' approach reported by Pece and Worrall (2002).
Abstract. 3D-model-based tracking offers one possibility to explicate the manner in which spatial coherence can be exploited for the analysis of image sequences. Two seemingly different approaches towards 3D-model-based tracking are compared using the same digitized video sequences of road traffic scenes. Both approaches rely on the evaluation of greyvalue discontinuities, one based on a hypothesized probability distribution function for step-discontinuities in the vicinity of model-segments, the other one based on extraction of Edge Elements (EEs) and their association to model-segments. The former approach could be considered to reflect a stronger spatial coherence assumption because the figure-of-merit function to be optimized collects evidence from all greyvalue discontinuities within a tolerance region around visible model segments. The individual association of EEs to model-segments by the alternative approach is based on a distance function which combines differences in position and orientation, thereby taking into account the gradient direction as well as the location of a local gradient maximum in gradient direction. A detailed analysis of numerous vehicles leads to the preliminary conclusion that both approaches have different strengths and weaknesses. It turns out that the effects of how greyvalue discontinuities are taken into account are in general less important than the inclusion of Optical Flow (OF) estimates during the update-step of the current state vector for a body to be tracked. OF estimates are evaluated only within the area of the body to be tracked when projected into the image plane according to the current state estimate. Subtle effects related to simplifications and approximations during the implementation of an approach thus may influence the aggregated result of tracking numerous vehicles even in the case where spatial coherence appears to be rigorously exploited.
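The idea of a distance that combines position and orientation differences between an Edge Element and a projected model segment can be sketched as follows. This is only an illustration under stated assumptions: the function name, the scale parameters `sigma_pos` and `sigma_ang`, and the quadratic combination are hypothetical and are not taken from the abstracted paper. For brevity it measures distance to the infinite line through the segment rather than to the finite segment.

```python
import math

def ee_segment_distance(ee_pos, ee_grad_dir, seg_a, seg_b,
                        sigma_pos=2.0, sigma_ang=0.3):
    """Hypothetical combined distance between an edge element and a segment.

    ee_pos      -- (x, y) location of the local gradient maximum, in pixels
    ee_grad_dir -- gradient direction at that location, in radians
    seg_a/seg_b -- image-plane end points of the projected model segment
    sigma_*     -- assumed normalisation scales (pixels and radians)
    """
    ax, ay = seg_a
    bx, by = seg_b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    # Unit normal of the segment.
    nx, ny = -dy / length, dx / length
    # Position term: perpendicular offset of the edge element from the line.
    d_pos = abs((ee_pos[0] - ax) * nx + (ee_pos[1] - ay) * ny)
    # Orientation term: the gradient should point along the segment normal;
    # the sign of the contrast is ignored, so angles are compared modulo pi.
    diff = (ee_grad_dir - math.atan2(ny, nx)) % math.pi
    d_ang = min(diff, math.pi - diff)
    return math.sqrt((d_pos / sigma_pos) ** 2 + (d_ang / sigma_ang) ** 2)
```

With such a measure, an edge element would be associated with the model segment that yields the smallest distance, possibly subject to a rejection threshold; both choices are again assumptions made for the sake of the example.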
Abstract. The textual description of video sequences exploits conceptual knowledge about the behavior of depicted agents. An explicit representation of such behavioral knowledge facilitates not only the textual description of video evaluation results, but can also be used for the inverse task of generating synthetic image sequences from textual descriptions of dynamic scenes. Moreover, it is shown here that the behavioral knowledge representation within a cognitive vision system can be exploited even for prediction of movements of visible agents, thereby improving the overall performance of a cognitive vision system.