The advent of a panoply of resource-limited devices opens up new challenges in the design of computer vision algorithms, with a clear compromise between accuracy and computational requirements. This thesis proposes several low-level algorithms on top of which new applications can be built with better performance on limited devices. We address the problems of local feature detection and description, which are the fundamental cornerstone of many computer vision pipelines.

We first propose ELSED, the fastest line segment detector in the literature. The key to its efficiency is a local segment growing algorithm that connects gradient-aligned pixels in the presence of small discontinuities. ELSED improves not only the execution time but also the accuracy of competitors with similar computational requirements.

Next, we introduce FSG, a method to group small segments into full lines that are more suitable for tasks such as vanishing point estimation. It is based on two independent components: a proposer that greedily clusters segments to suggest plausible line candidates, and a probabilistic model that decides whether a group of segments forms an actual line. Unlike its competitors, FSG is able to group segments in real time while achieving state-of-the-art performance.

Last, we study the problem of efficient local feature description, for which we propose several methods. We introduce an efficient feature description measurement based on the difference of mean gray levels between two square regions, together with a fast procedure to search for its optimal configuration. In our simplest proposals, BELID and BEBLID, we select the discriminative measurements by solving a binary classification problem with boosting. Our most elaborate and best-performing descriptors are BAD (Box Average Difference) and HashSIFT.
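The box-average-difference measurement described above can be computed in constant time per comparison with an integral image (summed-area table). The following minimal Python sketch illustrates the idea; the region sizes and positions are illustrative assumptions, not the learned configurations selected by the thesis' boosting procedure.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_mean(ii, y, x, s):
    """Mean gray level of the s x s square whose top-left corner is (y, x)."""
    total = ii[y + s, x + s] - ii[y, x + s] - ii[y + s, x] + ii[y, x]
    return total / (s * s)

def box_avg_diff(ii, p1, p2, s):
    """Difference of mean gray levels between two square regions."""
    return box_mean(ii, *p1, s) - box_mean(ii, *p2, s)

# Toy patch: dark on the left half, bright on the right half.
patch = np.zeros((32, 32), dtype=np.uint8)
patch[:, 16:] = 255
ii = integral_image(patch)
d = box_avg_diff(ii, (8, 20), (8, 4), 5)  # bright box minus dark box
print(d)  # 255.0
```

Because each box sum needs only four lookups in the integral image, the measurement's cost is independent of the box size, which is what makes descriptors built on it cheap to extract.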
They emerge from the application of a triplet ranking loss, hard negative mining, and anchor swapping to features based on pixel differences, such as the one we introduce in this thesis, and on image gradients. In our experiments, we evaluate the accuracy, execution time, and energy consumption of the proposed descriptors. We show that their results establish new operating points on the state-of-the-art accuracy vs. resources trade-off curve.

The effectiveness of these methods is also supported by their adoption in industry and in the computer vision community. Specifically, as part of the Industrial PhD grant, the code has been integrated as a fundamental component in the pipeline of a visual localization system, and the open-source code has been published in the OpenCV library.

Input: a, d0, G, O, V
  a  -- anchor point
  d0 -- initial walking direction, d0 ∈ {U, D, R, L}
  G  -- gradient image magnitudes
  O  -- quantized image edge orientations
  V  -- visited pixels (marked as edge)

 1: S ← ∅
 2: stack ← ∅
 3: stack.push([a, d0])
 4: P ← {a}
 5: E ← ∅
 6: while stack ≠ ∅ do
 7:   segmentFound ← false
 8:   stop ← false
 9:   c, d ← stack.top()
10:   stack.pop()
11:   p ← previousPixel(c, d)
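The listing above is only the setup of the segment growing routine: ELSED's full algorithm also consults the quantized orientations O and pushes new walking directions onto the stack. As a hedged illustration of the core idea — connecting gradient-aligned pixels across small discontinuities — here is a simplified Python sketch; the function name, thresholds, and the straight-line walk are our own assumptions, not the thesis' implementation.

```python
import numpy as np

# Four axis-aligned walking directions, mirroring d0 ∈ {U, D, R, L}.
DIRS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def grow_chain(anchor, d0, G, visited, grad_thresh=8, max_gap=2):
    """Collect gradient-aligned pixels starting at `anchor`, walking in
    direction d0 and tolerating up to `max_gap` weak pixels in a row."""
    h, w = G.shape
    chain = [anchor]
    gap = 0
    y, x = anchor
    dy, dx = DIRS[d0]
    visited[y, x] = True
    while True:
        y, x = y + dy, x + dx
        if not (0 <= y < h and 0 <= x < w) or visited[y, x]:
            break
        if G[y, x] >= grad_thresh:
            gap = 0                     # back on the edge
            chain.append((y, x))
            visited[y, x] = True
        else:
            gap += 1                    # tolerate a small discontinuity
            if gap > max_gap:
                break
    return chain

# Toy example: a horizontal edge with a one-pixel gap at column 2.
G = np.zeros((5, 5))
G[2] = [10, 10, 0, 10, 10]
visited = np.zeros((5, 5), dtype=bool)
chain = grow_chain((2, 0), "R", G, visited)
print(chain)  # [(2, 0), (2, 1), (2, 3), (2, 4)]
```

The gap counter is what lets the walk bridge short breaks in the gradient response instead of terminating a segment at the first weak pixel.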