Jointly Optimizing 3D Model Fitting and Fine-Grained Classification

Lin, Yen-Liang; Morariu, Vlad I.; Hsu, Winston H.; Davis, Larry S.

doi:10.1007/978-3-319-10593-2_31

Cited by 102 publications

(103 citation statements)

References 23 publications

Supporting

Mentioning

100

Contrasting

Unclassified


“…Taniai Benchmark [45] We first evaluated our FCSS descriptor on the Taniai benchmark [45], which consists of 400 image pairs divided into three groups: FG3DCar [29], JODS [37], and PASCAL [20]. As in [45], flow accuracy was measured by computing the proportion of foreground Figure 6.…”

Section: Results (mentioning)

confidence: 99%

FCSS: Fully Convolutional Self-Similarity for Dense Semantic Correspondence

Kim, Min, Ham, et al. 2019

IEEE Trans. Pattern Anal. Mach. Intell.


We present a descriptor, called fully convolutional self-similarity (FCSS), for dense semantic correspondence. To robustly match points among different instances within the same object class, we formulate FCSS using local self-similarity (LSS) within a fully convolutional network. In contrast to existing CNN-based descriptors, FCSS is inherently insensitive to intra-class appearance variations because of its LSS-based structure, while maintaining the precise localization ability of deep neural networks. The sampling patterns of local structure and the self-similarity measure are jointly learned within the proposed network in an end-to-end and multi-scale manner. As training data for semantic correspondence is rather limited, we propose to leverage object candidate priors provided in existing image datasets and also correspondence consistency between object pairs to enable weakly-supervised learning. Experiments demonstrate that FCSS outperforms conventional handcrafted descriptors and CNN-based descriptors on various benchmarks.
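To make the local self-similarity idea concrete, the following is a minimal hand-crafted sketch in plain Python. It is not the FCSS network: FCSS learns its sampling patterns and similarity measure, whereas here the offsets, the patch radius, the `bandwidth` parameter and the function names are all illustrative assumptions. Each descriptor entry compares the patch around a centre pixel with a patch at a fixed offset, using a sum of squared differences mapped through an exponential.

```python
import math


def patch(image, row, col, radius):
    """Flatten the (2*radius+1)^2 patch centred on (row, col)."""
    return [
        image[r][c]
        for r in range(row - radius, row + radius + 1)
        for c in range(col - radius, col + radius + 1)
    ]


def local_self_similarity(image, row, col, offsets, radius=1, bandwidth=1.0):
    """Similarity of the centre patch to patches at the given offsets.

    Hypothetical hand-crafted variant: similarity = exp(-SSD / bandwidth).
    The caller must keep every sampled patch inside the image.
    """
    centre = patch(image, row, col, radius)
    descriptor = []
    for d_row, d_col in offsets:
        other = patch(image, row + d_row, col + d_col, radius)
        ssd = sum((a - b) ** 2 for a, b in zip(centre, other))
        descriptor.append(math.exp(-ssd / bandwidth))
    return descriptor


# Fixed sampling pattern (assumed): up, down, left, right by two pixels.
offsets = [(-2, 0), (2, 0), (0, -2), (0, 2)]

# A constant image: every patch equals the centre patch.
flat = [[5.0] * 7 for _ in range(7)]
flat_descriptor = local_self_similarity(flat, 3, 3, offsets)

# A vertical step edge between columns 3 and 4.
edge = [[0.0] * 4 + [1.0] * 3 for _ in range(7)]
edge_descriptor = local_self_similarity(edge, 3, 3, offsets)
```

On the step-edge image the two vertical offsets see the same structure as the centre, while the horizontal offsets do not, so the descriptor encodes the local layout rather than the absolute intensities. That is the property the abstract relies on for insensitivity to intra-class appearance changes.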



“…However, the results of Zia et al (2013) show that their approach heavily depends on a good pose initialisation. Similarly, Lin et al (2014) recover the 3D vehicle geometry by fitting the 3D ASM to estimated 2D landmark locations resulting from a DPM detector. Their approach also suffers from wrongly estimated part locations resulting from the DPM.…”

Section: Related Work (mentioning)

confidence: 99%

Recovering the 3D Pose and Shape of Vehicles from Stereo Images

Coenen, Rottensteiner, Heipke 2018

ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.


Abstract: The precise reconstruction and pose estimation of vehicles plays an important role, e.g. for autonomous driving. We tackle this problem on the basis of street level stereo images obtained from a moving vehicle. Starting from initial vehicle detections, we use a deformable vehicle shape prior learned from CAD vehicle data to fully reconstruct the vehicles in 3D and to recover their 3D pose and shape. To fit a deformable vehicle model to each detection by inferring the optimal parameters for pose and shape, we define an energy function leveraging reconstructed 3D data, image information, the vehicle model and derived scene knowledge. To minimise the energy function, we apply a robust model fitting procedure based on iterative Monte Carlo model particle sampling. We evaluate our approach using the object detection and orientation estimation benchmark of the KITTI dataset (Geiger et al., 2012). Our approach can deal with very coarse pose initialisations and we achieve encouraging results with up to 82 % correct pose estimations. Moreover, we are able to deliver very precise orientation estimation results with an average absolute error smaller than 4°.
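The iterative Monte Carlo particle sampling mentioned in the abstract can be illustrated with a generic sketch. This is not the authors' procedure: the toy quadratic energy, the three-parameter state (x, y, yaw), and all names and default values below are assumptions chosen only to show the sample, rank, resample loop with a shrinking search radius.

```python
import random


def particle_sampling_minimise(energy, init, sigma=1.0, n_particles=50,
                               n_iters=20, keep=10, shrink=0.7, seed=0):
    """Minimise `energy` by repeatedly sampling particles around survivors.

    Hypothetical sketch: Gaussian perturbations with standard deviation
    `sigma`, which is multiplied by `shrink` after every iteration.
    """
    rng = random.Random(seed)
    survivors = [list(init)]
    for _ in range(n_iters):
        # Carry the survivors over so the best energy can never get worse.
        particles = [list(p) for p in survivors]
        while len(particles) < n_particles:
            parent = rng.choice(survivors)
            particles.append([v + rng.gauss(0.0, sigma) for v in parent])
        particles.sort(key=energy)
        survivors = particles[:keep]
        sigma *= shrink
    return survivors[0]


def toy_energy(params):
    """Stand-in energy with its minimum at x=2, y=-1, yaw=0.5 (assumed)."""
    x, y, yaw = params
    return (x - 2.0) ** 2 + (y + 1.0) ** 2 + (yaw - 0.5) ** 2


start = [0.0, 0.0, 0.0]  # deliberately coarse initialisation
estimate = particle_sampling_minimise(toy_energy, init=start)
```

Because sampling needs only energy evaluations and no gradients, a scheme of this kind tolerates non-smooth energy terms and coarse initialisations, at the price of many energy evaluations per iteration.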


“…Top: A series of prior 3D shape basis [2]. Bottom: The shape estimation procedure for a given input image.…”

Section: Input Image Landmarks Localisation (mentioning)

confidence: 99%

Robust 3D Car Shape Estimation from Landmarks in Monocular Image

Miao, Tao, Lu 2016

Proceedings of the British Machine Vision Conference 2016


Figure 1: The framework for 3D shape estimation (panels: Input Image, Landmarks Localisation, Model Fitting, Estimated 3D Shape). Top: A series of prior 3D shape bases [2]. Bottom: The shape estimation procedure for a given input image.

Estimation of the 3D shape of an object from a monocular image is an under-determined problem, which becomes harder when the observations are severely contaminated. In this paper, we propose a robust model to estimate the 3D shape $X$ from 2D landmarks $x \in \mathbb{R}^{2 \times p}$ with unknown camera pose $M$. The 3D shape of the object is assumed to be a linear combination of predefined shape bases. To estimate $s$ and $M$, we fit the model by minimizing the error between the observations $x$ and the projected model points $MX$ (as shown in Figure 1).

Model. To address the outliers in the observed 2D points, which result from complex background and illumination conditions, we propose a robust 3D shape estimation model. We explicitly model the outliers with an additional sparse error term $E \in \mathbb{R}^{2 \times p}$. The robust model is then formulated as (1), where $t = [t_x, t_y]^T \cdot \mathbf{1}_{1 \times p}$ is the translation, $\lambda$ and $\eta$ are the regularization parameters, and $\mu$ is the mean shape. The objective function in (1) is non-convex and non-smooth, and is constrained on the Stiefel manifold; the coupling of the unknown shape representation coefficients $s$ and the camera pose $M$ makes it more difficult to solve.

Method. We propose an efficient numerical algorithm based on the Alternating Direction Method of Multipliers (ADMM) [1] to solve this problem. With an auxiliary variable $V \in \mathbb{R}^{2 \times 3}$ introduced, we form the augmented Lagrangian, where $\Lambda$ is the multiplier and $\tau$ is the penalty parameter. We update each block with all the others fixed. Based on analysis of ADMM for non-convex optimization [3], we place the orthogonality constraints in the smooth sub-problem (the $V$-minimization). The closed-form solution is given by $V^{k+1} = U I_{2 \times 3} W^T$, where $U$ and … The other sub-problems can be easily solved. Both the optimization of $M$ and that of $t$ admit closed-form solutions. The update of $s$ is a Lasso problem, and the sparse error pattern $E$ can be efficiently solved by element-wise soft-thresholding. The convergence of ADMM with more than two blocks cannot always be guaranteed [1], and may be influenced by the update ordering. We set a fixed update ordering that always led to convergence in our experiments.
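The element-wise soft-thresholding step used for the sparse error term can be sketched in a few lines of plain Python. This shows only the standard proximal operator of the l1 norm, not the full ADMM solver from the paper; the function names, the example residual and the threshold value `tau` are illustrative assumptions.

```python
def soft_threshold(value, tau):
    """Shrink `value` towards zero by `tau`; values within [-tau, tau] become 0."""
    if value > tau:
        return value - tau
    if value < -tau:
        return value + tau
    return 0.0


def soft_threshold_matrix(residual, tau):
    """Apply soft-thresholding to every entry of a matrix given as nested lists."""
    return [[soft_threshold(v, tau) for v in row] for row in residual]


# Assumed 2 x p residual between observed and projected landmarks (p = 3).
# Two entries are gross outliers; the rest is small noise.
residual = [
    [0.05, -0.02, 3.0],
    [0.01, -2.5, 0.03],
]
sparse_error = soft_threshold_matrix(residual, tau=0.1)
```

Small residuals are set exactly to zero while large ones are kept, reduced by `tau`, so the sparse error term absorbs only the landmarks that behave like outliers. In an ADMM scheme the effective threshold would typically depend on the regularization weight and the penalty parameter.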

