2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2014.48
Max-Margin Boltzmann Machines for Object Segmentation

Abstract: We present Max-Margin Boltzmann Machines (MMBMs) for object segmentation. MMBMs are essentially a class of Conditional Boltzmann Machines that model the joint distribution of hidden variables and output labels conditioned on input observations. In addition to image-to-label connections, we build direct image-to-hidden connections to facilitate global shape prediction, and thus derive a simple Iterated Conditional Modes algorithm for efficient maximum a posteriori inference. We formulate a max-margin objective …
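The abstract's Iterated Conditional Modes (ICM) inference can be sketched as coordinate-wise MAP updates that alternate between the hidden block and the label block. The following is a minimal illustration only: the weight names (`Wxy`, `Wxh`, `Why`), the bilinear energy form, and all dimensions are assumptions for the sketch, not the paper's exact parameterization.

```python
import numpy as np

# Hypothetical conditional Boltzmann machine with image-to-label (Wxy),
# image-to-hidden (Wxh), and hidden-to-label (Why) connections.
rng = np.random.default_rng(0)
D, K, H = 16, 8, 4                     # image features, label units, hidden units
Wxy = rng.normal(0, 0.1, (D, K))
Wxh = rng.normal(0, 0.1, (D, H))
Why = rng.normal(0, 0.1, (H, K))
by, bh = np.zeros(K), np.zeros(H)

def icm_infer(x, n_iters=10):
    """ICM: repeatedly set each block of binary units to its conditional
    maximizer (sign of its total input) until the labels stop changing."""
    y = (x @ Wxy + by > 0).astype(float)                    # init labels from image
    h = (x @ Wxh + y @ Why.T + bh > 0).astype(float)
    for _ in range(n_iters):
        h = (x @ Wxh + y @ Why.T + bh > 0).astype(float)    # update hidden given labels
        y_new = (x @ Wxy + h @ Why + by > 0).astype(float)  # update labels given hidden
        if np.array_equal(y_new, y):                        # converged to a local MAP
            break
        y = y_new
    return y, h

x = rng.normal(size=D)
y, h = icm_infer(x)
```

Each update can only lower the (assumed) energy, so the loop reaches a local MAP configuration in a few passes, which is what makes ICM cheap compared with sampling-based inference.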

Cited by 19 publications (23 citation statements) · References 23 publications
“…Caltech-UCSD Birds 200 (Welinder et al. 2010) is a dataset of 6,033 bird images at 128 × 128 resolution, split into 3,000 training and 3,033 test images. As a second source, we use the segmentation masks provided by Yang, Safar, and Yang (2014). On this dataset we assess whether learning with multiple modalities can be advantageous in scenarios where we are interested in only one particular modality.…”
Section: Learning and Inference of Shared Representations for Structu…
confidence: 99%
“…The object is treated as a cluster of pixels in the same category instead of as a whole, which usually ignores the overall appearance and shape of the object. Some other models make structured label predictions using RBMs, such as CHOPPS (Li, Tarlow, and Zemel 2013), GLOC (Kae et al. 2013), and MMRBM (Yang, Safar, and Yang 2014). However, whether the object is recognizable is never considered in these models, nor are they capable of recognition.…”
Section: Simultaneous Segmentation and Classification
confidence: 99%
“…For comparison, we evaluate fore- and background segmentation performance with existing RBM-based models, such as GLOC (Kae et al. 2013), CHOPPS (Li, Tarlow, and Zemel 2013), and MMBM (Yang, Safar, and Yang 2014), as well as DeepLab (Chen et al. 2016). We also tested the framework of Fig. 3 implemented with a standard CNN encoder (and the classification layer) as the baseline model, following (Sabour, Frosst, and Hinton 2017).…”
Section: Experiments: Signs and Logos
confidence: 99%
“…We extract color information from the foreground region based on the results of a max-margin segmentation method [30]. As coarse-grained spatial decomposition is likely to cause misalignment due to pose changes, we follow recent re-identification schemes [24, 35] and partition an image into six horizontal stripes.…”
Section: Representation
confidence: 99%
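The stripe-based representation in the last quoted passage can be sketched as splitting an image into six horizontal bands and summarizing each with a color histogram. This is an illustrative sketch only: the function name, bin count, and histogram choice are assumptions, not the cited paper's exact features.

```python
import numpy as np

def stripe_histograms(img, n_stripes=6, bins=8):
    """img: (H, W, 3) uint8 array -> (n_stripes, 3*bins) matrix of
    per-stripe, per-channel color histograms."""
    stripes = np.array_split(img, n_stripes, axis=0)   # six horizontal bands
    feats = []
    for s in stripes:
        # One histogram per color channel, concatenated into a stripe feature.
        hs = [np.histogram(s[..., c], bins=bins, range=(0, 256))[0]
              for c in range(3)]
        feats.append(np.concatenate(hs))
    return np.stack(feats).astype(float)

img = np.zeros((128, 64, 3), dtype=np.uint8)           # dummy black image
F = stripe_histograms(img)
```

Using fixed horizontal bands rather than a fine grid is exactly the trade-off the quote describes: coarser regions tolerate pose-induced misalignment at the cost of spatial detail.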