In this paper, a novel framework for multimodal search and retrieval of rich media objects is presented. The searchable items are media representations consisting of multiple modalities, such as 2D images, 3D objects and audio files, which share a common semantic concept. A manifold learning technique based on Laplacian Eigenmaps is appropriately modified to merge the low-level descriptors of each separate modality and create a new low-dimensional multimodal feature space, onto which all media objects can be mapped irrespective of their constituent modalities. To accelerate search and retrieval and make the framework suitable even for web-scale applications, a multimedia indexing scheme is adopted that represents each object of the dataset by the ordering of a number of reference objects. Moreover, the hubness property is introduced in this paper as a criterion for selecting the most representative reference objects, thereby maximizing indexing performance. The content-based similarity of the multimodal descriptors is also used to automatically annotate the objects of the dataset with a predefined set of attributes, and annotation propagation is employed to approximate the multimodal descriptors of queries that do not belong to the dataset.
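As a rough illustration of the pipeline described above (not the authors' implementation), the following Python sketch fuses normalized per-modality descriptors, embeds them with an off-the-shelf Laplacian Eigenmaps implementation (scikit-learn's SpectralEmbedding, standing in for the paper's modified variant), and builds a simple permutation-based index in which each object is represented by the ordering of a set of reference objects. The descriptor dimensions, the random reference selection (hubness-based in the paper), and the Spearman footrule comparison are illustrative assumptions.

```python
# Minimal sketch under the assumptions stated above; all names and sizes are illustrative.
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)

# Toy low-level descriptors for three modalities of the same 200 objects.
img_desc = rng.normal(size=(200, 64))    # e.g. 2D image features
shape_desc = rng.normal(size=(200, 32))  # e.g. 3D shape features
audio_desc = rng.normal(size=(200, 16))  # e.g. audio features

# Simple fusion: L2-normalize each modality and concatenate
# (the paper's modified Laplacian Eigenmaps merges modalities differently).
X = np.hstack([normalize(img_desc), normalize(shape_desc), normalize(audio_desc)])

# Laplacian Eigenmaps -> low-dimensional multimodal feature space.
embedding = SpectralEmbedding(n_components=10, affinity="nearest_neighbors",
                              n_neighbors=15, random_state=0)
Y = embedding.fit_transform(X)

# Permutation-based indexing: each object is represented by the ordering
# of a small set of reference objects by distance in the embedded space.
n_refs = 20
ref_ids = rng.choice(len(Y), size=n_refs, replace=False)  # random here; hubness-based in the paper
refs = Y[ref_ids]

def permutation(v):
    """Rank of each reference object when ordered by distance to v."""
    d = np.linalg.norm(refs - v, axis=1)
    return np.argsort(np.argsort(d))

perms = np.vstack([permutation(y) for y in Y])

def spearman_footrule(p, q):
    """Distance between two permutations (smaller = more similar)."""
    return np.abs(p - q).sum()

# Query with object 0: rank the dataset by permutation similarity.
q = perms[0]
order = np.argsort([spearman_footrule(q, p) for p in perms])
print("Top-5 matches for object 0:", order[:5])
```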
Driven by the needs of customers and industry, online fashion search and analytics have recently been gaining considerable attention. As fashion is mostly expressed through visual content, the analysis of fashion images in online social networks is a rich source of potential insights into evolving trends and customer preferences. Although a plethora of visual content is available, the modeling of clothes' physics and movement, the implicit semantics of fashion designs, and the subjectivity of their interpretation pose difficulties for fully automated solutions to fashion search and analysis. In this article, we present the design and evaluation of a crowd-powered system for fashion similarity search from Twitter, supporting trend analysis for fashion professionals. The system enables fashion similarity search based on specific human-based similarity criteria. This is achieved by implementing a novel machine–crowd workflow that supports complex tasks requiring highly subjective judgments, where multiple true solutions may coexist. We discuss how this leads to a novel class of crowd-powered systems in which the output of the crowd is not used to verify the automatic analysis but is itself the desired outcome. Finally, we show how this kind of crowd involvement enables a novel kind of similarity search and is a crucial factor in the acceptance of the system's results by the end user.