We present eBDtheque, a database of comic book images together with ground truth for panels, balloons and text lines, as well as semantic annotations. The database consists of one hundred pages drawn from a variety of comic book albums: Franco-Belgian albums, American comics and manga. Additionally, we present the software used to establish the ground truth and a tool to validate detection results against it. Everything is publicly available for scientific use at http://ebdtheque.univ-lr.fr.
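Although the abstract does not specify the annotation format, validating results against such ground truth essentially comes down to matching detected regions to annotated ones. The sketch below is a minimal illustration, assuming SVG-style polygon annotations tagged with a class attribute and an IoU-based matching criterion; the file layout, the "Panel" class name and the 0.5 threshold are our assumptions, not the specification of the released validation tool.

```python
# Hypothetical sketch: match detected panel boxes against ground-truth
# polygons using an IoU threshold. The SVG layout and the "class"
# attribute used to tag panels are assumptions; adapt the parsing
# to the actual ground-truth files.
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

def load_polygons(svg_path, wanted_class="Panel"):
    """Return a list of [(x, y), ...] polygons whose class matches."""
    root = ET.parse(svg_path).getroot()
    polygons = []
    for poly in root.iter(SVG_NS + "polygon"):
        if poly.get("class") != wanted_class:      # assumed tagging scheme
            continue
        pts = [tuple(map(float, p.split(",")))
               for p in poly.get("points").split()]
        polygons.append(pts)
    return polygons

def bbox(points):
    """Axis-aligned bounding box of a polygon."""
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter) if inter else 0.0

def evaluate(detected_boxes, gt_polygons, threshold=0.5):
    """Greedy one-to-one matching; returns (precision, recall)."""
    gt_boxes = [bbox(p) for p in gt_polygons]
    matched, tp = set(), 0
    for d in detected_boxes:
        candidates = [i for i in range(len(gt_boxes)) if i not in matched]
        if not candidates:
            break
        best = max(candidates, key=lambda i: iou(d, gt_boxes[i]))
        if iou(d, gt_boxes[best]) >= threshold:
            matched.add(best)
            tp += 1
    precision = tp / len(detected_boxes) if detected_boxes else 0.0
    recall = tp / len(gt_boxes) if gt_boxes else 0.0
    return precision, recall
```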
To reduce the gap between pixel data and thesaurus semantics, this paper presents a novel approach based on a mapping between two ontologies built on images of drop capitals (also called dropcaps or lettrines). In the first ontology, each dropcap image is endowed with semantic information describing its content; it is generated from a database of lettrine images, the Ornamental Letter Images DataBase, manually populated by historians with dropcap image annotations. For the second ontology, we have developed image processing algorithms that extract image regions on the basis of a number of features. These features, together with the spatial relations among regions, form the basis of the ontology. The ontologies are then enriched with inference rules that annotate certain regions so that their semantics can be deduced automatically. In this article, the method is presented together with preliminary experimental results and an illustrative example.
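As a rough illustration of what such an inference rule might look like once regions and their features have been extracted, the following sketch labels a segmented region from hand-written conditions. The feature names (area_ratio, ink_density, elongation), the thresholds and the output labels are hypothetical stand-ins, not the vocabulary of the ontologies described in the paper.

```python
# Illustrative rule-based annotation of one segmented dropcap region.
# Features and thresholds are assumptions chosen to mimic the kind of
# inference rule described in the abstract.

def annotate_region(region, relations):
    """Attach a semantic label to one region.

    `region` holds low-level features produced by image processing;
    `relations` lists (relation, other_label) pairs such as
    ("surrounded_by", "background").
    """
    if region["touches_border"] and region["ink_density"] < 0.15:
        return "background"          # mostly blank area framing the dropcap
    if region["area_ratio"] > 0.25 and ("surrounded_by", "background") in relations:
        return "letter"              # large central shape inside the frame
    if region["elongation"] > 3.0:
        return "ornament_stroke"     # thin decorative element
    return "unknown"

# Hypothetical region produced by a segmentation step.
letter_candidate = {
    "area_ratio": 0.32,       # fraction of the dropcap image covered
    "touches_border": False,
    "ink_density": 0.6,       # fraction of dark pixels inside the region
    "elongation": 1.1,        # major / minor axis ratio
}
print(annotate_region(letter_candidate, [("surrounded_by", "background")]))
# -> "letter"
```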
A huge number of historical documents have been digitized over the last ten years. These collections can be browsed using keyword-based queries or query-by-example systems, and moving from one kind of query to the other raises the problem of the semantic gap. To address this problem, this paper presents an ontology-based approach that uses inference rules on historical images. Historians' knowledge and knowledge from the document processing domain were modeled using dedicated ontologies. Links between the regions of interest produced by the computer vision algorithms on the one hand, and their meaning on the other hand, were then created automatically. These links will subsequently be used to help historians retrieve similar images. Based on the three ontologies defined and combined in this approach, we have defined rules to automatically annotate an image (to define the background, for example) or a part of an image (to identify a letter, a body part, …).
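To make the linking step more concrete, here is a minimal sketch using rdflib, assuming the regions of interest are expressed as RDF individuals. The namespaces, the class names (img:Region, sem:Background) and the single SPARQL rule are placeholders for illustration; they do not reproduce the three ontologies or the rules defined in the paper.

```python
# Minimal sketch of linking a low-level image region to a domain concept
# via an inference rule expressed as a SPARQL CONSTRUCT query.
from rdflib import Graph, Namespace, Literal, RDF, URIRef

IMG = Namespace("http://example.org/image#")     # document-analysis ontology (assumed)
SEM = Namespace("http://example.org/semantic#")  # historians' domain ontology (assumed)

g = Graph()
region = URIRef("http://example.org/page12/region3")
g.add((region, RDF.type, IMG.Region))
g.add((region, IMG.touchesBorder, Literal(True)))
g.add((region, IMG.inkDensity, Literal(0.08)))

# Rule: a sparse region touching the image border is annotated as the background.
RULE = """
CONSTRUCT { ?r a sem:Background . }
WHERE {
  ?r a img:Region ;
     img:touchesBorder true ;
     img:inkDensity ?d .
  FILTER (?d < 0.15)
}
"""
inferred = g.query(RULE, initNs={"img": IMG, "sem": SEM})
for triple in inferred:
    g.add(triple)

print((region, RDF.type, SEM.Background) in g)   # True: the link was created
```

The CONSTRUCT query plays the role of one inference rule: it materializes a new typing triple, which is the kind of link between an automatically extracted region and its domain meaning that the approach relies on.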