Marilena Oita scite author profile

Marilena Oita

4 Publications

10 Citation Statements Received

55 Citation Statements Given


Affiliations

Novartis (Switzerland), Dalle Molle Institute for Artificial Intelligence Research, University of Applied Sciences and Arts of Southern Switzerland

Publications


Cross-Fertilizing Deep Web Analysis and Ontology Enrichment

Oita, Amarilli, Senellart

2017

Preprint


Deep Web databases, whose content is presented as dynamically generated Web pages hidden behind forms, have mostly been left unindexed by search engine crawlers. In order to automatically explore this mass of information, many current techniques assume the existence of domain knowledge, which is costly to create and maintain. In this article, we present a new perspective on form understanding and deep Web data acquisition that does not require any domain-specific knowledge. Unlike previous approaches, we do not perform the various steps in the process (e.g., form understanding, record identification, attribute labeling) independently but integrate them to achieve a more complete understanding of deep Web sources. Through information extraction techniques and using the form itself for validation, we reconcile input and output schemas in a labeled graph which is further aligned with a generic ontology. The impact of this alignment is threefold: first, the resulting semantic infrastructure associated with the form can assist Web crawlers when probing the form for content indexing; second, attributes of response pages are labeled by matching known ontology instances, and relations between attributes are uncovered; and third, we enrich the generic ontology with facts from the deep Web.


Semantically Corroborating Neural Attention for Biomedical Question Answering

Oita, Vani, Oezdemir-Zaech

2020


Reverse Engineering Creativity into Interpretable Neural Networks

Oita

2019


Forest

Oita, Senellart

2015


Content-intensive websites, e.g., of blogs or news, present pages that contain Web articles automatically generated by content management systems. Identification and extraction of their main content is critical in many applications, such as indexing or classification. We present a novel unsupervised approach for the extraction of Web articles from dynamically-generated Web pages. Our system, called FOREST, combines structural and information-based features to target the main content generated by a Web source, and published in associated Web pages. We extensively evaluate FOREST with respect to various baselines and datasets, and report improved results over state-of-the-art techniques in content extraction.


