Raphaël Troncy scite author profile

Analysis of named entity recognition and linking for tweets

Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art.

LODE: Linking Open Descriptions of Events

Shaw

2009

Abstract. People conventionally refer to an action or occurrence taking place at a certain time at a specific location as an event. This notion is potentially useful for connecting individual facts recorded in the rapidly growing collection of linked data sets and for discovering more complex relationships between data. In this paper, we provide an overview and comparison of existing event models, looking at the different choices they make of how to represent events. We describe a model for publishing records of events as Linked Data. We present tools for populating this model and a prototype "event directory" web service, which can be used to locate stable URIs for events that have occurred, provide RDFS+OWL descriptions and link to related resources.

COMM: Designing a Well-Founded Multimedia Ontology for the Web

et al. 2007

Abstract. Semantic descriptions of non-textual media available on the web can be used to facilitate retrieval and presentation of media assets and documents containing them. While technologies for multimedia semantic descriptions already exist, there is as yet no formal description of a high quality multimedia ontology that is compatible with existing (semantic) web technologies. We explain the complexity of the problem using an annotation scenario. We then derive a number of requirements for specifying a formal multimedia ontology before we present the developed ontology, COMM, and evaluate it with respect to our requirements. We provide an API for generating multimedia annotations that conform to COMM.

The MeMAD Submission to the WMT18 Multimodal Translation Task

Grönroos, Huet, Kurimo et al. 2018

This paper describes the MeMAD project entry to the WMT Multimodal Machine Translation Shared Task. We propose adapting the Transformer neural machine translation (NMT) architecture to a multi-modal setting. In this paper, we also describe the preliminary experiments with text-only translation systems leading us up to this choice. We have the top scoring system for both English-to-German and English-to-French, according to the automatic metrics for flickr18. Our experiments show that the effect of the visual features in our system is small. Our largest gains come from the quality of the underlying text-only NMT system. We find that appropriate use of additional data is effective.
