In many documents, such as receipts or invoices, textual information is constrained by the space and organization of the document. The text has no natural language context, and expressions are often abbreviated, at both word level and phrase level, to fit the graphical layout. To analyze the semantic content of these types of documents, we need to understand each phrase, and in particular the name of each product sold. In this paper, we propose an approach to find the right expansion of abbreviations and acronyms without context. First, we extract information about sold products from our receipts corpus and analyze the different linguistic processes of abbreviation. Then, we retrieve a list of expanded names of the products sold by the company that issued the receipts, and we propose an algorithm to pair the extracted product names with the corresponding expansions. We provide the research community with a unique document collection for abbreviation expansion.
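The pairing step described above can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, whose details are not given here. It assumes that an abbreviated token keeps the first letter of the full word and preserves letter order, and that each abbreviated token corresponds to exactly one word of the expansion. The function names, the scoring rule, and the sample catalog are all hypothetical.

```python
from typing import Optional, Sequence


def is_ordered_abbreviation(abbrev: str, word: str) -> bool:
    """True if abbrev starts with the same letter as word and its letters
    appear in word in the same order (assumed abbreviation model)."""
    abbrev, word = abbrev.lower(), word.lower()
    if not abbrev or not word or abbrev[0] != word[0]:
        return False
    remaining = iter(word)
    # `ch in remaining` consumes the iterator, which enforces letter order.
    return all(ch in remaining for ch in abbrev)


def match_score(abbrev_name: str, expansion: str) -> float:
    """Score in [0, 1]: 0 if any token fails to match, otherwise the mean
    fraction of each expansion word's letters kept by its abbreviation."""
    a_tokens = abbrev_name.split()
    e_tokens = expansion.split()
    if not a_tokens or len(a_tokens) != len(e_tokens):
        return 0.0
    ratios = []
    for a, e in zip(a_tokens, e_tokens):
        if not is_ordered_abbreviation(a, e):
            return 0.0
        ratios.append(len(a) / len(e))
    return sum(ratios) / len(ratios)


def best_expansion(abbrev_name: str, catalog: Sequence[str]) -> Optional[str]:
    """Return the highest-scoring catalog entry, or None if nothing matches."""
    best, best_score = None, 0.0
    for candidate in catalog:
        score = match_score(abbrev_name, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best


# Hypothetical catalog of expanded product names.
CATALOG = ["organic whole milk", "orange juice", "dark chocolate"]

if __name__ == "__main__":
    print(best_expansion("org whl mlk", CATALOG))
    print(best_expansion("drk choc", CATALOG))
```

A real receipt corpus would also need to handle acronyms, dropped words, and several candidates with equal scores, none of which this sketch covers.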
The eighth edition of the Forum Jeunes Chercheurs of the INFORSID congress was held in 2016 in Grenoble, France. It hosted 19 first-year or second-year PhD students, selected from among 32 candidates, all conducting research in the field of information systems. This article, coordinated by Cécile Favre (in charge of organizing the Forum), presents a selection of the four best contributions to the forum.