Jun Harashima scite author profile

Jun Harashima

5 Publications

9 Citation Statements Received

43 Citation Statements Given

Affiliations

Kyoto University, Tokyo Metropolitan University, Shibuya (Japan)

Publications

Two-step validation in character-based ingredient normalization

Harashima, Yamada

2018

Although ingredients are important items of information in recipes, they are difficult for computers to process because they are user-generated informal text. To normalize ingredients, we can use a character-based encoder-decoder model that takes the character sequence of an ingredient as input and outputs its canonical form. However, the model still has two problems. First, it often generates unnatural sequences as outputs. Second, the generated sequences are sometimes unrelated to the original ingredient. Therefore, we propose a two-step validation to generate better normalizations. In the first validation step, we use a trie to limit the normalization candidates to existing sequences. In the second validation step, we rerank the normalization candidates based on their similarity to the original ingredient. We conducted experiments using a corpus of approximately 10,000 pairs of ingredients and their canonical forms and showed that our proposed validation improved the performance of encoder-decoder models.
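The two validation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the candidate list stands in for the output of a hypothetical encoder-decoder, the class and function names are invented here, and `difflib.SequenceMatcher` is an assumed similarity measure, since the abstract does not say which one the paper uses.

```python
from difflib import SequenceMatcher

_END = None  # dictionary key marking the end of a stored name


class Trie:
    """Character trie over the known canonical ingredient names."""

    def __init__(self, names):
        self.root = {}
        for name in names:
            node = self.root
            for ch in name:
                node = node.setdefault(ch, {})
            node[_END] = True

    def _walk(self, prefix):
        node = self.root
        for ch in prefix:
            node = node.get(ch)
            if node is None:
                return None
        return node

    def contains(self, name):
        node = self._walk(name)
        return node is not None and _END in node

    def allowed_next(self, prefix):
        """Characters a decoder could emit after `prefix` and stay inside the trie."""
        node = self._walk(prefix)
        if node is None:
            return set()
        return {ch for ch in node if ch is not _END}


def two_step_validate(original, candidates, trie):
    # Step 1: keep only candidates that are existing canonical names.
    existing = [c for c in candidates if trie.contains(c)]
    # Step 2: rerank by similarity to the original ingredient (assumed measure).
    return sorted(
        existing,
        key=lambda c: SequenceMatcher(None, original, c).ratio(),
        reverse=True,
    )


trie = Trie(["sesame oil", "soy sauce", "salt"])
# Hypothetical decoder outputs, including one unnatural sequence.
candidates = ["sesame oli", "soy sauce", "sesame oil"]
ranked = two_step_validate("sesame oil (for finishing)", candidates, trie)
print(ranked)
```

In this sketch the trie is applied as a filter after generation; `allowed_next` shows how the same structure could instead restrict each decoding step to characters that continue an existing name.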

Step or Not: Discriminator for The Real Instructions in User-generated Recipes

Inuzuka, Ito, Harashima

2018

Calorie Estimation in a Real-World Recipe Service

Harashima, Hiramatsu, Sanjo

2020

AAAI

Cooking recipes play an important role in promoting a healthy lifestyle, and a vast number of user-generated recipes are currently available on the Internet. Allied to this growth in the amount of information is an increase in the number of studies on the use of such data for recipe analysis, recipe generation, and recipe search. However, there have been few attempts to estimate the number of calories per serving in a recipe. This study considers this task and introduces two challenging subtasks: ingredient normalization and serving estimation. The ingredient normalization task aims to convert the ingredients written in a recipe (e.g., a Japanese entry that says “sesame oil (for finishing)”) into their canonical forms (e.g., sesame oil) so that their calorific content can be looked up in an ingredient dictionary. The serving estimation task aims to convert the amount written in the recipe (e.g., N pieces) into the number of servings (e.g., M people), thus enabling the calories per serving to be calculated. We apply machine learning-based methods to these tasks and describe their practical deployment in Cookpad, the largest recipe service in the world. A series of experiments demonstrates that the performance of our methods is sufficient for use in real-world services.
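The arithmetic behind calories per serving can be illustrated with a small sketch. Everything here is an assumption made for illustration: the calorie values are invented, the lookup table `CANONICAL` stands in for the learned ingredient normalizer, and `estimate_servings` simply reads a number from a yield that is already stated in people, whereas the paper treats both subtasks with machine learning.

```python
import re

# Illustrative values only (kcal per 100 g); not taken from the paper
# or from any real ingredient dictionary.
KCAL_PER_100G = {"sesame oil": 900.0, "rice": 350.0}

# Stand-in for the learned normalizer: a hand-written lookup table.
CANONICAL = {
    "sesame oil (for finishing)": "sesame oil",
    "rice": "rice",
}


def normalize(ingredient):
    """Map a raw ingredient string to its canonical form, or None if unknown."""
    return CANONICAL.get(ingredient.strip().lower())


def estimate_servings(yield_text):
    """Stand-in for serving estimation: read the first integer, default to 1."""
    match = re.search(r"\d+", yield_text)
    return int(match.group()) if match else 1


def calories_per_serving(ingredients, yield_text):
    """ingredients is a list of (raw name, grams) pairs."""
    total = 0.0
    for name, grams in ingredients:
        canonical = normalize(name)
        if canonical is None:
            continue  # skip entries that cannot be looked up
        total += KCAL_PER_100G[canonical] * grams / 100.0
    return total / estimate_servings(yield_text)


recipe = [("Sesame oil (for finishing)", 10), ("rice", 300)]
print(calories_per_serving(recipe, "2 people"))  # 570.0
```

The example shows why both subtasks matter: without normalization the dictionary lookup fails, and without a serving count the total cannot be divided.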

Relevance Feedback using Surface and Latent Information in Texts

Harashima, Kurohashi

2014

Journal of Natural Language Processing

Most relevance feedback methods re-rank search results using only the information of surface words in texts. We present a method that uses not only the information of surface words but also that of latent words that are inferred from texts. We infer the latent word distribution in each document in the search results using latent Dirichlet allocation (LDA). When feedback is given, we also infer the latent word distribution in the feedback using LDA. We calculate the similarities between the user feedback and each document in the search results using both the surface and latent word distributions and re-rank the search results on the basis of the similarities. Evaluation results show that when user feedback consisting of two documents (3,589 words) is given, the proposed method improves the initial search results by 27.6% in precision at 10 (P@10). The results also show that the proposed method can perform well even when only a small amount of user feedback is available. For example, an improvement of 5.3% in P@10 was achieved when the user feedback consisted of only 57 words.
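A rough sketch of re-ranking with both kinds of distribution, together with the P@k metric, might look like the following. It assumes the surface and latent word distributions have already been computed (for example by an LDA implementation) and are given as plain dictionaries. The cosine similarity, the interpolation weight `alpha`, and all names are assumptions made for illustration; the abstract does not specify how the two similarities are combined.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse word distributions (dicts)."""
    dot = sum(weight * q.get(word, 0.0) for word, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def rerank(docs, feedback, alpha=0.5):
    """Sort docs by a weighted mix of surface and latent similarity to the feedback."""
    def score(doc):
        surface = cosine(doc["surface"], feedback["surface"])
        latent = cosine(doc["latent"], feedback["latent"])
        return alpha * surface + (1.0 - alpha) * latent
    return sorted(docs, key=score, reverse=True)


def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids) / k


# Toy data: distributions are hand-written, not inferred by LDA.
feedback = {
    "surface": {"recipe": 0.5, "oil": 0.5},
    "latent": {"cooking": 0.6, "food": 0.4},
}
docs = [
    {"id": "A", "surface": {"car": 1.0}, "latent": {"engine": 1.0}},
    {"id": "B", "surface": {"recipe": 1.0}, "latent": {"cooking": 1.0}},
]
ranked_ids = [doc["id"] for doc in rerank(docs, feedback)]
print(ranked_ids, precision_at_k(ranked_ids, {"B"}, k=2))
```

Document B shares both surface words and latent words with the feedback, so it moves ahead of document A regardless of the value chosen for `alpha`.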

Non-ingredient Detection in User-generated Recipes using the Sequence Tagging Approach

Yamaguchi, Inuzuka, Hiramatsu, et al.

2020

Recently, the number of user-generated recipes on the Internet has increased. In such recipes, users are generally expected to write a title, an ingredient list, and the steps to create a dish. However, some items in the ingredient list of a user-generated recipe are not actually edible ingredients. For example, headings, comments, and kitchenware sometimes appear in an ingredient list because users can write the list freely. Such noise makes it difficult for computers to use recipes for a variety of tasks, such as calorie estimation. To address this issue, we propose a non-ingredient detection method inspired by a neural sequence tagging model. In our experiment, we annotated 6,675 ingredients in 600 user-generated recipes and showed that our proposed method achieved an F1 score of 93.3.
