Yunosuke Higashi scite author profile

Yunosuke Higashi

4 Publications

14 Citation Statements Received

82 Citation Statements Given


Affiliations

Japan Research Institute, Wakayama University

Publications


Hierarchical Clustering of OSS License Statements toward Automatic Generation of License Rules

Higashi, Ohira, Kashiwa, et al. 2019

Journal of Information Processing


Reusing open source software (OSS) components in one's own software products has become common in modern software development. Automated license identification tools have been proposed to help developers identify OSS licenses, since a large number of licenses must sometimes be checked before components can be reused. Of the existing tools, Ninka [1] can most correctly identify the license of each source file by using regular expressions. When Ninka has no identification rule for a license, it reports the license as an "unknown license", which developers must check manually. Since completely new or derived OSS licenses appear nearly every year, a license identification tool should be maintained by adding regular expressions corresponding to the new licenses. The final goal of our study is to construct a method to automatically create candidate license rules to be added to a license identification tool such as Ninka. Toward achieving this goal, files identified as unknown licenses must first be classified by license. In this paper, we propose a hierarchical clustering method that divides unknown licenses into clusters of files with a single license. We conduct a case study to confirm the usefulness of our clustering method when it is applied to classify 2,801, 1,230 and 2,446 unknown license statement files from Linux Kernel v4.4.6, FreeBSD v10.3.0 and Debian v7.8.0, respectively. As a result, we confirm that our method can create clusters that are suitable as candidates for generating license rules automatically.
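The abstract above does not say which similarity measure or linkage the authors use, so the following is only a minimal sketch of hierarchical (agglomerative) clustering of license statements under stated assumptions: word-set Jaccard distance, single linkage, and an arbitrary cut-off of 0.5. The names `tokens`, `jaccard_distance`, and `cluster` are hypothetical and are not taken from the paper or from Ninka.

```python
import re
from itertools import combinations


def tokens(text):
    """Return the set of lowercase word tokens in a license statement."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster(statements, threshold=0.5):
    """Single-linkage agglomerative clustering (assumed, illustrative setup).

    Repeatedly merges the two closest clusters until the smallest
    inter-cluster distance exceeds `threshold`. Returns a sorted list of
    clusters, each a sorted list of indices into `statements`.
    """
    token_sets = [tokens(s) for s in statements]
    clusters = [[i] for i in range(len(statements))]

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(
            jaccard_distance(token_sets[i], token_sets[j])
            for i in c1
            for j in c2
        )

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (x, y), dist = min(
            (
                ((x, y), linkage(clusters[x], clusters[y]))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)
```

Called on a list of unknown license statements, `cluster` returns groups of indices; each group would then be a candidate set of files that share a single license. This brute-force loop is only suitable for small inputs, and a real study at the scale of thousands of files would need a more efficient implementation.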


Clustering OSS License Statements toward Automatic Generation of License Rules

Higashi, Manabe, Ohira 2016


Reusing open source software (OSS) components in one's own software products has become common in modern software development. Automated license identification tools have been proposed to help developers identify OSS licenses, since a large number of licenses must sometimes be checked before components can be reused. Of the existing tools, Ninka [1] can most correctly identify the license of each source file by using regular expressions. When Ninka has no identification rule for a license, it reports the license as an "unknown license", which developers must check manually. Since completely new or derived OSS licenses appear nearly every year, a license identification tool should be maintained by adding regular expressions corresponding to the new licenses. The final goal of our study is to construct a method to automatically create candidate license rules to be added to a license identification tool such as Ninka. Toward achieving this goal, files identified as unknown licenses must first be classified by license. In this paper, we propose a hierarchical clustering method that divides unknown licenses into clusters of files with a single license. We conduct a case study to confirm the usefulness of our clustering method when it is applied to classify 2,838 unknown license files from Debian v7.8.0. As a result, we confirm that our method can create clusters that are suitable as candidates for generating license rules automatically.


Automating License Rule Generation to Help Maintain Rule-based OSS License Identification Tools

Higashi, Ohira, Manabe 2023

Journal of Information Processing


Many license identification tools have been proposed to support OSS reuse. License identification tools automatically identify OSS licenses declared in source files. Ninka is one of the most accurate license identification tools. Because OSS licenses are often newly created or inherited, rules built into license identification tools need to be created and updated on a regular basis. However, when a large number of unknown licenses are detected in large OSS products, it is not easy to manually create new rules. In our previous studies, we proposed a method for clustering license statements that Ninka determined to be unknown. In this paper, we propose a method to automatically generate license rules from the clustered license statements. Our approach further filters the license statements from the created clusters to extract sequential patterns and converts the extracted patterns into regular expressions. We conducted a case study in which our method was applied to 1,821, 3,561 and 2,838 unknown license statement files collected from FreeBSD v10.3.0, Linux Kernel v4.4.6, and Debian v7.8.0, respectively, to confirm the usefulness of our method. As a result, we confirmed that our method successfully generated license rules that take orthographical variants into consideration and that it efficiently identified licenses with a small number of license rules. Furthermore, we found that adding the license rules generated by our method to Ninka improves the licensing rule performance.
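The abstract above names two steps, extracting sequential patterns from a cluster and converting them into regular expressions, without giving details. As a rough illustration only, the sketch below assumes that the "pattern" is a word sequence shared in order by every statement in a cluster (found with Python's `difflib`), and that the rule tolerates extra words and punctuation between pattern words. The function names are hypothetical, and neither the pattern-mining approach nor the regex shape is claimed to match the authors' method or Ninka's rule format.

```python
import re
from difflib import SequenceMatcher
from functools import reduce


def words(text):
    """Lowercase word tokens of a statement, in order."""
    return re.findall(r"\w+", text.lower())


def common_subsequence(a, b):
    """A word sequence that appears in order in both a and b.

    Built from difflib's matching blocks, so it is a common subsequence
    but not guaranteed to be the longest one.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    result = []
    for block in matcher.get_matching_blocks():
        result.extend(a[block.a:block.a + block.size])
    return result


def rule_from_cluster(statements):
    """Compile a regex matching the words shared by all statements.

    Between consecutive pattern words, the rule allows punctuation and
    any number of intervening words (an assumed form of tolerance).
    """
    pattern_words = reduce(common_subsequence, (words(s) for s in statements))
    if not pattern_words:
        raise ValueError("statements share no common word sequence")
    gap = r"\W+(?:\w+\W+)*?"  # separators plus optional extra words
    body = gap.join(re.escape(w) for w in pattern_words)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)
```

For example, `rule_from_cluster` applied to two spelling variants of the same permission notice yields one expression that matches both, which is the kind of tolerance to orthographical variants the abstract describes. Very permissive gaps can also cause false matches, so a practical rule generator would need stricter filtering than this sketch.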


A Preliminary Analysis of GPL-Related License Violations in Docker Images

Higashi, Fukui, Kashiwa, et al. 2022
