2022
DOI: 10.48550/arxiv.2204.00256
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

A Large-scale Dataset of (Open Source) License Text Variants

Stefano Zacchiroli

Abstract: We introduce a large-scale dataset of the complete texts of free/open source software (FOSS) license variants. To assemble it we have collected from the Software Heritage archive-the largest publicly available archive of FOSS source code with accompanying development history-all versions of files whose names are commonly used to convey licensing terms to software users and developers.The dataset consists of 6.5 million unique license files that can be used to conduct empirical studies on open source licensing,… Show more

Help me understand this report
View published versions

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

citations
Cited by 0 publications
references
References 21 publications
0
0
0
Order By: Relevance

No citations

Set email alert for when this publication receives citations?