2022
DOI: 10.1145/3531533

Design and Implementation of a Historical German Firm-level Financial Database

Abstract: Broad, long-term financial and economic datasets are scarce, particularly in the European context. In this paper, we present an approach for an extensible data model that is adaptable to future changes in technologies and sources. This model may constitute a basis for digitized and structured long-term historical datasets for different jurisdictions and periods. The data model covers the specific peculiarities of historical financial and economic data and is flexible enough to reach out for data of …

citations
Cited by 2 publications
(2 citation statements)
references
References 39 publications
“…This use case refers to the matching of entities from repeated annual cross-sections of firm data extracted via OCR from historical German yearbooks. Gram et al. (2022) describe the data extraction process, detail the underlying relational data model, and provide summary statistics for a subset of the fields extracted for the period 1920-1932. The database is implemented according to the FAIR principles and will be made fully available to the public in the upcoming years.…”
Section: A Use Case: OCR-extracted Historical Firm-level Data
Citation type: mentioning
confidence: 99%
“…Finally, we present an application of our matching framework in a domain with dirty firm-level financial data that we extracted from historical archives by using Optical Character Recognition (OCR) software (Kamlah et al., 2022). The data represent German firms operating in the period from 1910 to 1919 with non-harmonized and non-standardized attributes extracted from the "Handbuch der deutschen Aktiengesellschaften" (see also Gram et al., 2022). In a 5-fold cross-validation with 30% train and 70% test random sample splits, our framework achieves an average 99.36 F-score in the test sub-sample.…”
Section: Introduction
Citation type: mentioning
confidence: 99%