Real estate represents a major share of economic activity and wealth in all economies. However, because widely acknowledged standards are lacking, structuring, providing and managing building documentation that covers the entire life cycle remains challenging. Based on an empirical analysis of 8965 digital documents from 14 properties of 8 different owners, the article presents a model that will unify existing approaches and lead to the development of a document classification standard. This provides the basis for software systems to process relevant data and create timely information over the entire life cycle of a building. Further, it is shown that automated information extraction through artificial intelligence will become instrumental for enhanced and innovative business models and products in real estate, such as automated data validation and data evaluation, documentation review, benchmarking and other analytical applications.
Purpose: This research provides fundamentals for generating (partially) automated standardized due diligence reports. Based on original digital building documents from (institutional) investors, the potential for automated information extraction through machine learning algorithms is demonstrated. Preferred sources for key information of technical due diligence reports are presented. The paper concludes with challenges for automated information extraction in due diligence processes.
Design/methodology/approach: The comprehensive building documentation, including n = 8,339 digital documents of 14 properties and 21 technical due diligence reports, serves as a basis for identifying key information. To structure documents for due diligence, 410 document classes are derived and documents are checked in principle for machine readability. General rules are developed for prioritized document classes according to relevance and machine readability of documents.
Findings: The analysis reveals that a substantial part of all relevant digital building documents is poorly suited for automated information extraction. The availability and content of documents vary greatly from owner to owner and between document classes. The prioritization of document classes according to machine readability reveals potential for using artificial intelligence in due diligence processes.
Practical implications: The paper includes recommendations for improving the machine readability of documents and indicates the potential for (partially) automating due diligence processes. To this end, document classes are derived, reviewed and prioritized. Transaction risks can be countered by an automated check for completeness of relevant documents.
Originality/value: This paper is the first published (empirical) research to specifically assess the automated digital processing of due diligence reports. The findings are helpful for improving due diligence processes and, more generally, promoting the use of machine learning in the property sector.