<p><strong>Abstract.</strong> The paper describes a workflow for generating LoD3 CityGML models (i.e. semantic building models with structured facades) from textured LoD2 CityGML models by adding window and door objects. For each wall texture, bounding boxes of windows and doors are detected using “Faster R-CNN”, a deep neural network. We evaluate results for textures with different resolutions on the ICG Graz50 facade dataset. In general, the detected bounding boxes closely match the rectangular shape of most wall openings, so no further classification of shapes is required. Windows are typically aligned in rows and columns, and only a few different types of windows exist for each facade. However, the neural network proposes rectangles of varying sizes, which are not always perfectly aligned. We therefore use post-processing to obtain a more realistic appearance of facades. Window and door rectangles are aligned by solving a mixed-integer linear optimization problem, which automatically clusters these openings into a few classes of window and door types. Furthermore, a priori knowledge of the number of clusters is not required.</p>
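<p>To illustrate the idea of aligning detected rectangles and obtaining a small number of opening types without fixing the number of clusters in advance, the following minimal sketch snaps box coordinates and sizes that lie within a tolerance of each other to a common value. It is not the mixed-integer linear program used in the paper but a much simpler greedy stand-in; the function names <code>snap</code> and <code>align_boxes</code>, the <code>(x, y, w, h)</code> box format, and the tolerance <code>tol</code> are assumptions made for this example.</p>

```python
def snap(values, tol):
    """Greedy 1-D clustering (a simplification, not the paper's MILP).

    Values are sorted; a value joins the current group if it lies within
    `tol` of the group's smallest value, otherwise it starts a new group.
    Every value is then replaced by the mean of its group.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = []
    for i in order:
        if groups and values[i] - values[groups[-1][0]] <= tol:
            groups[-1].append(i)
        else:
            groups.append([i])
    snapped = [0.0] * len(values)
    for group in groups:
        group_mean = sum(values[i] for i in group) / len(group)
        for i in group:
            snapped[i] = group_mean
    return snapped


def align_boxes(boxes, tol=5.0):
    """Align hypothetical (x, y, w, h) boxes by snapping each component.

    Snapping x and y aligns boxes to common columns and rows; snapping
    w and h merges similar sizes into a few opening types. The number
    of groups follows from `tol` and is not specified beforehand.
    """
    xs, ys, ws, hs = (snap([b[k] for b in boxes], tol) for k in range(4))
    return list(zip(xs, ys, ws, hs))


# Three slightly misaligned detections (pixel units, made-up values).
detections = [(10, 20, 30, 40), (12, 100, 31, 41), (100, 21, 29, 39)]
aligned = align_boxes(detections, tol=5.0)
opening_types = {(w, h) for _, _, w, h in aligned}
```

<p>In this toy input, the first two boxes end up in the same column, the first and third in the same row, and all three share one size. Unlike an optimization over all rectangles at once, the greedy grouping depends on the chosen tolerance and treats each coordinate independently.</p>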