Abstract: The wisdom of 'smart' development increasingly shapes urban sustainability in Europe and beyond. Yet the 'smart city' paradigm has been critiqued for favouring technological solutions and business interests over social inclusion and urban innovation. Despite the rhetoric of 'citizen-centred approaches' and 'user-generated data', the level of stakeholder engagement and public empowerment remains in question, and it is unclear how smart city initiatives develop common visions according to the principles of sustainable urban development. This paper examines how data governance in particular is framed in the new, sustainability-focused smart city agenda. The challenges and opportunities of data governance in sustainability-driven smart city initiatives are articulated within a conceptual Framework on Sustainable Smart City Data Governance. Drawing on three cases from European countries and a stakeholder survey, the paper shows how data governance can underpin smart and sustainable urban development solutions. The paper presents insights and lessons from this multi-case study, and discusses risks, challenges, and directions for future research.
It is held as a truism that deep neural networks require large datasets to train effective models. However, large datasets, especially those with high-quality labels, can be expensive to obtain. This study sets out to investigate (i) how large a dataset must be to train well-performing models, and (ii) what impact fractional changes to the dataset size have on model performance. A practical method for investigating these questions is to train a collection of deep neural answer selection models on fractional subsets, of varying sizes, of an initial dataset. We observe that dataset size has a conspicuous lack of effect on the training of some of these models, bringing the underlying algorithms into question.
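The experimental setup described above can be sketched as follows. This is a minimal illustration, assuming nothing about the paper's actual code: the function name, the fixed seed, and the choice of nested subsets (each smaller fraction contained in the larger ones) are all illustrative.

```python
import random

def fractional_subsets(dataset, fractions, seed=0):
    """Return reproducible random subsets of `dataset` at the given fractions.

    `dataset` is any sequence of training examples; `fractions` are values
    in (0, 1]. Illustrative sketch, not the paper's exact procedure.
    """
    rng = random.Random(seed)
    shuffled = list(dataset)
    rng.shuffle(shuffled)
    subsets = {}
    for frac in fractions:
        n = max(1, int(len(shuffled) * frac))
        # Nested subsets: the 10% subset is contained in the 25% subset, etc.,
        # so differences in results reflect size alone, not sampling noise.
        subsets[frac] = shuffled[:n]
    return subsets

# Example: carve a 1000-example dataset into four nested fractions,
# then train one model per subset and compare validation metrics.
data = list(range(1000))
subs = fractional_subsets(data, [0.1, 0.25, 0.5, 1.0])
```

Making the subsets nested is a common design choice in such studies, since it isolates the effect of dataset size from the variance introduced by independent resampling.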
Synthetic data generation is important to training and evaluating neural models for question answering over knowledge graphs. The quality of the data and the partitioning of the datasets into training, validation and test splits impact the performance of the models trained on this data. If the synthetic data generation depends on templates, as is the predominant approach for this task, there may be a leakage of information via a shared basis of templates across data splits if the partitioning is not performed hygienically. This paper investigates the extent of such information leakage across data splits, and the ability of trained models to generalize to test data when the leakage is controlled. We find that information leakage indeed occurs and that it affects performance. At the same time, the trained models do generalize to test data under the sanitized partitioning presented here. Importantly, these findings extend beyond the particular flavor of question answering task we studied and raise a series of difficult questions around template-based synthetic data generation that will necessitate additional research.
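The hygienic partitioning the abstract refers to amounts to splitting by template rather than by individual example, so that no template contributes questions to more than one split. A minimal sketch of this idea follows; the function name, the (train, valid, test) ratios, and the example data structure are assumptions for illustration, not the paper's exact procedure.

```python
import random
from collections import defaultdict

def split_by_template(examples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Partition template-generated examples so no template spans two splits.

    Each example is a (template_id, question) pair. Assigning whole templates
    to splits, rather than individual questions, prevents template-level
    information leakage between train/validation/test.
    """
    by_template = defaultdict(list)
    for tid, question in examples:
        by_template[tid].append(question)

    # Shuffle template ids, then cut the template list at the given ratios.
    tids = sorted(by_template)
    random.Random(seed).shuffle(tids)
    n = len(tids)
    cut1 = int(n * ratios[0])
    cut2 = cut1 + int(n * ratios[1])

    splits = {"train": [], "valid": [], "test": []}
    for i, tid in enumerate(tids):
        key = "train" if i < cut1 else ("valid" if i < cut2 else "test")
        splits[key].extend((tid, q) for q in by_template[tid])
    return splits

# Example: 10 templates, 5 generated questions each.
examples = [(t, f"q{t}_{i}") for t in range(10) for i in range(5)]
splits = split_by_template(examples)
```

Note that naive example-level shuffling would almost certainly place questions from the same template in both train and test, which is exactly the leakage the abstract describes.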