The heterogeneity in recently published knowledge graph embedding models' implementations, training, and evaluation has made fair and thorough comparisons difficult. To assess the reproducibility of previously published results, we re-implemented and evaluated 21 models in the PyKEEN software package. In this paper, we outline which results could be reproduced with their reported hyper-parameters, which could only be reproduced with alternate hyper-parameters, and which could not be reproduced at all, as well as provide insight as to why this might be the case. We then performed a large-scale benchmarking on four datasets with several thousands of experiments and 24,804 GPU hours of computation time. We present insights gained as to best practices, best configurations for each model, and where improvements could be made over previously published best configurations. Our results highlight that the combination of model architecture, training approach, loss function, and the explicit modeling of inverse relations is crucial for a model's performance, which is not determined by its architecture alone. We provide evidence that several architectures can obtain results competitive with the state of the art when configured carefully. We have made all code, experimental configurations, results, and analyses available at https://github.com/pykeen/pykeen and https://github.com/pykeen/benchmarking.
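The "explicit modeling of inverse relations" mentioned above can be illustrated with a minimal sketch. It assumes the common approach of adding, for every triple (head, relation, tail), a second triple (tail, relation_inverse, head). The function name, the `_inverse` suffix, and the toy triples are hypothetical and are not taken from PyKEEN.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def add_inverse_triples(triples: List[Triple], suffix: str = "_inverse") -> List[Triple]:
    """Return the original triples followed by one inverse triple per original.

    The inverse of (h, r, t) is (t, r + suffix, h). The suffix is an
    illustrative naming choice, not a library convention.
    """
    inverse = [(t, r + suffix, h) for h, r, t in triples]
    return list(triples) + inverse


# Toy data, made up for illustration only.
triples = [
    ("berlin", "capital_of", "germany"),
    ("paris", "capital_of", "france"),
]
augmented = add_inverse_triples(triples)
for triple in augmented:
    print(triple)
```

With inverse triples in the training set, a model can learn a separate embedding for each inverse relation, so predicting a head entity can be treated as predicting the tail of the inverse relation.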
The radiation belts of the Earth, filled with energetic electrons, comprise complex and dynamic systems that pose a significant threat to satellite operation. While various models of electron flux both for low and relativistic energies have been developed, the behavior of medium energy (120-600 keV) electrons, especially in the MEO region, remains poorly quantified. At these energies, electrons are driven by both convective and diffusive transport, and their prediction usually requires sophisticated 4D modeling codes. In this paper, we present an alternative approach using the Light Gradient Boosting (LightGBM) machine learning algorithm. The Medium Energy electRon fLux In Earth's outer radiatioN belt (MERLIN) model takes as input the satellite position, a combination of geomagnetic indices and solar wind parameters including the time history of velocity, and does not use persistence. MERLIN is trained on >15 years of the GPS electron flux data and tested on more than 1.5 years of measurements. Tenfold cross validation shows that the model predicts the MEO radiation environment well, both in terms of dynamics and amplitudes of flux. Evaluation on the test set shows high correlation between the predicted and observed electron flux (0.8) and low values of absolute error. The MERLIN model can have wide space weather applications, providing information for the scientific community in the form of radiation belt reconstructions, as well as for industry in satellite mission design, nowcast of the MEO environment, and surface charging analysis.
Plain Language Summary
The radiation belts of the Earth, which are the zones of charged energetic particles trapped by the geomagnetic field, comprise complex and dynamic systems posing a significant threat to a variety of commercial and military satellites. While the inner belt is relatively stable, the outer belt is highly variable and depends substantially on solar activity; therefore, accurate and improved models of electron flux in the outer radiation belt are essential to understand the underlying physical processes. Although many models have been developed for the geostationary orbit and relativistic energies, prediction of electron flux in the 120-600 keV energy range still remains challenging. We present a data-driven model of the medium energy (120-600 keV) differential electron flux in the outer radiation belt based on machine learning. We use 17 years of electron observations by Global Positioning System (GPS) satellites. We set up a 3D model for flux prediction in terms of L-values, MLT, and magnetic latitude. The model gives reliable predictions of the radiation environment in the outer radiation belt and has wide space weather applications.
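The abstract states that the model inputs include the time history of solar wind velocity. One common way to expose such a history to a tree-based learner is to add lagged copies of the series as extra features. The sketch below illustrates that idea only. The function name, the lag values, and the sample velocities are hypothetical and do not reflect the authors' actual feature set.

```python
from typing import Dict, List, Optional


def lagged_features(series: List[float], lags: List[int]) -> List[Dict[str, Optional[float]]]:
    """Build one feature row per time step: the current value plus lagged values.

    A lag that reaches before the start of the series is stored as None,
    so the caller can drop or otherwise handle incomplete rows.
    """
    rows: List[Dict[str, Optional[float]]] = []
    for i, value in enumerate(series):
        row: Dict[str, Optional[float]] = {"v_now": value}
        for lag in lags:
            row[f"v_lag_{lag}"] = series[i - lag] if i - lag >= 0 else None
        rows.append(row)
    return rows


# Made-up hourly solar wind velocities in km/s, for illustration only.
velocity = [400.0, 420.0, 510.0, 600.0, 580.0]
rows = lagged_features(velocity, lags=[1, 3])
for row in rows:
    print(row)
```

Each row could then be combined with position and geomagnetic index features before training. Note that this uses the history of a driver (velocity), not past values of the flux itself, which is consistent with a model that does not rely on persistence.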
One of the major and unfortunately unforeseen sources of background for the current generation of X-ray telescopes is protons with energies of a few tens to hundreds of keV (soft protons) that are concentrated by the mirrors. One such telescope is the European Space Agency’s (ESA) X-ray Multi-Mirror Mission (XMM-Newton). About 40% of its observing time is lost to background contamination. This loss of observing time affects all the major broad science goals of this observatory, ranging from cosmology to astrophysics of neutron stars and black holes. The soft-proton background could dramatically impact future large X-ray missions such as ESA's planned Athena mission (http://www.the-athena-x-ray-observatory.eu/). Physical processes that trigger this background are still poorly understood. We use a machine learning (ML) approach to delineate related important parameters and to develop a model to predict the background contamination using 12 yr of XMM-Newton observations. As predictors we use the location of the satellite and solar and geomagnetic activity parameters. We found that the contamination is most strongly related to the distance in the southern direction, Z (XMM-Newton observations were in the southern hemisphere), the solar wind radial velocity, and the location on the magnetospheric magnetic field lines. We derived simple empirical models for the first two individual predictors and an ML model that utilizes an ensemble of the predictors (Extra-Trees Regressor) and gives better performance. Based on our analysis, future missions should minimize observations during times associated with high solar wind speed and avoid closed magnetic field lines, especially at the dusk flank region in the southern hemisphere.
In this work, we take a closer look at the evaluation of two families of methods for enriching information from knowledge graphs: Link Prediction and Entity Alignment. In the current experimental setting, multiple different scores are employed to assess different aspects of model performance. We analyze the informative value of these evaluation measures and identify several shortcomings. In particular, we demonstrate that all existing scores can hardly be used to compare results across different datasets. Moreover, this problem may also arise when comparing different train/test splits for the same dataset. We show that this leads to various problems in the interpretation of results, which may support misleading conclusions. Therefore, we propose a different evaluation and demonstrate empirically how this helps for fair, comparable and interpretable assessment of model performance.
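To make the comparability problem concrete, the sketch below computes three widely used rank-based scores (mean rank, mean reciprocal rank, Hits@k) and shows how the raw mean rank scales with the number of candidate entities. The final normalization, dividing by the expected rank of a uniformly random scorer, (n + 1) / 2, is one possible adjustment chosen here for illustration. The abstract does not specify the proposed measure, so this should not be read as the authors' definition. All ranks and candidate counts are made up.

```python
from typing import Dict, List


def rank_metrics(ranks: List[int], num_candidates: int, k: int = 10) -> Dict[str, float]:
    """Compute common rank-based scores from 1-based ranks of the true entities."""
    n = len(ranks)
    mean_rank = sum(ranks) / n
    mrr = sum(1.0 / r for r in ranks) / n
    hits_at_k = sum(1 for r in ranks if r <= k) / n
    # Expected rank of the true entity under uniformly random scoring.
    expected_random_rank = (num_candidates + 1) / 2
    return {
        "mean_rank": mean_rank,
        "mrr": mrr,
        "hits_at_k": hits_at_k,
        # Illustrative normalization (an assumption, see lead-in).
        "adjusted_mean_rank": mean_rank / expected_random_rank,
    }


# Two hypothetical datasets with similar relative quality but different sizes.
small = rank_metrics([1, 5, 50], num_candidates=100)
large = rank_metrics([10, 50, 500], num_candidates=1000)
print(small)
print(large)
```

In this toy case the raw mean rank differs by a factor of ten between the two datasets, while the normalized value is nearly identical, which is the kind of cross-dataset comparability the passage argues the standard scores lack.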