Automatic evaluation of text generation tasks (e.g. machine translation, text summarization, image captioning and video description) usually relies heavily on task-specific metrics, such as BLEU (Papineni et al., 2002) and ROUGE (Lin, 2004). These metrics, however, are abstract numbers that are not perfectly aligned with human assessment, so inspecting detailed examples is a useful complement for identifying system error patterns. In this paper, we present VizSeq, a visual analysis toolkit for instance-level and corpus-level system evaluation on a wide variety of text generation tasks. It supports multimodal sources and multiple text references, and provides visualization in a Jupyter notebook or a web app interface. It can be used locally or deployed onto public servers for centralized data hosting and benchmarking. It covers the most common n-gram based metrics, accelerated with multiprocessing, and also provides the latest embedding-based metrics such as BERTScore (Zhang et al., 2019).
Most applications of Bayesian inference for parameter estimation and model selection in astrophysics involve the use of Monte Carlo techniques such as Markov Chain Monte Carlo (MCMC) and nested sampling. However, these techniques are time-consuming, and their convergence to the posterior can be difficult to determine. In this study, we advocate variational inference as an alternative that addresses these problems, and demonstrate its usefulness for parameter estimation and model selection in astrophysics. Variational inference converts the inference problem into an optimisation problem by approximating the posterior with a member of a known family of distributions and using the Kullback–Leibler divergence to characterise the difference between the two. It takes advantage of fast optimisation techniques, which make it well suited to large datasets and trivial to parallelise on a multicore platform. We also derive a new approximate evidence estimate based on the variational posterior and an importance sampling technique, called posterior-weighted importance sampling for the calculation of evidence, which is useful for Bayesian model selection. As a proof of principle, we apply variational inference to five different problems in astrophysics where Monte Carlo techniques were previously used. These include assessing the significance of annual modulation in the COSINE-100 dark matter experiment, measuring exoplanet orbital parameters from radial velocity data, testing for periodicities in measurements of Newton’s constant G, assessing the significance of a turnover in the spectral lag data of GRB 160625B, and estimating the mass of a galaxy cluster using weak gravitational lensing. We find that variational inference is much faster than MCMC and nested sampling for most of these problems while providing competitive results. All our analysis codes have been made publicly available.
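To make the optimisation view of variational inference concrete, here is a minimal sketch on a toy problem. It is not the authors' code and does not use any of the five astrophysical models. It assumes a Gaussian likelihood with known scatter `sigma`, a Gaussian prior on the mean `mu`, and a Gaussian variational family `q(mu) = N(m, s^2)`, chosen because the evidence lower bound (ELBO) then has a closed form and the exact posterior and evidence are available for comparison. The final step is plain importance sampling with the fitted `q` as the proposal, which is only assumed to be in the spirit of the evidence estimator described above and is not claimed to be the same estimator. All names and values (`sigma`, `mu0`, `tau`, the learning rate, the sample sizes) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (assumption): y_i ~ N(mu, sigma^2), prior mu ~ N(mu0, tau^2).
sigma, mu0, tau = 1.0, 0.0, 10.0
y = rng.normal(2.0, sigma, size=50)
n = y.size


def log_joint(mu):
    """log p(y | mu) + log p(mu), evaluated for an array of mu values."""
    mu = np.atleast_1d(mu)
    resid = y[None, :] - mu[:, None]
    log_like = (-0.5 * n * np.log(2 * np.pi * sigma**2)
                - 0.5 * np.sum(resid**2, axis=1) / sigma**2)
    log_prior = (-0.5 * np.log(2 * np.pi * tau**2)
                 - 0.5 * (mu - mu0) ** 2 / tau**2)
    return log_like + log_prior


def elbo(m, log_s):
    """Closed-form ELBO for q(mu) = N(m, s^2) under the toy model."""
    s2 = np.exp(2 * log_s)
    exp_log_like = (-0.5 * n * np.log(2 * np.pi * sigma**2)
                    - 0.5 * (np.sum((y - m) ** 2) + n * s2) / sigma**2)
    exp_log_prior = (-0.5 * np.log(2 * np.pi * tau**2)
                     - 0.5 * ((m - mu0) ** 2 + s2) / tau**2)
    entropy = 0.5 * np.log(2 * np.pi * np.e * s2)
    return exp_log_like + exp_log_prior + entropy


# Maximising the ELBO is equivalent to minimising KL(q || posterior).
# Plain gradient ascent on (m, log s) using the analytic gradients.
m, log_s, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    grad_m = np.sum(y - m) / sigma**2 - (m - mu0) / tau**2
    grad_log_s = 1.0 - np.exp(2 * log_s) * (n / sigma**2 + 1 / tau**2)
    m += lr * grad_m
    log_s += lr * grad_log_s
s = np.exp(log_s)

# Exact conjugate posterior and log evidence, used only as a check.
post_prec = n / sigma**2 + 1 / tau**2
m_exact = (np.sum(y) / sigma**2 + mu0 / tau**2) / post_prec
s_exact = post_prec ** -0.5
cov = sigma**2 * np.eye(n) + tau**2 * np.ones((n, n))
_, logdet = np.linalg.slogdet(cov)
r = y - mu0
log_Z_exact = -0.5 * (n * np.log(2 * np.pi) + logdet
                      + r @ np.linalg.solve(cov, r))

# Importance-sampling evidence estimate with q as the proposal:
# Z ~ mean_k[ p(y, mu_k) / q(mu_k) ], with mu_k drawn from q.
mu_k = rng.normal(m, s, size=2000)
log_q = -0.5 * np.log(2 * np.pi * s**2) - 0.5 * (mu_k - m) ** 2 / s**2
log_w = log_joint(mu_k) - log_q
log_Z_is = log_w.max() + np.log(np.mean(np.exp(log_w - log_w.max())))

print("variational mean, std:", m, s)
print("exact mean, std:      ", m_exact, s_exact)
print("ELBO, IS log evidence, exact log evidence:",
      elbo(m, log_s), log_Z_is, log_Z_exact)
```

Because the true posterior of this toy model is itself Gaussian, the optimised `q` matches it and the ELBO equals the log evidence. For a non-conjugate model the ELBO would be a strict lower bound, which is where an importance-sampling correction of this kind becomes useful.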