We propose an efficient and robust image‐space denoising method for noisy images generated by Monte Carlo ray tracing methods. Our method is based on two new concepts: virtual flash images and homogeneous pixels. Inspired by recent developments in flash photography, virtual flash images emulate photographs taken with a flash, to capture various features of rendered images without taking additional samples. Using a virtual flash image as an edge‐stopping function, our method can preserve image features that were not captured well only by existing edge‐stopping functions such as normals and depth values. While denoising each pixel, we consider only homogeneous pixels—pixels that are statistically equivalent to each other. This makes it possible to define a stochastic error bound of our method, and this bound goes to zero as the number of ray samples goes to infinity, irrespective of denoising parameters. To highlight the benefits of our method, we apply our method to two Monte Carlo ray tracing methods, photon mapping and path tracing, with various input scenes. We demonstrate that using virtual flash images and homogeneous pixels with a standard denoising method outperforms state‐of‐the‐art image‐space denoising methods.
Author name disambiguation (AND) algorithms identify a unique author entity record from all similar or same publication records in scholarly or similar databases. Typically, a clustering method is used that requires calculation of similarities between each possible record pair. However, the total number of pairs grows quadratically with the size of the author database making such clustering difficult for millions of records. One remedy is a blocking function that reduces the number of pairwise similarity calculations. Here, we introduce a new way of learning blocking schemes by using a conjunctive normal form (CNF) in contrast to the disjunctive normal form (DNF). We demonstrate on PubMed author records that CNF blocking reduces more pairs while preserving high pairs completeness compared to the previous methods that use a DNF and that the computation time is significantly reduced. In addition, we also show how to ensure that the method produces disjoint blocks so that much of the AND algorithm can be efficiently paralleled. Our CNF blocking method is tested on the entire PubMed database of 80 million author mentions and efficiently removes 82.17% of all author record pairs in 10 minutes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.