We introduce GNeRF, a framework that marries Generative Adversarial Networks (GANs) with Neural Radiance Field (NeRF) reconstruction for complex scenarios with unknown and even randomly initialized camera poses. Recent NeRF-based advances have gained popularity for remarkably realistic novel view synthesis. However, most methods rely heavily on accurate camera pose estimation, while the few recent methods that can optimize unknown camera poses handle only roughly forward-facing scenes with relatively short camera trajectories and still require rough pose initialization. In contrast, our GNeRF uses only randomly initialized poses for complex outside-in scenarios. We propose a novel two-phase end-to-end framework. The first phase brings GANs into this new realm to jointly optimize coarse camera poses and radiance fields, while the second phase refines both with an additional photometric loss. We overcome local minima using a hybrid and iterative optimization scheme. Extensive experiments on a variety of synthetic and natural scenes demonstrate the effectiveness of GNeRF. More impressively, our approach outperforms the baselines on scenes with repeated patterns or even low texture, which were previously regarded as extremely challenging.
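The following is a minimal, runnable sketch of the two-phase scheme described in the abstract. All module names (PatchRenderer, PoseEncoder, PatchDiscriminator) are illustrative stand-ins, and the toy MLP "renderer" replaces real volume rendering of a radiance field along camera rays; it is not the paper's architecture, only the shape of the alternation between adversarial pose/field optimization and photometric refinement.

```python
# Hypothetical sketch of a GNeRF-style two-phase loop; the stand-in
# modules below are assumptions, not the paper's actual networks.
import torch
import torch.nn as nn

P, PATCH = 6, 16 * 16 * 3            # pose dimension, flattened patch size

class PatchRenderer(nn.Module):      # stand-in for NeRF + volume rendering
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(P, 128), nn.ReLU(),
                                 nn.Linear(128, PATCH), nn.Sigmoid())
    def forward(self, pose):         # camera pose -> rendered image patch
        return self.net(pose)

class PoseEncoder(nn.Module):        # inversion network: patch -> pose
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(PATCH, 128), nn.ReLU(),
                                 nn.Linear(128, P))
    def forward(self, patch):
        return self.net(patch)

class PatchDiscriminator(nn.Module): # real vs rendered patches
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(PATCH, 128), nn.ReLU(),
                                 nn.Linear(128, 1))
    def forward(self, patch):
        return self.net(patch)

G, E, D = PatchRenderer(), PoseEncoder(), PatchDiscriminator()
opt_g = torch.optim.Adam(list(G.parameters()) + list(E.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()
real_patches = torch.rand(8, PATCH)  # placeholder for real image patches

# Phase 1: adversarial optimization starting from random pose samples.
for _ in range(100):
    fake_pose = torch.randn(8, P)    # randomly sampled pose hypotheses
    fake = G(fake_pose)
    # Discriminator step: tell real patches from rendered ones.
    d_loss = bce(D(real_patches), torch.ones(8, 1)) + \
             bce(D(fake.detach()), torch.zeros(8, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator step: fool D; train E to invert rendered patches to poses.
    g_loss = bce(D(fake), torch.ones(8, 1)) + \
             nn.functional.mse_loss(E(fake.detach()), fake_pose)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Phase 2: photometric refinement of coarse poses and the radiance field.
poses = E(real_patches).detach().requires_grad_(True)  # coarse pose estimates
opt_r = torch.optim.Adam(list(G.parameters()) + [poses], lr=1e-4)
for _ in range(100):
    photo_loss = nn.functional.mse_loss(G(poses), real_patches)
    opt_r.zero_grad(); photo_loss.backward(); opt_r.step()
```

In practice, the paper interleaves the two phases in a hybrid, iterative schedule rather than running them strictly in sequence, which is what helps escape the local minima mentioned above.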
Text plays an important role in daily life because of its rich information, so automatic text detection in natural scenes has many attractive applications. However, detecting and recognising such text remains a challenging problem. In this study, the authors propose a method that extends the widely used stroke width transform with two steps of edge analysis, namely candidate edge recombination and edge classification. A new method that recognises text based on the recombined candidate edges is also proposed. The step of candidate edge recombination follows the idea of over-segmentation and region merging: to separate text edges from the background, the edges of the input image are first divided into small segments, and neighbouring edge segments are then merged if they have similar stroke width and colour. After this step, each character is described by one candidate boundary. In the step of boundary classification, candidate boundaries are aggregated into text chains, followed by chain classification using character-based and chain-based features. To recognise text, a grey-scale patch is extracted at the location of each candidate edge after the recombination step; histogram of oriented gradients (HOG) features and a classifier are then used to recognise each character, as in the sketch below. To evaluate the effectiveness of the method, the algorithm is run on the ICDAR competition dataset and the Street View Text database. The experimental results show that the proposed method achieves promising performance in comparison with existing methods.
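A minimal sketch of the character-recognition step follows, assuming pre-cropped 32x32 grey-scale character patches. The training data here is random filler standing in for real cropped characters, and the linear SVM is one reasonable choice of classifier; the abstract does not specify which classifier the authors use.

```python
# Hypothetical sketch: HOG features plus a linear classifier for
# per-character recognition; patch size and classifier are assumptions.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(patch):
    """Describe a 32x32 grey character patch by its HOG descriptor."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

rng = np.random.default_rng(0)
# Placeholder training set: 200 random patches with labels drawn from
# ten character classes; substitute real cropped character images.
X = np.stack([hog_features(rng.random((32, 32))) for _ in range(200)])
y = rng.integers(0, 10, size=200)

clf = LinearSVC().fit(X, y)                    # linear classifier over HOG
test_patch = rng.random((32, 32))              # one candidate character
label = clf.predict(hog_features(test_patch)[None])
print(label)
```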