A 3D colon model is an essential component of a computer-aided diagnosis (CAD) system in colonoscopy, assisting surgeons in visualization, surgical planning, and training. This research is thus aimed at developing the ability to construct a 3D colon model from endoscopic videos (or images). This paper summarizes our ongoing research in automated model building in colonoscopy. We have developed the mathematical formulations and algorithms for modeling static, localized 3D anatomic structures within a colon that can be rendered from multiple novel viewpoints for close scrutiny and precise dimensioning. This ability is useful when a surgeon notices abnormal tissue growth and wants a closer inspection and precise dimensioning. Our modeling system uses only video images and follows a well-established computer-vision paradigm for image-based modeling. We extract prominent features from images and establish their correspondences across multiple images by continuous tracking and discrete matching. We then use these feature correspondences to infer the camera's movement. The camera motion parameters allow us to rectify images into a standard stereo configuration and to calculate pixel movements (disparity) in these images. The inferred disparity is then used to recover 3D surface depth. The inferred 3D depth, together with the texture information recorded in the images, allows us to construct a 3D model with both structure and appearance information that can be rendered from multiple novel viewpoints.
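The rectify-then-triangulate stage of this pipeline can be illustrated with a minimal sketch using OpenCV. It assumes the camera intrinsics K and the recovered motion (R, t) between two frames are already available from the feature-based egomotion step; all function and variable names here are illustrative, not taken from the paper.

```python
import cv2
import numpy as np

def depth_from_motion(img1, img2, K, R, t):
    """Rectify an image pair into a standard stereo configuration
    and recover 3D structure from the resulting disparity."""
    h, w = img1.shape[:2]
    dist = np.zeros(5)  # assume lens distortion already corrected

    # Rectify both views so epipolar lines become horizontal scanlines.
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(
        K, dist, K, dist, (w, h), R, t)
    m1x, m1y = cv2.initUndistortRectifyMap(K, dist, R1, P1, (w, h), cv2.CV_32FC1)
    m2x, m2y = cv2.initUndistortRectifyMap(K, dist, R2, P2, (w, h), cv2.CV_32FC1)
    rect1 = cv2.remap(img1, m1x, m1y, cv2.INTER_LINEAR)
    rect2 = cv2.remap(img2, m2x, m2y, cv2.INTER_LINEAR)

    # Dense disparity on the rectified pair (semi-global block matching;
    # the paper does not specify this particular matcher).
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)
    disparity = sgbm.compute(rect1, rect2).astype(np.float32) / 16.0

    # Reproject disparity to per-pixel (X, Y, Z) surface points via Q.
    return cv2.reprojectImageTo3D(disparity, Q)
```

The returned point map, combined with the texture of the original frames, is the kind of structure-plus-appearance data the abstract describes feeding into the renderable 3D model.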
The ability to detect and match features across multiple views of a scene is a crucial first step in many computer vision algorithms for dynamic scene analysis. State-of-the-art methods such as SIFT and SURF perform well when applied to typical images taken by a digital camera or camcorder. However, these methods often fail to generate an acceptable number of features when applied to medical images, because such images usually contain large homogeneous regions with little color and intensity variation. As a result, tasks like image registration and 3D structure recovery become difficult or impossible in the medical domain. This paper presents a scale-, rotation-, and color/illumination-invariant feature detector and descriptor for medical applications. The method incorporates elements of SIFT and SURF while optimizing their performance on medical data. Based on experiments with various types of medical images, we combined, adjusted, and built on the methods and parameter settings employed in both algorithms. An approximate Hessian-based detector is used to locate scale-invariant keypoints, and a dominant orientation is assigned to each keypoint using a gradient orientation histogram, providing rotation invariance. Finally, keypoints are described with an orientation-normalized distribution of gradient responses at the assigned scale, and the feature vector is normalized for contrast invariance. Experiments show that the algorithm detects and matches far more features than SIFT and SURF on medical images, with similar error levels.
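The dominant-orientation step described above can be sketched as follows: accumulate gradient magnitudes into an orientation histogram around a keypoint and take the peak bin. The window radius and bin count below are illustrative choices, not the paper's values, and the patch is assumed to lie fully inside the image.

```python
import numpy as np

def dominant_orientation(img, x, y, radius=8, bins=36):
    """Assign a dominant gradient orientation (degrees) to the
    keypoint at pixel (x, y)."""
    patch = img[y - radius:y + radius + 1,
                x - radius:x + radius + 1].astype(np.float64)
    gy, gx = np.gradient(patch)          # image gradients in the window
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx)) % 360.0

    # Magnitude-weighted histogram of gradient orientations.
    hist, _ = np.histogram(orientation, bins=bins, range=(0.0, 360.0),
                           weights=magnitude)

    # The peak bin gives the dominant orientation; the descriptor grid
    # is then rotated by this angle to achieve rotation invariance.
    return (np.argmax(hist) + 0.5) * (360.0 / bins)
```

Normalizing the final descriptor vector to unit length, as the abstract notes, is what provides the contrast (illumination) invariance.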