Automatic and accurate instance segmentation of teeth can provide important support for computer-aided orthodontic work. Traditional tooth segmentation methods often ignore the rich structural features of teeth, and capturing the complete, accurate geometry and morphological details of a single tooth remains a challenge. In this article, we propose a new deep learning network for tooth segmentation in cone beam computed tomography (CBCT), based on dependency capture and receptive field adjustment, to achieve automatic and accurate instance segmentation of dental CBCT data. In the first stage, the method extracts coarse-level tooth features and accurate tooth centroids, thereby obtaining the instance information and spatial position of each tooth. In the second stage, the encoder introduces a guidance module, based on a 3D self-attention mechanism, that captures dependencies in CBCT to obtain tooth geometry information. The proposed tooth feature integration module uses multiscale fusion of dilated convolutions to capture detailed tooth information at multiple scales and to adjust the network's receptive field. Extensive evaluation, ablation, and comparison experiments demonstrate that our method achieves state-of-the-art segmentation performance and accurate instance segmentation results, indicating its potential applicability in clinical medicine.
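
As a rough illustration of how multiscale fusion of dilated convolutions widens the receptive field, the following is a minimal one-dimensional sketch in plain Python, not the network described here. The function names, the shared kernel, the dilation rates, and the fusion by averaging are all assumptions made for illustration; the actual module operates on 3D CBCT features and its fusion rule is not specified in this abstract. The only fact relied on is the standard relation that a kernel of size k with dilation d spans d(k - 1) + 1 input positions.

```python
def effective_kernel_size(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, weights, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation).

    Assumes an odd kernel size so the kernel is centred on each output.
    """
    k = len(weights)
    half = dilation * (k - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i - half + j * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def multiscale_fusion(signal, weights, dilations):
    """Run parallel dilated branches and fuse them by averaging (an assumption)."""
    branches = [dilated_conv1d(signal, weights, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for d in (1, 2, 4):
        print("dilation", d, "spans", effective_kernel_size(3, d), "samples")
    print(multiscale_fusion(impulse, [1, 1, 1], dilations=(1, 2)))
```

With a unit impulse as input, each branch responds at offsets set by its dilation rate, so the fused output covers a wider neighbourhood than any single undilated kernel while using the same number of weights per branch.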