Sports have become a frequent form of international competitive exchange, and to a certain extent a country's level of sporting achievement reflects its comprehensive economic strength. In recent years, computer technology has injected new vitality into most industries, but its adoption in sports has been relatively slow. This paper studies adaptive monitoring of basketball video images in an intelligent network setting. The semantic information of events in a basketball video is obtained by extracting the characters and graphics that the video editor overlays on the video frames, a task referred to here as basketball video event monitoring. Because this non-scene target information corresponds directly to the semantics of game events, the relationship can be used to design a low-complexity, effective adaptive monitoring method for basketball video images. First, the basketball video is separated into two parts: video image data and audio data. The audio data are analyzed to extract relevant audio features. The video image data are then divided structurally into three layers: video frame images, shots, and video events/scenes. Image features are extracted at the video frame image layer, and middle-level semantic features are defined for the video shots. The algorithm locates and recognizes non-scene targets in the video using the change characteristics of pixels in the time dimension, which are more effective than traditional image-region features, and then uses the direct correspondence between non-scene target information and basketball video events to perform event monitoring. Experimental results show that the method achieves an accuracy of 98.31% and a recall of 99.09%.
Compared with traditional algorithms, the proposed algorithm achieves higher accuracy in basketball video event monitoring.
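The idea of locating non-scene targets from pixel changes in the time dimension can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes, for illustration only, that an editor-added overlay (for example a score banner) stays nearly constant over a short run of frames while the underlying scene changes, so low per-pixel temporal variance marks candidate overlay pixels. The function names `temporal_variance` and `static_region_bbox`, the threshold `max_variance`, and the synthetic frames are all hypothetical.

```python
from statistics import pvariance


def temporal_variance(frames):
    """Per-pixel population variance of grayscale intensity across frames.

    `frames` is a list of equally sized 2D lists (rows of integer intensities).
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [pvariance([frame[y][x] for frame in frames]) for x in range(width)]
        for y in range(height)
    ]


def static_region_bbox(frames, max_variance=4.0):
    """Inclusive bounding box (top, left, bottom, right) of low-variance pixels.

    Returns None when no pixel has temporal variance <= max_variance.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    variance = temporal_variance(frames)
    coords = [
        (y, x)
        for y, row in enumerate(variance)
        for x, value in enumerate(row)
        if value <= max_variance
    ]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))


def make_synthetic_frames(num_frames=5, height=6, width=8):
    """Build toy frames: a changing scene with a constant overlay patch."""
    frames = []
    for t in range(num_frames):
        # Scene pixels change from frame to frame.
        frame = [
            [(37 * t + 11 * y + 7 * x) % 256 for x in range(width)]
            for y in range(height)
        ]
        # Overlay patch (rows 1-2, columns 2-5) keeps the same value over time.
        for y in range(1, 3):
            for x in range(2, 6):
                frame[y][x] = 255
        frames.append(frame)
    return frames


if __name__ == "__main__":
    print(static_region_bbox(make_synthetic_frames()))  # (1, 2, 2, 5)
```

In a fuller pipeline, the located region would then be passed to a recognizer (for example, reading the score digits), and a change in the recognized content would be mapped to a game event. That mapping is not specified here and is left out of the sketch.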