Three-dimensional (3D) stratified reconstruction under geometric transformations is of great significance in computer vision research. Some applications require only projective or affine reconstruction; however, metric reconstruction most faithfully reflects the actual geometry of the object. The intrinsic and extrinsic parameters of the camera play important roles in stratified reconstruction. In this study, camera calibration was performed using the geometric constraints of a scene in an image sequence or video stream. Subsequently, 3D stratified reconstruction of the point cloud was performed according to the algebraic and geometric relations between projective, affine, and metric transformations. Delaunay triangulation and texture mapping were then used to restore the surface of the object in the scene. The results show that the object could be reconstructed from both unordered images and video streams, and that the reconstruction could be used to restore the 3D appearance of the object and obtain its corresponding geometric information. Finally, the You Only Look Once version 5 (YOLOv5) object detection algorithm was applied to the metric reconstruction results, and the detection results indicated that the reconstruction quality was satisfactory.
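To make the projective-to-affine step of stratified reconstruction concrete, the following is a minimal sketch of the textbook upgrade, not necessarily the procedure used in this study. It assumes the plane at infinity has already been located in the projective frame as a homogeneous 4-vector with a nonzero last coordinate; the function names `affine_upgrade_matrix` and `apply_homography` and the numeric values are hypothetical and chosen only for illustration.

```python
def affine_upgrade_matrix(plane_at_infinity):
    """Build the 4x4 homography H = [[I | 0], [pi_inf^T]].

    Applying H to points of a projective reconstruction sends the given
    plane to the canonical plane at infinity (0, 0, 0, 1), which yields an
    affine reconstruction. The last plane coordinate must be nonzero,
    otherwise H is singular.
    """
    a, b, c, d = plane_at_infinity
    if d == 0:
        raise ValueError("last plane coordinate must be nonzero")
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [a, b, c, d],
    ]


def apply_homography(H, X):
    """Multiply a 4x4 matrix H by a homogeneous 3D point X (4 values)."""
    return [sum(H[i][j] * X[j] for j in range(4)) for i in range(4)]


# Hypothetical plane at infinity expressed in the projective frame.
pi_inf = (1.0, 2.0, 3.0, 4.0)
H = affine_upgrade_matrix(pi_inf)

# A point lying on pi_inf (1*4 + 2*0 + 3*0 + 4*(-1) = 0) is mapped to a
# point whose last coordinate is zero, i.e. a direction at infinity.
on_plane = apply_homography(H, (4.0, 0.0, 0.0, -1.0))

# A point off the plane keeps a nonzero last coordinate and stays finite.
off_plane = apply_homography(H, (1.0, 1.0, 1.0, 1.0))
```

In this sketch, points of the projective reconstruction that lie on the identified plane become points at infinity, so lines that meet on that plane become parallel in the upgraded frame. The subsequent affine-to-metric step additionally requires the camera intrinsics obtained from calibration and is not shown here.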