“…LFM [38] proposes to use a dual-branch network to predict the segmentation of the foreground and the background, and then fuses the predictions to generate the final alpha matte. Srivastava et al [44] use an encoder-decoder network to directly predict the alpha matte. BSHM [43] employs three networks for segmentation, refinement, and matting processes, which enhances the estimated alpha mattes.…”