“…Such architectures face difficulty adapting to challenging situations as their designs are complex enough and are created exclusively for specific challenging scenarios. Recent advancements in learning-based approaches lead to massive progress in motion detection [161,162,163,93,157,42,164,165]. SMSnet [161] is a method that leverages a convolutional neural network (CNN) and depends on two sequential camera images to perform pixel-wise category labeling and motion detection.…”