“…The traditional methods rely on hand-crafted features. They either simplify the small target to a bright spot [7], [8] or model the background, the target, and the relationship between them [9], [10] in a particular scene. Most traditional methods are designed around specific yet limited features, which fail to cover multiple scenarios and thus lead to degraded performance in open and diverse environments.…”