Weakly supervised referring expression grounding aims to localize the referential object in an image according to a linguistic query, where the mapping between the referential object and the query is unknown during training. To address this problem, we propose a novel end-to-end adaptive reconstruction network (ARN). It builds the correspondence between image region proposals and the query adaptively, through adaptive grounding and collaborative reconstruction. Specifically, we first extract subject, location and context features to represent the proposals and the query respectively. Then, we design the adaptive grounding module to compute the matching score between each proposal and the query using a hierarchical attention model. Finally, based on the attention scores and proposal features, we reconstruct the input query with a collaborative loss that combines a language reconstruction loss, an adaptive reconstruction loss, and an attribute classification loss. This adaptive mechanism helps our model alleviate the variance of different referring expressions. Experiments on four large-scale datasets show that ARN outperforms existing state-of-the-art methods by a large margin. Qualitative results demonstrate that the proposed ARN can better handle situations where multiple objects of a particular category are situated together.
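The grounding step described above, scoring each proposal against the query over subject, location and context cues and normalizing the scores into attention weights, can be illustrated with a small sketch. This is not the ARN architecture: the abstract does not specify the hierarchical attention model, so the sketch substitutes a plain weighted dot-product match followed by a softmax. The function names, the feature dictionaries and the per-cue weights are all hypothetical.

```python
import math

CUES = ("subject", "location", "context")


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def ground(proposals, query, cue_weights):
    """Return one attention weight per proposal.

    Assumption: each proposal and the query are dicts mapping a cue name
    to a feature vector, and cue_weights maps a cue name to a scalar.
    A learned model would predict these weights from the query; here they
    are simply passed in.
    """
    scores = [
        sum(cue_weights[c] * dot(p[c], query[c]) for c in CUES)
        for p in proposals
    ]
    return softmax(scores)


# Toy example: the first proposal agrees with the query on every cue.
query = {"subject": [1.0, 0.0], "location": [0.0, 1.0], "context": [1.0, 1.0]}
proposals = [
    {"subject": [1.0, 0.0], "location": [0.0, 1.0], "context": [1.0, 1.0]},
    {"subject": [0.0, 1.0], "location": [1.0, 0.0], "context": [0.0, 0.0]},
]
weights = {"subject": 0.5, "location": 0.3, "context": 0.2}
attention = ground(proposals, query, weights)
```

In a weakly supervised setting these attention weights would then be used to pool proposal features for reconstructing the query, so that the reconstruction loss supplies the training signal in place of region-level labels.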
The knitting needle cylinder is one of the core parts of a hosiery machine. The operation of its needles can directly affect the production quality and efficiency of the hosiery machine. To reduce the production loss of a hosiery machine caused by knitting needle faults, a knitting needle fault detection system for hosiery machines based on a synergistic combination of laser detection and machine vision is proposed in this paper. When the system was operating normally, a photoelectric detector collected the laser signal reflected by the knitting needle and the system monitored the operation of the knitting needle using the ratio of adjacent peak-to-peak distances of the signals. When a fault signal was detected, the hosiery machine was stopped by the system immediately, and a charge-coupled device camera was used to take an image of the faulty knitting needle. After image preprocessing, the faulty knitting needle could be identified quickly and accurately using an image region size classifier based on a decision tree. The experimental results showed that a single image classification by the classifier could be performed in as little as 0.002 s.
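The monitoring rule described above, comparing the ratio of adjacent peak-to-peak distances in the reflected laser signal, can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the peak detector, the amplitude threshold, and the tolerance around a ratio of 1 are hypothetical choices, and the signal is a synthetic list of samples.

```python
def find_peaks(signal, threshold):
    """Return indices of local maxima whose value exceeds threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        rising = signal[i] > signal[i - 1]
        not_below_next = signal[i] >= signal[i + 1]
        if signal[i] > threshold and rising and not_below_next:
            peaks.append(i)
    return peaks


def spacing_ratios(peaks):
    """Ratio of each peak-to-peak distance to the one before it."""
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return [later / earlier for earlier, later in zip(gaps, gaps[1:])]


def has_fault(signal, threshold=0.5, tolerance=0.2):
    """Flag a fault when any adjacent-spacing ratio strays from 1.

    Assumption: evenly spaced healthy needles give ratios close to 1,
    while a missing or bent needle changes one spacing and pushes the
    neighbouring ratios away from 1.
    """
    ratios = spacing_ratios(find_peaks(signal, threshold))
    return any(abs(r - 1.0) > tolerance for r in ratios)


# Synthetic signals: one reflection peak every 5 samples.
healthy = [0, 0, 1, 0, 0] * 6
faulty = list(healthy)
faulty[12] = 0  # simulate a needle that returns no reflection
```

Using a ratio rather than an absolute distance makes the check insensitive to the cylinder's rotation speed, as long as the speed is roughly constant across neighbouring needles.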