The graph construction procedure largely determines the potential of graph-oriented learning algorithms for image analysis. In this paper, we propose a procedure for building a directed l1-graph, in which the vertices correspond to all the samples and the ingoing edge weights of each vertex encode its l1-norm driven reconstruction from the remaining samples and a noise term. A series of new algorithms for various machine learning tasks, e.g., data clustering, subspace learning, and semi-supervised learning, is then derived from the l1-graph. Compared with the conventional k-nearest-neighbor graph and epsilon-ball graph, the l1-graph offers three advantages: (1) greater robustness to data noise, (2) automatic sparsity, and (3) an adaptive neighborhood for each datum. Extensive experiments on three real-world datasets show the consistent superiority of the l1-graph over these classic graphs in data clustering, subspace learning, and semi-supervised learning tasks.
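The construction above amounts to sparsely coding each sample over the dictionary of all remaining samples and using the resulting coefficients as ingoing edge weights. The sketch below is a minimal Python illustration of that idea, assuming a lasso (l1-penalized least-squares) approximation rather than the exact equality-constrained l1 minimization with an explicit noise term; the function name and the alpha value are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.linear_model import Lasso

def build_l1_graph(X, alpha=0.1):
    """Sketch of a directed l1-graph: each sample is sparsely
    reconstructed from all remaining samples, and the coefficient
    magnitudes become the ingoing edge weights of that vertex.

    X: (n_samples, n_features) data matrix.
    alpha: l1 regularization strength (illustrative default).
    """
    n = X.shape[0]
    W = np.zeros((n, n))  # W[i, j] = weight of edge j -> i
    for i in range(n):
        # Dictionary made of all samples except the i-th one
        idx = [j for j in range(n) if j != i]
        D = X[idx].T  # (n_features, n-1)
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
        lasso.fit(D, X[i])                # sparse code for sample i
        W[i, idx] = np.abs(lasso.coef_)   # keep magnitudes as edge weights
    return W
```

The resulting weight matrix W is typically asymmetric and sparse, with the number of nonzero ingoing edges adapting per sample rather than being fixed as in a k-nearest-neighbor graph.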
We present the largest database for visual kinship recognition, Families In the Wild (FIW), with over 13,000 family photos of 1,000 family trees with 4 to 38 members each. Only a small team was needed to build FIW, thanks to efficient labeling tools and workflow. To extend FIW, we further improved this process with a novel semi-automatic labeling scheme that uses annotated faces and unlabeled text metadata to discover labels, which are then used, along with the existing FIW data, by the proposed clustering algorithm to generate label proposals for all newly added data. Both processes are described and compared in depth, showing large savings in the time and human input required. The proposed clustering algorithm is semi-supervised, using labeled data to produce more accurate clusters. We statistically compare FIW to related datasets, showing substantial gains in overall size and in the amount of information encapsulated in the labels. We benchmark two tasks, kinship verification and family classification, at scales far larger than ever before. Pre-trained CNN models fine-tuned on FIW outscore other conventional methods and achieve state-of-the-art results on the renowned KinWild datasets. We also measure human performance on kinship recognition and compare it to that of a fine-tuned CNN.
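The abstract describes the semi-supervised clustering step only at a high level. A minimal illustration in the same spirit is a seeded k-means over face encodings, where annotated faces fix the cluster seeds and unlabeled faces are assigned around them; the function name, interface, and use of plain Euclidean distance below are assumptions for the sketch, not FIW's actual labeling pipeline.

```python
import numpy as np

def seeded_kmeans(features, seed_labels, n_iter=20):
    """Toy semi-supervised clustering sketch: labeled face encodings
    seed the cluster centroids, unlabeled encodings are assigned to
    the nearest centroid, and centroids are re-estimated.

    features: (n, d) face encodings.
    seed_labels: length-n int array, identity label for annotated
        faces and -1 for unlabeled ones. (Illustrative interface.)
    """
    labeled = seed_labels >= 0
    classes = np.unique(seed_labels[labeled])
    # Initialize each centroid from the labeled faces of that identity
    centroids = np.stack([features[seed_labels == c].mean(0) for c in classes])
    assign = seed_labels.copy()
    for _ in range(n_iter):
        # Assign every unlabeled face to its nearest centroid
        dist = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        assign[~labeled] = classes[dist[~labeled].argmin(1)]
        # Re-estimate centroids; annotated labels stay fixed throughout
        centroids = np.stack([features[assign == c].mean(0) for c in classes])
    return assign
```

Keeping the annotated labels fixed is what lets the labeled data steer the clusters, which is the stated reason the semi-supervised variant produces more accurate label proposals than unsupervised clustering alone.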