Effectively measuring similarity between patents in a complex patent citation network is crucial to understanding patent relatedness. In the past, techniques such as text mining and keyword analysis have been applied to patent similarity calculation. The drawback of these approaches is that they depend on the authors' word choice and writing style. Most existing graph-based approaches use common-neighbor-based measures, which consider only direct adjacency. In this work, we propose new similarity measures for patents that use only the structure of the patent citation network. The proposed measures leverage both direct and indirect co-citation links between patents. One challenge is that patents receiving a large number of citations tend to be considered similar to many other patents in the network. To overcome this, we propose a normalization technique that accounts for pairs ranked as highly similar merely because both are cited by many other patents. We validate the proposed similarity measures using US class codes for US patents and compare them against the well-known Jaccard similarity index. Experiments show that the proposed methods perform well when compared to the Jaccard similarity index.
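To make the Jaccard baseline and the idea of normalizing for heavily cited patents concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not define the proposed measures, so the cosine-style normalization below is a stand-in assumption, the patent identifiers are hypothetical, and only direct co-citation is covered (indirect co-citation links are not modeled).

```python
from math import sqrt


def citing_sets(citations):
    """Map each cited patent to the set of patents that cite it.

    `citations` is an iterable of (citing, cited) pairs.
    """
    cited_by = {}
    for citing, cited in citations:
        cited_by.setdefault(cited, set()).add(citing)
    return cited_by


def jaccard(a, b, cited_by):
    """Jaccard index of the sets of patents citing a and b."""
    sa, sb = cited_by.get(a, set()), cited_by.get(b, set())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def normalized_cocitation(a, b, cited_by):
    """Co-citation count divided by the geometric mean of citation counts.

    Assumption: a cosine-style normalization chosen for illustration,
    so that heavily cited patents do not look similar to everything.
    """
    sa, sb = cited_by.get(a, set()), cited_by.get(b, set())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / sqrt(len(sa) * len(sb))


# Hypothetical toy network: P1..P4 cite A, B, and C.
edges = [
    ("P1", "A"), ("P1", "B"),
    ("P2", "A"), ("P2", "B"),
    ("P3", "A"),
    ("P4", "C"),
]
cited_by = citing_sets(edges)
sim_jaccard_ab = jaccard("A", "B", cited_by)
sim_norm_ab = normalized_cocitation("A", "B", cited_by)
sim_jaccard_ac = jaccard("A", "C", cited_by)
```

In this toy network, A and B share two of three distinct citing patents, while A and C share none, so both scores separate the pairs. The two measures differ only in the denominator, which is where a normalization for highly cited patents would act.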
Identifying the faulty variables behind an out-of-control signal in a high-dimensional process is an important problem in quality control. Although several procedures for fault variable identification exist, most assume that observations follow a multivariate normal distribution and are sensitive to correlations between variables. In this paper, we therefore propose a new fault variable identification method that does not assume any specific distribution of observations. The proposed procedure, based on a one-class classification method, identifies the changed variables by identifying unchanged variables at each step using information obtained from the previous steps. This strategy can reduce computation time when only a few variables have changed in a high-dimensional process. In addition, the proposed procedure is robust to correlations between variables, resulting in stable performance regardless of the number of changed variables. Experimental results on diverse datasets demonstrate the superiority of the proposed distribution-free procedure.