An Efficient Three-Dimensional Convolutional Neural Network for Inferring Physical Interaction Force from Video

Kim, Dongyi; Cho, Hyeon; Lim, Soo-Chul; Hwang, Wonjun

doi:10.3390/s19163579

Cited by 23 publications

(18 citation statements)

References 25 publications

“…such as MobileNet [11] and ShuffleNet [36]. We believe that using this latest computation-efficient network architectures will accelerate the inference time of the proposed method and [14] is a good example.…”

Section: Results (mentioning)

confidence: 99%

Sequential Image-Based Attention Network for Inferring Force Estimation Without Haptic Sensor

Cho, Kim, et al. 2019

IEEE Access

Self Cite

Humans can approximately infer the force of interaction between objects using only visual information because we have learned it through experiences. Based on this idea, in this paper, we propose a method based on a recurrent convolutional neural network that uses sequential images to infer the interaction force without using a haptic sensor. To train and validate deep learning methods, we collected a large number of images and corresponding data concerning the interaction forces between objects shown therein through an electronic motor-based device. To focus on the changing appearances of a target object owing to external force in the images, we develop a sequential image-based attention module that learns a salient model from temporal dynamics for predicting unknown interaction forces. We propose a sequential image-based spatial attention module and a sequential image-based channel attention module, which are extended to exploit multiple images based on corresponding weighted average pooling layers. Extensive experimental results verified that the proposed method can successfully infer interaction forces in various conditions featuring different target materials, changes in illumination, and directions of external forces.

“…For example, 3D-CNNs are often used to recognize gestures or emotion in videos [35, 36, 49]. However, compared with approaches that combine CNN with RNN structures such as long short-term memory (LSTM) or gated recurrent units (GRU), 3D-CNN has a disadvantage that derives from its high computational complexity and excessive memory consumption, which can be a major burden for several applications that require high inference rates, especially on embedded devices [50]. Additionally, RNN architectures can be used to extract long-term temporal characteristics, whereas 3D-CNNs are mostly used for the extraction of short-term temporal pattern [51].…”

Section: Methodology and Background Knowledge (mentioning)

confidence: 99%

A Spatio-Temporal Ensemble Deep Learning Architecture for Real-Time Defect Detection during Laser Welding on Low Power Embedded Computing Boards

Knaak, Eßen, Kröger, et al. 2021

Sensors

In modern production environments, advanced and intelligent process monitoring strategies are required to enable an unambiguous diagnosis of the process situation and thus of the final component quality. In addition, the ability to recognize the current state of product quality in real-time is an important prerequisite for autonomous and self-improving manufacturing systems. To address these needs, this study investigates a novel ensemble deep learning architecture based on convolutional neural networks (CNN), gated recurrent units (GRU) combined with high-performance classification algorithms such as k-nearest neighbors (kNN) and support vector machines (SVM). The architecture uses spatio-temporal features extracted from infrared image sequences to locate critical welding defects including lack of fusion (false friends), sagging, lack of penetration, and geometric deviations of the weld seam. In order to evaluate the proposed architecture, this study investigates a comprehensive scheme based on classical machine learning methods using manual feature extraction and state-of-the-art deep learning algorithms. Optimal hyperparameters for each algorithm are determined by an extensive grid search. Additional work is conducted to investigate the significance of various geometrical, statistical and spatio-temporal features extracted from the keyhole and weld pool regions. The proposed method is finally validated on previously unknown welding trials, achieving the highest detection rates and the most robust weld defect recognition among all classification methods investigated in this work. Ultimately, the ensemble deep neural network is implemented and optimized to operate on low-power embedded computing devices with low latency (1.1 ms), demonstrating sufficient performance for real-time applications.
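
The citation statement quoted for this paper attributes high computational complexity to 3D-CNNs relative to CNN plus RNN designs. A rough way to see where part of that cost comes from is to compare one 3D convolution layer with a 2D convolution applied frame by frame. The sketch below is only an illustration under stated assumptions: the layer sizes (64 channels, a 56x56 feature map, 16 frames, 3x3 and 3x3x3 kernels) are hypothetical and are not taken from either paper, stride 1 with same-size output is assumed, and the recurrent part of a CNN plus RNN model is left out of the count.

```python
def conv_cost(kernel, c_in, c_out, out_positions):
    """Return (parameters, multiply-accumulates) for one conv layer with bias.

    kernel        -- kernel extent per axis, e.g. (3, 3) or (3, 3, 3)
    out_positions -- number of output locations the kernel is evaluated at
    """
    taps = 1
    for size in kernel:
        taps *= size
    params = taps * c_in * c_out + c_out      # weights plus one bias per filter
    macs = taps * c_in * c_out * out_positions
    return params, macs


# Hypothetical layer sizes, chosen only for illustration.
frames, height, width = 16, 56, 56
channels = 64

# 2D convolution evaluated on a single frame.
p2d, m2d = conv_cost((3, 3), channels, channels, height * width)
# The same 2D layer applied to every frame of the clip (weights are shared).
m2d_clip = m2d * frames
# 3D convolution evaluated over the whole clip.
p3d, m3d = conv_cost((3, 3, 3), channels, channels, frames * height * width)

print("2D params:", p2d)
print("3D params:", p3d)
print("3D / 2D multiply-accumulates per clip:", m3d / m2d_clip)
```

Under these assumptions the 3D layer needs three times the multiply-accumulates of the per-frame 2D layer, because the kernel gains a temporal extent of 3. Actual networks differ in depth, stride, and pooling, so this ratio should not be read as a measurement of either published model.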

“…For example, 3D-CNNs are often used to recognize gestures or emotion in videos [33,34,44]. However, compared with approaches that combine CNN with RNN structures such as long-short term memory (LSTM) or gated recurrent units (GRU), 3D-CNN has a disadvantage that derives from its high computational complexity and excessive memory consumption, which can be a major burden for several applications that require high inference rates, especially on embedded devices [45]. Additionally, RNN architecture can be used to extract long-term temporal characteristics, whereas 3D-CNN are mostly used for the extraction of short-term temporal pattern [46].…”

Section: Recurrent Neural Network and Gated Recurrent Units (GRU) (mentioning)

confidence: 99%

Deep Learning and Conventional Machine Learning for Image-Based in-Situ Fault Detection During Laser Welding: A Comparative Study

Knaak, Kröger, Schulze, et al. 2021

Preprint

An effective process monitoring strategy is a requirement for meeting the challenges posed by increasingly complex products and manufacturing processes. To address these needs, this study investigates a comprehensive scheme based on classical machine learning methods, deep learning algorithms, and feature extraction and selection techniques. In a first step, a novel deep learning architecture based on convolutional neural networks (CNN) and gated recurrent units (GRU) is introduced to predict the local weld quality based on mid-wave infrared (MWIR) and near-infrared (NIR) image data. The developed technology is used to discover critical welding defects including lack of fusion (false friends), sagging and lack of penetration, and geometric deviations of the weld seam. Additional work is conducted to investigate the significance of various geometrical, statistical, and spatio-temporal features extracted from the keyhole and weld pool regions. Furthermore, the performance of the proposed deep learning architecture is compared to that of classical supervised machine learning algorithms, such as multi-layer perceptron (MLP), logistic regression (LogReg), support vector machines (SVM), decision trees (DT), random forest (RF) and k-Nearest Neighbors (kNN). Optimal hyperparameters for each algorithm are determined by an extensive grid search. Ultimately, the three best classification models are combined into an ensemble classifier that yields the highest detection rates and achieves the most robust estimation of welding defects among all classifiers studied, which is validated on previously unknown welding trials.
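
The abstract says that the three best classification models are combined into an ensemble classifier but does not state the combination rule. One common choice is hard majority voting, sketched below purely as an assumption. The function name `majority_vote`, the three model outputs, and the label strings are hypothetical; the labels merely borrow defect names mentioned in the abstract.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label sequences by hard majority voting.

    predictions -- list of label lists, one per model, all of equal length.
    On a tie, the label seen first (the earliest model's label) is kept,
    which follows from Counter.most_common preserving insertion order.
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for labels in zip(*predictions):  # labels for one sample across models
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers on four weld-seam samples.
model_a = ["ok", "sagging", "ok", "lack_of_fusion"]
model_b = ["ok", "ok", "ok", "lack_of_fusion"]
model_c = ["sagging", "sagging", "ok", "ok"]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

With three voters, a single model's mistake on a sample is outvoted whenever the other two agree, which is one reason such ensembles are often more robust than their members. The published ensemble may instead use weighted or probability-based voting.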
