We present a simplified, task-agnostic multimodal pre-training approach that can accept either video or text input, or both, for a variety of end tasks. Existing pre-training approaches are task-specific: they adopt either a single cross-modal encoder that requires both modalities, which limits their use for retrieval-style end tasks, or more complex multitask learning with two unimodal encoders, which limits early cross-modal fusion. We instead introduce new pre-training masking schemes that better mix across modalities (e.g. by forcing masks for text to predict the closest video embeddings) while also maintaining separability (e.g. unimodal predictions are sometimes required, without using all the input). Experimental results show strong performance across a wider range of tasks than any previous method, often outperforming task-specific pre-training.
Laser powder-bed fusion (L-PBF) is an additive manufacturing (AM) process that enables fabrication of functional metal parts with near-net-shape geometries. A drawback of L-PBF is its limited dimensional precision and accuracy. The efficiency of the powder fusion process in powder-bed AM is strongly affected by process errors, powder irregularities, and geometric factors. Defects such as lack of fusion and over-fusion, which arise from these factors, cause dimensional errors that significantly degrade precision.
This paper addresses the development of an automated in-situ inspection system for powder-bed additive manufacturing processes based on machine vision. The results of the in-situ automated inspection of dimensional accuracy allow for early identification of faulty parts or, alternatively, in-situ correction of geometric errors through appropriate corrective actions. In this inspection system, 2D optical images captured from each layer of the AM part during the build are analyzed, and the geometric errors and defects impairing dimensional accuracy are detected in each layer. To detect geometric errors successfully, the fused geometric objects must first be detected in the powder layer. Image processing algorithms are designed to detect these geometric objects in the low-contrast images captured inside the chamber during the build. The developed algorithms are applied to a large number of test images, and their performance and precision are evaluated quantitatively. The failure probabilities of the algorithms are also determined statistically.
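The abstract does not specify the image processing algorithms, so the following is only an illustrative sketch of the general idea: separating a fused region from the surrounding powder in a low-contrast layer image, then comparing it with the nominal cross-section. It assumes a grayscale layer image in which fused material is brighter than powder, and it substitutes percentile contrast stretching followed by Otsu thresholding for the authors' method. All function names (`stretch_contrast`, `otsu_threshold`, `segment_fused_region`, `area_error`) and parameter values are hypothetical.

```python
import numpy as np


def stretch_contrast(img, lo_pct=2.0, hi_pct=98.0):
    """Linearly rescale intensities between two percentiles to [0, 1]."""
    lo, hi = np.percentile(img, [lo_pct, hi_pct])
    if hi <= lo:  # flat image: nothing to stretch
        return np.zeros(img.shape, dtype=float)
    return np.clip((img.astype(float) - lo) / (hi - lo), 0.0, 1.0)


def otsu_threshold(img01, bins=256):
    """Threshold in [0, 1] that maximizes the between-class variance."""
    hist, edges = np.histogram(img01, bins=bins, range=(0.0, 1.0))
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)            # weight of the class at or below each bin
    w1 = 1.0 - w0                # weight of the class above each bin
    m = np.cumsum(p * centers)   # cumulative first moment
    valid = (w0 > 0) & (w1 > 0)
    var_between = np.zeros(bins)
    var_between[valid] = (m[-1] * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    # Split after the best bin, so return that bin's upper edge.
    return edges[np.argmax(var_between) + 1]


def segment_fused_region(layer_img):
    """Boolean mask of pixels classified as fused (assumed brighter than powder)."""
    img01 = stretch_contrast(layer_img)
    return img01 > otsu_threshold(img01)


def area_error(mask, nominal_mask):
    """Signed relative area error of the detected region vs. the nominal slice."""
    nominal = nominal_mask.sum()
    return (mask.sum() - nominal) / nominal
```

In use, `segment_fused_region` would be applied to each layer image and `area_error` compared against a tolerance chosen for the part. A positive value suggests over-fusion and a negative value suggests lack of fusion, under the brightness assumption stated above.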