Although previous studies have made clear progress in learning latent dynamics from high-dimensional observations, the accuracy and inference time of long-term model prediction still need to be improved. In this study, a deep convolutional network based on the Koopman operator (CKNet) is proposed to model non-linear systems with pixel-level measurements for long-term prediction. CKNet adopts an autoencoder architecture, consisting of an encoder that generates latent states and a linear dynamical model (i.e., the Koopman operator) that evolves in the latent state space spanned by the encoder; a decoder recovers images from the latent states. Under a multi-step-ahead prediction loss, the system matrices approximating the Koopman operator are trained jointly with the autoencoder in a mini-batch manner. Gradients therefore flow to both the system matrices and the autoencoder, allowing the encoder to adaptively tune the latent state space during training, and the resulting model is time-invariant in the latent space. As a result, CKNet offers short inference time and high accuracy for long-term prediction. Experiments are performed in OpenAI Gym and MuJoCo environments, including two and four non-linear forced dynamical systems with continuous action spaces. The results show that CKNet delivers strong long-term prediction with sufficient precision.
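As a concrete illustration of the architecture described above, the following is a minimal sketch of a CKNet-style model in PyTorch, assuming 3x64x64 image observations; the layer sizes, variable names, and loss weighting are illustrative and not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CKNet(nn.Module):
    def __init__(self, latent_dim=32, action_dim=1):
        super().__init__()
        # Encoder: maps a 3x64x64 image observation to a latent state.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 14 * 14, latent_dim),
        )
        # Decoder: recovers an image from a latent state.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 14 * 14), nn.ReLU(),
            nn.Unflatten(1, (64, 14, 14)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2), nn.Sigmoid(),
        )
        # Time-invariant linear (Koopman) dynamics in the latent space:
        #   z_{t+1} = A z_t + B u_t
        self.A = nn.Parameter(torch.eye(latent_dim))
        self.B = nn.Parameter(torch.zeros(latent_dim, action_dim))

    def multi_step_loss(self, frames, actions):
        # frames: (T+1, batch, 3, 64, 64), actions: (T, batch, action_dim).
        z = self.encoder(frames[0])            # encode only the first frame
        loss = 0.0
        for t in range(actions.shape[0]):
            # Roll the latent state forward with the linear dynamics.
            z = z @ self.A.T + actions[t] @ self.B.T
            # Penalize the pixel-level reconstruction of the predicted frame.
            loss = loss + F.mse_loss(self.decoder(z), frames[t + 1])
        return loss / actions.shape[0]
```

In such a setup, a long-horizon rollout repeatedly applies the same A and B to the initial latent state and decodes only when images are needed, which is why inference stays cheap even for long prediction horizons.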
Symmetry is ubiquitous in everyday objects, and humans tend to grasp objects by recognizing their symmetric regions. In this paper, we investigate how symmetry can boost robotic grasp detection. To this end, we present a learning-based method for detecting grasps from single-view RGB-D images. The key insight is to explicitly incorporate symmetry estimation into grasp detection, improving the quality of the detected grasps. Specifically, we first introduce a new symmetry-based grasp parameterization for parallel grippers. Based on this representation, a symmetry-aware grasp detection network is presented to simultaneously estimate object symmetry and detect grasps. We find that the learning of grasp detection benefits greatly from symmetry estimation, improving both training efficiency and grasp quality. In addition, to facilitate cross-instance generalization to unseen objects, we propose the Principal-directional scale-Invariant Feature Transformer (PIFT), a plug-and-play module that allows spatial deformation of points during feature aggregation. The module essentially learns feature invariance to anisotropic scaling along the shape's principal directions. Extensive experiments demonstrate the effectiveness of the proposed method: it outperforms previous methods, achieving state-of-the-art grasp quality on GraspNet-1-Billion and success rate in a real-robot grasping experiment.
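To illustrate the geometric intuition behind invariance to anisotropic scaling along principal directions, the sketch below (in PyTorch, with hypothetical helper names pift_normalize and aggregate_features) normalizes each local point neighborhood by its per-axis scale in the principal frame before a simple symmetric pooling. The actual PIFT module learns this invariance through spatial deformation of points during feature aggregation rather than applying it as a fixed preprocessing step.

```python
import torch

def pift_normalize(neighborhood):
    """Align a local point neighborhood (k, 3) with its principal directions
    and remove the anisotropic scale along each axis."""
    centered = neighborhood - neighborhood.mean(dim=0, keepdim=True)
    # Principal directions from the covariance of the local points.
    cov = centered.T @ centered / max(centered.shape[0] - 1, 1)
    _, eigvecs = torch.linalg.eigh(cov)         # columns are the principal axes
    aligned = centered @ eigvecs                # rotate into the principal frame
    # Per-axis scale normalization: anisotropic scaling along the principal
    # directions now leaves the normalized coordinates (approximately) unchanged.
    scale = aligned.abs().max(dim=0).values.clamp(min=1e-6)
    return aligned / scale

def aggregate_features(points, k=16):
    """Toy feature aggregation: max-pool normalized neighbor offsets per point."""
    dists = torch.cdist(points, points)         # (N, N) pairwise distances
    knn = dists.topk(k, largest=False).indices  # k nearest neighbors per point
    feats = []
    for idx in knn:
        local = pift_normalize(points[idx])     # (k, 3) in the canonical frame
        feats.append(local.max(dim=0).values)   # simple symmetric pooling
    return torch.stack(feats)                   # (N, 3) per-point features
```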