Previous developmental accounts of joint object activity identify a qualitative “shift” around 9–12 months. In a longitudinal study of 26 dyads, videos of joint object interactions at 4, 6, 9, and 12 months were coded for all targets of gaze and manual activity (at 10 Hz). Analyses reveal novel trajectories in distributed joint object activity across the 1st year. At 4 months, infants predominantly look at and manipulate a single object, typically held by their mothers. Between 6 and 9 months, infants increasingly decouple their visual and haptic modalities and distribute their attention between objects held by their mothers and by themselves. At 12 months, infants distribute their sensorimotor modalities between objects handled by the parent and others controlled by the infant. These previously unreported developments in the distribution of multimodal object activity might “bridge the gap” to coordinated joint activity between 6 and 12 months.
Active object recognition (AOR) refers to problems in which an agent interacts with the world and controls its sensor parameters to maximize the speed and accuracy with which it recognizes objects. A wide range of approaches have been developed to re-position sensors or change the environment so that the new inputs to the system become less ambiguous [1, 2] with respect to goals such as 3D reconstruction, localization, or recognition of objects. Many active object recognition methods are built around a specific hardware system, which makes replication of their results very difficult. Other systems use off-the-shelf computer vision datasets that include several views of each object, captured by systematically changing the object's orientation in the image. However, these datasets do not offer any active object recognition benchmark per se.

In this paper, we present and make publicly available the GERMS dataset (see Figure 1), which was specifically developed for active object recognition. The data collection procedure was motivated by the needs of the RUBI project, whose goal is to develop robots that interact with toddlers in early childhood education environments [4]. To collect data, we asked a set of human subjects to hand the GERM objects to RUBI in poses they considered natural. RUBI then pretended to examine the object by bringing it to its center of view and rotating it. The background of the GERMS dataset was provided by a large-screen TV displaying video scenes from the classroom in which RUBI operates, including toddlers and adults moving around.

We also propose an architecture (DQL) for AOR based on deep Q-learning (see Figure 2). To our knowledge, this is the first work employing deep Q-learning for active object recognition. An image is first transformed into a set of features using a DCNN borrowed from [3], which was trained on ImageNet. We add a softmax layer on top of this model to recognize GERMS objects; the output of this softmax layer is the belief over the different GERMS objects given an image. This belief is combined with the accumulated belief from the previous images using Naive Bayes. The accumulated belief represents the state of the AOR system at each time step.

The accumulated belief is then transformed by the policy learning network into action values. This network is composed of two Rectified Linear Unit (ReLU) layers followed by a Linear Unit (LU) layer. Each unit in the LU represents the action value for a given accumulated belief and one of the possible actions. In order to train this module, we translate the Q-learning iterative update,

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ R_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right],$$

into the following stochastic gradient descent weight update rule for the network:

$$W \leftarrow W + \alpha \left[ R_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right] \nabla_W Q(s_t, a_t).$$

Here, W denotes the weights of the policy learning network, Q(s, a) is the action value learned by the network for action a in state s, γ is the reward discount factor, and R_t is the reward at the t-th time step. The number of output units in the policy learning network is equal to the number of possible actions. Each output unit…
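As a concrete illustration of the belief accumulation step described above, the sketch below fuses per-image softmax outputs under a Naive Bayes (conditional-independence) assumption: the accumulated belief is multiplied by the current view's softmax output and renormalized. The function name, the class count, and the synthetic softmax vectors are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def update_belief(prior_belief, softmax_output, eps=1e-12):
    """Fuse the accumulated belief with the current view's softmax output
    under a Naive Bayes assumption: posterior ∝ prior × likelihood."""
    posterior = prior_belief * softmax_output
    posterior = np.maximum(posterior, eps)            # guard against numerical collapse
    return posterior / posterior.sum()                # renormalize to a distribution

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_classes = 10                                  # illustrative, not the GERMS object count
    belief = np.full(num_classes, 1.0 / num_classes)  # uniform prior before the first view
    for _ in range(5):                                # pretend five views were captured
        logits = rng.normal(size=num_classes)
        softmax_out = np.exp(logits) / np.exp(logits).sum()
        belief = update_belief(belief, softmax_out)
    print("accumulated belief:", belief)
```

Under this scheme the accumulated belief is exactly the state vector that is fed to the policy learning network at each time step.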
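The policy learning network and its Q-learning update might be sketched as follows. This is a minimal PyTorch illustration assuming a small hidden width, a squared TD-error loss, and plain SGD; the layer sizes, learning rate, discount factor, and action count are placeholders rather than the paper's settings.

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Two ReLU layers followed by a linear output layer: one action value
    per possible action, computed from the accumulated belief vector."""
    def __init__(self, num_classes, num_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),            # linear action-value outputs
        )

    def forward(self, belief):
        return self.net(belief)

def q_learning_step(policy, optimizer, state, action, reward, next_state, gamma=0.9):
    """One stochastic gradient step toward the one-step Q-learning target
    R_t + gamma * max_a' Q(s_{t+1}, a'), using a squared TD error."""
    q_sa = policy(state)[action]                       # Q(s_t, a_t)
    with torch.no_grad():
        target = reward + gamma * policy(next_state).max()
    loss = (target - q_sa) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative usage with random belief vectors; all sizes are assumptions.
num_classes, num_actions = 10, 4
policy = PolicyNet(num_classes, num_actions)
optimizer = torch.optim.SGD(policy.parameters(), lr=1e-2)
s_t = torch.rand(num_classes); s_t = s_t / s_t.sum()      # belief at time t
s_t1 = torch.rand(num_classes); s_t1 = s_t1 / s_t1.sum()  # belief at time t+1
q_learning_step(policy, optimizer, s_t, action=2, reward=1.0, next_state=s_t1)
```

Minimizing the squared TD error by gradient descent yields a weight change proportional to the TD error times the gradient of Q with respect to the weights, which is the stochastic gradient descent update rule written above.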