Kuan Fang scite author profile

Many robotic applications require the agent to perform long-horizon tasks in partially observable environments. In such applications, decision making at any step can depend on observations received far in the past. Hence, being able to properly memorize and utilize the long-term history is crucial. In this work, we propose a novel memory-based policy, named Scene Memory Transformer (SMT). The proposed policy embeds and adds each observation to a memory and uses the attention mechanism to exploit spatio-temporal dependencies. This model is generic and can be efficiently trained with reinforcement learning over long episodes. On a range of visual navigation tasks, SMT demonstrates superior performance to existing reactive and memory-based policies by a margin.
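The memory-plus-attention idea in this abstract can be illustrated with a minimal sketch. It is not the SMT implementation: the class name `SceneMemory`, its methods, and the use of a single unlearned scaled dot-product attention step are all assumptions made for illustration, whereas the abstract only says that observations are embedded, added to a memory, and read with attention.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SceneMemory:
    """Hypothetical toy memory: keeps one embedding per observed time step."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.slots: List[List[float]] = []

    def add(self, embedding: Sequence[float]) -> None:
        """Append an already-computed observation embedding to the memory."""
        if len(embedding) != self.dim:
            raise ValueError("embedding has the wrong dimension")
        self.slots.append(list(embedding))

    def attend(self, query: Sequence[float]) -> List[float]:
        """Scaled dot-product attention of one query over every stored slot."""
        if not self.slots:
            raise ValueError("memory is empty")
        scale = math.sqrt(self.dim)
        # Assumption: no learned projections; keys and values are the raw slots.
        scores = [sum(q * k for q, k in zip(query, slot)) / scale
                  for slot in self.slots]
        weights = softmax(scores)
        return [sum(w * slot[i] for w, slot in zip(weights, self.slots))
                for i in range(self.dim)]
```

Because every past embedding stays in the memory, a query issued late in an episode can still place most of its weight on an observation from the first step, which is the property the abstract emphasizes for long-horizon tasks.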

Learning task-oriented grasping for tool manipulation from simulated self-supervision

Fang, Zhu, Garg, et al. 2019

The International Journal of Robotics Research

155

114

Tool manipulation is vital for facilitating robots to complete challenging task goals. It requires reasoning about the desired effect of the task and thus properly grasping and manipulating the tool to achieve the task. Task-agnostic grasping optimizes for grasp robustness while ignoring crucial task-specific constraints. In this paper, we propose the Task-Oriented Grasping Network (TOG-Net) to jointly optimize both task-oriented grasping of a tool and the manipulation policy for that tool. The training process of the model is based on large-scale simulated self-supervision with procedurally generated tool objects. We perform both simulated and real-world experiments on two tool-based manipulation tasks: sweeping and hammering. Our model achieves overall 71.1% task success rate for sweeping and 80.0% task success rate for hammering. Supplementary material is available at: bit.ly/task-oriented-grasp.

DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes

et al. 2016

Recurrent Autoregressive Networks for Online Multi-object Tracking

et al. 2018

The main challenge of online multi-object tracking is to reliably associate object trajectories with detections in each video frame based on their tracking history. In this work, we propose the Recurrent Autoregressive Network (RAN), a temporal generative modeling framework to characterize the appearance and motion dynamics of multiple objects over time. The RAN couples an external memory and an internal memory. The external memory explicitly stores previous inputs of each trajectory in a time window, while the internal memory learns to summarize long-term tracking history and associate detections by processing the external memory. We conduct experiments on the MOT 2015 and 2016 datasets to demonstrate the robustness of our tracking method in highly crowded and occluded scenes. Our method achieves top-ranked results on the two benchmarks.
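The external-memory part of this description can be sketched as a fixed-length window of past inputs per trajectory. This is only an illustration under stated assumptions: `TrajectoryMemory` and `associate` are hypothetical names, the recency weights 1, 2, ..., n stand in for the parameters that the RAN's internal memory would learn, and nearest-prediction matching replaces the model's actual association score.

```python
from collections import deque
from typing import Deque, List, Sequence


class TrajectoryMemory:
    """Hypothetical external memory: the last `window` inputs of one trajectory."""

    def __init__(self, window: int) -> None:
        self.inputs: Deque[List[float]] = deque(maxlen=window)

    def store(self, feature: Sequence[float]) -> None:
        """Add the newest input; the oldest one drops out once the window is full."""
        self.inputs.append(list(feature))

    def predict(self) -> List[float]:
        """Autoregressive estimate: recency-weighted average of the stored inputs."""
        if not self.inputs:
            raise ValueError("no inputs stored")
        # Assumption: fixed weights 1..n (newest largest) instead of learned ones.
        weights = range(1, len(self.inputs) + 1)
        total = sum(weights)
        dim = len(self.inputs[0])
        return [sum(w * x[i] for w, x in zip(weights, self.inputs)) / total
                for i in range(dim)]


def associate(memory: TrajectoryMemory,
              detections: Sequence[Sequence[float]]) -> int:
    """Return the index of the detection closest to the memory's prediction."""
    predicted = memory.predict()

    def squared_distance(detection: Sequence[float]) -> float:
        return sum((d - p) ** 2 for d, p in zip(detection, predicted))

    return min(range(len(detections)),
               key=lambda j: squared_distance(detections[j]))
```

Bounding the window keeps the per-trajectory cost constant, which is why the abstract pairs it with an internal memory that summarizes the longer history.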

Demo2Vec: Reasoning Object Affordances from Online Videos

Fang, Yang, et al. 2018

102

Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks
