Within this work, we explore intention inference for user actions in the context of a handheld robot setup. Handheld robots share the shape and properties of handheld tools while being able to process task information and aid manipulation. Here, we propose an intention prediction model to enhance cooperative task solving. Within a block copy task, we collect eye gaze data using a robot-mounted remote eye tracker which is used to create a profile of visual attention for task-relevant objects in the workspace scene. These profiles are used to make predictions about user actions i.e. which block will be picked up next and where it will be placed. Our results show that our proposed model can predict user actions well in advance with an accuracy of 87.94% (500 ms prior) for picking and 93.25% (1500 ms prior) for placing actions.