This paper studies the user experience of collaborating with proactive and reactive agents to transport boxes in virtual environments. Two main characters, the avatar and the agent, are controlled by a user and a controller, respectively, and the user and the agent communicate by voice. The agent is either proactive or reactive: the user follows instructions issued by a proactive agent, whereas the user instructs a reactive agent to perform actions. The goal is to transport boxes to goal positions subject to orientation constraints. We conducted a user study to analyze participants' behavior in several respects, including task completion time, path length, control experience, and co‐presence experience. We report our findings and make suggestions for future development.