We present the development of, and experiments with, a robot system that exhibits cognitive capabilities comparable to those of three- to four-year-old children. We focus on two topics: two-handed assembly and the understanding of human instructions in natural language, both preconditions for an assembly system being perceived by humans as "intelligent". A typical application of such a system is interactive assembly: a human communicator who shares a view of the assembly scenario with the robot instructs it by speaking to it much as one would speak to a child. These instructions can be under-specified, incomplete, and/or context-dependent. After introducing the general purpose of our project, we present the hardware and software components of our robots that are necessary for interactive assembly tasks. We discuss the control architecture of the robot system, which comprises two stationary robot arms, and then describe the functionalities of the instruction-understanding, planning, and execution levels. The implementations of a layered-learning methodology, memories, and monitoring functions are briefly introduced. Finally, we outline future research topics for extending our system.